Terraform plan fails while an AWS ElastiCache Redis cluster is scaling out

Problem description

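I created a Redis cluster (cluster mode enabled) in AWS with Terraform. Whenever the cluster is scaling, every terraform plan and terraform apply operation fails.

This is a problem because, while the Redis cluster is scaling, no other resource in the AWS account can be changed through Terraform. Since a scale-out takes around 10 minutes, this could become a recurring obstacle to maintaining the other components of the environment.

I don't need Terraform to manage the state of the cluster instances, only to bootstrap the cluster initially. Management and scaling of the cluster will be handled by a Lambda resource.

I expect to be able to run terraform plan and terraform apply against other AWS resources while the cluster is being resized.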

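What I am observing

When I scale the Redis cluster in the AWS UI (adding a shard, or adding a node to a shard), the status of the existing nodes changes to "modifying". During that time, any terraform plan or terraform apply operation fails: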

Error: error listing tags for resource (arn:aws:elasticache:ap-southeast-2::cluster:router-redis-cache-0001-001): CacheClusterNotFound: router-redis-cache-0001-001 is either not present or not available.
        status code: 404,request id: b6cfcff3-dfa7-41cf-b099-0eb0c9767990

Steps to reproduce

Add the following to main.tf:

# Configure your AWS Provider
provider "aws" {
  region = "us-east-1"
}

resource "aws_elasticache_replication_group" "this" {
  count = 1

  at_rest_encryption_enabled    = true
  multi_az_enabled              = true
  automatic_failover_enabled    = true
  replication_group_id          = "users-cache"
  replication_group_description = "Users Redis cache"
  node_type                     = "cache.t3.medium"
  parameter_group_name          = "default.redis6.x.cluster.on"
  port                          = 6379

  cluster_mode {
    num_node_groups         = 1 # Number of initial shards
    replicas_per_node_group = 1 # Number of initial replicas within each shard
  }

  apply_immediately = true

  lifecycle {
    ignore_changes = [
      # Scaling the instances in AWS will change cluster_mode.num_node_groups
      # and cluster_mode.replicas_per_node_group; disregard drift from the
      # initial configuration.
      cluster_mode,
    ]
  }
}

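Run terraform apply
Wait for the operation to complete
Log in to the AWS account
Find the ElastiCache cluster
Scale the cluster out (for example, click "Add shard")
Note that the cluster enters the "modifying" state
Run terraform plan
Observe the failure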
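
The same steps can probably be driven from the command line instead of the console. The script below is only a sketch and was not part of my original setup. It assumes the AWS CLI and Terraform are installed and configured for the same account and region as main.tf. The run helper and the DRY_RUN variable are made up for illustration, and by default the script only prints the commands rather than executing them.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps above.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
set -euo pipefail

GROUP_ID="users-cache"      # replication_group_id from main.tf
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

repro() {
  # Create the replication group from main.tf.
  run terraform apply -auto-approve

  # Add a shard (1 -> 2 node groups), like clicking "Add shard" in the console.
  run aws elasticache modify-replication-group-shard-configuration \
    --replication-group-id "$GROUP_ID" \
    --node-group-count 2 \
    --apply-immediately

  # Check the status; it should report "modifying" while the resize runs.
  run aws elasticache describe-replication-groups \
    --replication-group-id "$GROUP_ID" \
    --query 'ReplicationGroups[0].Status' --output text

  # Run the plan while the group is still modifying; this is where the error appears.
  run terraform plan
}

repro
```

With DRY_RUN=0 the last command should be run while the status is still "modifying"; once the status returns to "available" the failure would presumably no longer reproduce.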


amazon-elasticache amazon-web-services redis terraform