
Deleting an old node group is attempting to overwrite a newly created node group #2001

Closed
kaykhancheckpoint opened this issue Apr 9, 2022 · 12 comments

Comments

@kaykhancheckpoint commented Apr 9, 2022

Description

Originally I had a node group called ops-nodes with instances of type m5.xlarge. I created a new node group called op-nodes with instances of type m5.2xlarge and migrated the apps from the original node group over to this new one because they required more resources.

Now I want to clean up the original/old node group ops-nodes. When I attempt to delete the old node group configuration (removing its {} entry from the [] list), Terraform also wants to delete the newly created node group.

Versions

  • Module version [Required]: 15.1.0

  • Terraform version:
    Terraform v1.0.6

  • Provider version(s):
+ provider registry.terraform.io/hashicorp/aws v3.51.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.3.2
+ provider registry.terraform.io/hashicorp/local v2.0.0
+ provider registry.terraform.io/hashicorp/null v3.0.0
+ provider registry.terraform.io/hashicorp/random v3.0.0
+ provider registry.terraform.io/hashicorp/template v2.2.0
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Reproduction Code [Required]

Initially:

worker_groups = [
    {
      name                          = "worker-nodes"
      instance_type                 = "m5.xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      additional_userdata           = ""
      asg_min_size                  = 2
      asg_desired_capacity          = 2
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=worker"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
    },
    {
      name                          = "cron-nodes"
      instance_type                 = "m5.xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      additional_userdata           = ""
      asg_min_size                  = 1
      asg_desired_capacity          = 1
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=cron"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
    },
    {
      name                          = "ops-nodes"
      instance_type                 = "m5.xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      subnets                       = ["subnet-0a6500f985213bd36", "subnet-09c52093f9e755207"]
      additional_userdata           = ""
      asg_min_size                  = 0
      asg_desired_capacity          = 0
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=ops"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
    },
  ]

After:

worker_groups = [
    {
      name                          = "worker-nodes"
      instance_type                 = "m5.xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      additional_userdata           = ""
      asg_min_size                  = 2
      asg_desired_capacity          = 2
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=worker"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
    },
    {
      name                          = "cron-nodes"
      instance_type                 = "m5.xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      additional_userdata           = ""
      asg_min_size                  = 1
      asg_desired_capacity          = 1
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=cron"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
    },
    {
      name                          = "ops-nodes"
      instance_type                 = "m5.xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      subnets                       = ["subnet-0a6500f985213bd36", "subnet-09c52093f9e755207"]
      additional_userdata           = ""
      asg_min_size                  = 0
      asg_desired_capacity          = 0
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=ops"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
    },
    {
      name                          = "op-nodes"
      instance_type                 = "m5.2xlarge"
      ami_id                        = "ami-0ad418be69ef09deb"
      subnets                       = ["subnet-0a6500f985213bd36", "subnet-09c52093f9e755207"]
      additional_userdata           = ""
      asg_min_size                  = 2
      asg_desired_capacity          = 2
      asg_max_size                  = 5
      additional_security_group_ids = [data.terraform_remote_state.networking.outputs.eks_worker_nodes_sg_id]
      kubelet_extra_args            = "--node-labels=geeiq/node-type=ops"
      public_ip                     = true
      key_name                      = data.terraform_remote_state.networking.outputs.geeiq_key_pair_key_name
      tags                          = local.worker_group_tags
      root_volume_size              = 250
    },

  ]

Steps to reproduce the behavior:

  1. Create a new node group called op-nodes with almost the same configuration, except for a different instance type and root volume size.
  2. Delete the old node group entry {} called ops-nodes from the Terraform config, then run terraform plan/apply.

Expected behavior

I expect only the old node group ops-nodes to be deleted.

Actual behavior

Terraform is attempting to replace the old node group in place (forcing it to take on the new group's settings) and to destroy the newly created one.

Terminal Output Screenshot(s)

module.eks.aws_cloudwatch_log_group.this[0]: Refreshing state... [id=/aws/eks/geeiq-prod-k8s/cluster]
aws_iam_policy.AllowExternalDNSUpdatesPolicy: Refreshing state... [id=arn:aws:iam::700849607999:policy/AllowExternalDNSUpdates]
module.eks.aws_security_group.workers[0]: Refreshing state... [id=sg-0efffe6c4e9cef238]
module.eks.aws_security_group.cluster[0]: Refreshing state... [id=sg-09c2c8f7897d7b45c]
aws_iam_policy.AWSLoadBalancerControllerIAMPolicy: Refreshing state... [id=arn:aws:iam::700849607999:policy/AWSLoadBalancerControllerIAMPolicy]
module.eks.aws_iam_role.cluster[0]: Refreshing state... [id=geeiq-prod-k8s20210422135910703300000004]
module.eks.aws_iam_policy.cluster_elb_sl_role_creation[0]: Refreshing state... [id=arn:aws:iam::700849607999:policy/geeiq-prod-k8s-elb-sl-role-creation20210422135910699800000001]
module.eks.aws_security_group_rule.workers_ingress_self[0]: Refreshing state... [id=sgrule-464871074]
module.eks.aws_security_group_rule.workers_egress_internet[0]: Refreshing state... [id=sgrule-2996021003]
module.eks.aws_security_group_rule.cluster_egress_internet[0]: Refreshing state... [id=sgrule-375825644]
module.eks.aws_security_group_rule.workers_ingress_cluster[0]: Refreshing state... [id=sgrule-858513986]
module.eks.aws_security_group_rule.cluster_https_worker_ingress[0]: Refreshing state... [id=sgrule-2878133799]
module.eks.aws_security_group_rule.workers_ingress_cluster_https[0]: Refreshing state... [id=sgrule-428829599]
module.eks.aws_iam_role_policy_attachment.cluster_AmazonEKSVPCResourceControllerPolicy[0]: Refreshing state... [id=geeiq-prod-k8s20210422135910703300000004-20210422135912919700000005]
module.eks.aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy[0]: Refreshing state... [id=geeiq-prod-k8s20210422135910703300000004-20210422135913041500000008]
module.eks.aws_iam_role_policy_attachment.cluster_elb_sl_role_creation[0]: Refreshing state... [id=geeiq-prod-k8s20210422135910703300000004-20210422135912997800000007]
module.eks.aws_iam_role_policy_attachment.cluster_AmazonEKSServicePolicy[0]: Refreshing state... [id=geeiq-prod-k8s20210422135910703300000004-20210422135912962800000006]
module.eks.aws_eks_cluster.this[0]: Refreshing state... [id=geeiq-prod-k8s]
module.eks.aws_iam_role.workers[0]: Refreshing state... [id=geeiq-prod-k8s20210422140902992300000009]
module.eks.aws_iam_openid_connect_provider.oidc_provider[0]: Refreshing state... [id=arn:aws:iam::700849607999:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/021E81F3181F90B9E08047402B53F6C4]
module.eks.null_resource.wait_for_cluster[0]: Refreshing state... [id=3778844967681321999]
module.eks.local_file.kubeconfig[0]: Refreshing state... [id=c2fc99116863d4aeac90ce009ff4608b087be347]
module.iam_assumable_role_aws_lb.aws_iam_role.this[0]: Refreshing state... [id=AWSLoadBalancerControllerIAMRole]
module.iam_assumable_role_aws_external_dns.aws_iam_role.this[0]: Refreshing state... [id=AllowExternalDNSUpdatesRole]
module.iam_assumable_role_admin.aws_iam_role.this[0]: Refreshing state... [id=cluster-autoscaler]
aws_iam_policy.cluster_autoscaler: Refreshing state... [id=arn:aws:iam::700849607999:policy/cluster-autoscaler2021042214090346520000000a]
kubernetes_secret.aws-secret: Refreshing state... [id=default/aws-secret]
module.eks.aws_iam_role_policy_attachment.workers_AmazonEC2ContainerRegistryReadOnly[0]: Refreshing state... [id=geeiq-prod-k8s20210422140902992300000009-2021042214090525440000000f]
module.eks.aws_iam_role_policy_attachment.workers_AmazonEKS_CNI_Policy[0]: Refreshing state... [id=geeiq-prod-k8s20210422140902992300000009-20210422140905259400000011]
module.eks.aws_iam_role_policy_attachment.workers_AmazonEKSWorkerNodePolicy[0]: Refreshing state... [id=geeiq-prod-k8s20210422140902992300000009-20210422140905254400000010]
module.eks.aws_iam_instance_profile.workers[3]: Refreshing state... [id=geeiq-prod-k8s20220409095303658700000001]
module.eks.aws_iam_instance_profile.workers[1]: Refreshing state... [id=geeiq-prod-k8s20210422145122221300000002]
module.eks.aws_iam_instance_profile.workers[2]: Refreshing state... [id=geeiq-prod-k8s20210422145122221200000001]
module.eks.aws_iam_instance_profile.workers[0]: Refreshing state... [id=geeiq-prod-k8s2021042214090480140000000b]
module.iam_assumable_role_aws_lb.aws_iam_role_policy_attachment.custom[0]: Refreshing state... [id=AWSLoadBalancerControllerIAMRole-2021042214090525090000000e]
module.iam_assumable_role_aws_external_dns.aws_iam_role_policy_attachment.custom[0]: Refreshing state... [id=AllowExternalDNSUpdatesRole-2021042214090520220000000c]
module.iam_assumable_role_admin.aws_iam_role_policy_attachment.custom[0]: Refreshing state... [id=cluster-autoscaler-2021042214090523380000000d]
module.eks.aws_launch_configuration.workers[0]: Refreshing state... [id=geeiq-prod-k8s-worker-nodes20220409085841756000000002]
module.eks.aws_launch_configuration.workers[2]: Refreshing state... [id=geeiq-prod-k8s-ops-nodes20220409085841756100000003]
module.eks.aws_launch_configuration.workers[1]: Refreshing state... [id=geeiq-prod-k8s-cron-nodes20220409085841755200000001]
module.eks.aws_launch_configuration.workers[3]: Refreshing state... [id=geeiq-prod-k8s-op-nodes20220409095305101600000002]
module.eks.kubernetes_config_map.aws_auth[0]: Refreshing state... [id=kube-system/aws-auth]
module.eks.random_pet.workers[1]: Refreshing state... [id=liked-jackal]
module.eks.random_pet.workers[3]: Refreshing state... [id=vocal-monitor]
module.eks.random_pet.workers[0]: Refreshing state... [id=pleased-sheep]
module.eks.random_pet.workers[2]: Refreshing state... [id=proud-worm]
module.eks.aws_autoscaling_group.workers[3]: Refreshing state... [id=geeiq-prod-k8s-op-nodes20220409095313235600000003]
module.eks.aws_autoscaling_group.workers[0]: Refreshing state... [id=geeiq-prod-k8s-worker-nodes20210422140918410300000013]
module.eks.aws_autoscaling_group.workers[2]: Refreshing state... [id=geeiq-prod-k8s-ops-nodes20210422145135744400000006]
module.eks.aws_autoscaling_group.workers[1]: Refreshing state... [id=geeiq-prod-k8s-cron-nodes20210422145135743200000005]


Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  - destroy
+/- create replacement and then destroy

Terraform will perform the following actions:

  # module.eks.aws_autoscaling_group.workers[2] must be replaced
+/- resource "aws_autoscaling_group" "workers" {
      ~ arn                       = "arn:aws:autoscaling:us-east-2:700849607999:autoScalingGroup:a2f02edf-2494-4e3a-ae6f-b79ac503e27e:autoScalingGroupName/geeiq-prod-k8s-ops-nodes20210422145135744400000006" -> (known after apply)
      ~ availability_zones        = [
          - "us-east-2a",
        ] -> (known after apply)
      - capacity_rebalance        = false -> null
      ~ default_cooldown          = 300 -> (known after apply)
      - enabled_metrics           = [] -> null
      ~ health_check_type         = "EC2" -> (known after apply)
      ~ id                        = "geeiq-prod-k8s-ops-nodes20210422145135744400000006" -> (known after apply)
      ~ launch_configuration      = "geeiq-prod-k8s-ops-nodes20220409085841756100000003" -> (known after apply)
      - load_balancers            = [] -> null
      ~ name                      = "geeiq-prod-k8s-ops-nodes20210422145135744400000006" -> (known after apply)
      ~ name_prefix               = "geeiq-prod-k8s-ops-nodes" -> "geeiq-prod-k8s-op-nodes" # forces replacement
      ~ service_linked_role_arn   = "arn:aws:iam::700849607999:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling" -> (known after apply)
      - target_group_arns         = [] -> null
        # (13 unchanged attributes hidden)

      + tag {
          + key                 = "Name"
          + propagate_at_launch = true
          + value               = "geeiq-prod-k8s-op-nodes-eks_asg"
        }
      - tag {
          - key                 = "Name" -> null
          - propagate_at_launch = true -> null
          - value               = "geeiq-prod-k8s-ops-nodes-eks_asg" -> null
        }
        # (8 unchanged blocks hidden)
    }

  # module.eks.aws_autoscaling_group.workers[3] will be destroyed
  - resource "aws_autoscaling_group" "workers" {
      - arn                       = "arn:aws:autoscaling:us-east-2:700849607999:autoScalingGroup:030d9214-2f4b-4c58-9367-1a59de288e3b:autoScalingGroupName/geeiq-prod-k8s-op-nodes20220409095313235600000003" -> null
      - availability_zones        = [
          - "us-east-2a",
        ] -> null
      - capacity_rebalance        = false -> null
      - default_cooldown          = 300 -> null
      - desired_capacity          = 2 -> null
      - enabled_metrics           = [] -> null
      - force_delete              = false -> null
      - force_delete_warm_pool    = false -> null
      - health_check_grace_period = 300 -> null
      - health_check_type         = "EC2" -> null
      - id                        = "geeiq-prod-k8s-op-nodes20220409095313235600000003" -> null
      - launch_configuration      = "geeiq-prod-k8s-op-nodes20220409095305101600000002" -> null
      - load_balancers            = [] -> null
      - max_instance_lifetime     = 0 -> null
      - max_size                  = 5 -> null
      - metrics_granularity       = "1Minute" -> null
      - min_size                  = 2 -> null
      - name                      = "geeiq-prod-k8s-op-nodes20220409095313235600000003" -> null
      - name_prefix               = "geeiq-prod-k8s-op-nodes" -> null
      - protect_from_scale_in     = false -> null
      - service_linked_role_arn   = "arn:aws:iam::700849607999:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling" -> null
      - suspended_processes       = [
          - "AZRebalance",
        ] -> null
      - target_group_arns         = [] -> null
      - termination_policies      = [] -> null
      - vpc_zone_identifier       = [
          - "subnet-09c52093f9e755207",
          - "subnet-0a6500f985213bd36",
        ] -> null
      - wait_for_capacity_timeout = "10m" -> null

      - tag {
          - key                 = "Environment" -> null
          - propagate_at_launch = true -> null
          - value               = "prod" -> null
        }
      - tag {
          - key                 = "GithubOrg" -> null
          - propagate_at_launch = true -> null
          - value               = "terraform-aws-modules" -> null
        }
      - tag {
          - key                 = "GithubRepo" -> null
          - propagate_at_launch = true -> null
          - value               = "terraform-aws-eks" -> null
        }
      - tag {
          - key                 = "Name" -> null
          - propagate_at_launch = true -> null
          - value               = "geeiq-prod-k8s-op-nodes-eks_asg" -> null
        }
      - tag {
          - key                 = "Terraform" -> null
          - propagate_at_launch = true -> null
          - value               = "true" -> null
        }
      - tag {
          - key                 = "k8s.io/cluster-autoscaler/enabled" -> null
          - propagate_at_launch = false -> null
          - value               = "true" -> null
        }
      - tag {
          - key                 = "k8s.io/cluster-autoscaler/geeiq-prod-k8s" -> null
          - propagate_at_launch = false -> null
          - value               = "true" -> null
        }
      - tag {
          - key                 = "k8s.io/cluster/geeiq-prod-k8s" -> null
          - propagate_at_launch = true -> null
          - value               = "owned" -> null
        }
      - tag {
          - key                 = "kubernetes.io/cluster/geeiq-prod-k8s" -> null
          - propagate_at_launch = true -> null
          - value               = "owned" -> null
        }
    }

  # module.eks.aws_iam_instance_profile.workers[3] will be destroyed
  - resource "aws_iam_instance_profile" "workers" {
      - arn         = "arn:aws:iam::700849607999:instance-profile/geeiq-prod-k8s20220409095303658700000001" -> null
      - create_date = "2022-04-09T09:53:04Z" -> null
      - id          = "geeiq-prod-k8s20220409095303658700000001" -> null
      - name        = "geeiq-prod-k8s20220409095303658700000001" -> null
      - name_prefix = "geeiq-prod-k8s" -> null
      - path        = "/" -> null
      - role        = "geeiq-prod-k8s20210422140902992300000009" -> null
      - tags        = {} -> null
      - tags_all    = {} -> null
      - unique_id   = "AIPA2GLPEKU7UFU7KHTG5" -> null
    }

  # module.eks.aws_launch_configuration.workers[2] must be replaced
+/- resource "aws_launch_configuration" "workers" {
      ~ arn                              = "arn:aws:autoscaling:us-east-2:700849607999:launchConfiguration:72db05bc-0da6-4d23-bfe8-adc4cfc396c6:launchConfigurationName/geeiq-prod-k8s-ops-nodes20220409085841756100000003" -> (known after apply)
      ~ id                               = "geeiq-prod-k8s-ops-nodes20220409085841756100000003" -> (known after apply)
      ~ instance_type                    = "m5.xlarge" -> "m5.2xlarge" # forces replacement
      ~ name                             = "geeiq-prod-k8s-ops-nodes20220409085841756100000003" -> (known after apply)
      ~ name_prefix                      = "geeiq-prod-k8s-ops-nodes" -> "geeiq-prod-k8s-op-nodes" # forces replacement
      - vpc_classic_link_security_groups = [] -> null
        # (8 unchanged attributes hidden)

      + ebs_block_device {
          + delete_on_termination = (known after apply)
          + device_name           = (known after apply)
          + encrypted             = (known after apply)
          + iops                  = (known after apply)
          + no_device             = (known after apply)
          + snapshot_id           = (known after apply)
          + throughput            = (known after apply)
          + volume_size           = (known after apply)
          + volume_type           = (known after apply)
        }

      + metadata_options {
          + http_endpoint               = (known after apply)
          + http_put_response_hop_limit = (known after apply)
          + http_tokens                 = (known after apply)
        }

      ~ root_block_device {
          ~ throughput            = 0 -> (known after apply)
          ~ volume_size           = 100 -> 250 # forces replacement
            # (4 unchanged attributes hidden)
        }
    }

  # module.eks.aws_launch_configuration.workers[3] will be destroyed
  - resource "aws_launch_configuration" "workers" {
      - arn                              = "arn:aws:autoscaling:us-east-2:700849607999:launchConfiguration:b1013d4b-a616-4d13-9aab-af83abcf89df:launchConfigurationName/geeiq-prod-k8s-op-nodes20220409095305101600000002" -> null
      - associate_public_ip_address      = true -> null
      - ebs_optimized                    = true -> null
      - enable_monitoring                = true -> null
      - iam_instance_profile             = "geeiq-prod-k8s20220409095303658700000001" -> null
      - id                               = "geeiq-prod-k8s-op-nodes20220409095305101600000002" -> null
      - image_id                         = "ami-0ad418be69ef09deb" -> null
      - instance_type                    = "m5.2xlarge" -> null
      - key_name                         = "geeiq-prod-key" -> null
      - name                             = "geeiq-prod-k8s-op-nodes20220409095305101600000002" -> null
      - name_prefix                      = "geeiq-prod-k8s-op-nodes" -> null
      - security_groups                  = [
          - "sg-0efffe6c4e9cef238",
          - "sg-0f8e379b67348c6fe",
        ] -> null
      - user_data_base64                 = "IyEvYmluL2Jhc2ggLWUKCiMgQWxsb3cgdXNlciBzdXBwbGllZCBwcmUgdXNlcmRhdGEgY29kZQoKCiMgQm9vdHN0cmFwIGFuZCBqb2luIHRoZSBjbHVzdGVyCi9ldGMvZWtzL2Jvb3RzdHJhcC5zaCAtLWI2NC1jbHVzdGVyLWNhICdMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VNMWVrTkRRV01yWjBGM1NVSkJaMGxDUVVSQlRrSm5hM0ZvYTJsSE9YY3dRa0ZSYzBaQlJFRldUVkpOZDBWUldVUldVVkZFUlhkd2NtUlhTbXdLWTIwMWJHUkhWbnBOUWpSWVJGUkplRTFFVVhsTmFrVXdUVVJSTVU1R2IxaEVWRTE0VFVSUmVVMUVSVEJOUkZFeFRrWnZkMFpVUlZSTlFrVkhRVEZWUlFwQmVFMUxZVE5XYVZwWVNuVmFXRkpzWTNwRFEwRlRTWGRFVVZsS1MyOWFTV2gyWTA1QlVVVkNRbEZCUkdkblJWQkJSRU5EUVZGdlEyZG5SVUpCVDJkekNtdFVjMkpwYlUwMFQyMVFNa2xFV1ZaUFEyOWtjVVpSWlZsaVUzZHpiR1l6YldzeWNsbFBXSEJ1Yms1aVoyOHZiMVIzYVc5UmRuUTBTek54VkV0RGRVa0tUbVpPY1hNNFZVcHplVWhvUVZoNVZXTlhSRms1Uld0cU4waERTV2s1YWpSbGJqZGxRekI1WjJWdUwweGhRMVpCZDBGMmQweE9lalJzYjFaRk5HZHNZUXBGTTBveldWZG1WV1Y2Um1sV2VVdG1kMGx3TTFVMU9WUXdaRVI2YVVNM2QwbzJVelI0WWtsNU1XVjVXbGc0SzFWMmJEWnBVMDVGY2pCdWNIQldlWEJzQ2tkUmF6QlpjRm8wSzJnd2NYRnBkVk5VWVRsaWFFZ3JVazVuV2pCTGRsUmhNV0ZITTFWT1ZYWlhTMVZXYUd0TWRFcGtkM2hXUVRWaEt5dG1aRWQ1ZWxFS1EzUTVjbTVEYVRSR09HWmlZVEoyWXpoSVRXOVhNRmxIVUhkR2JUUlJSMkZ4Vkhkak15czNTR1JxVUhGb2NFUk1ZblZ5TmtzM1Z6Z3hSblpMVFVOclF3b3pTM2RQVTI1UVdrUmhOR2Q1U2pSd1JITkZRMEYzUlVGQllVNURUVVZCZDBSbldVUldVakJRUVZGSUwwSkJVVVJCWjB0clRVRTRSMEV4VldSRmQwVkNDaTkzVVVaTlFVMUNRV1k0ZDBoUldVUldVakJQUWtKWlJVWkdSa1ZUVm5WQlZHWXJla2x4VVRkVldVZE9hamx6WTJrck9IaE5RVEJIUTFOeFIxTkpZak1LUkZGRlFrTjNWVUZCTkVsQ1FWRkNNamxRVm1sdFduVnNjRlV2WjJod2QzVnBjVGhWYlU5VWRWaG9TR1pQY3pJd1NXRTRRbmdyU0hKRmJUQkxWamhNTHdwWVVrWjFZMjk0VjBRdk5rdHBaRkZaYTBKV1ltdFhiMEZwYjNOV2FFOUlNa042ZUVsR05scEdNelI2Y0hCMldITldlRFZyVldKVlYyVnNjRFpxTkM4M0NpdG5Ra1U1UjFsRlJHdHdRVE5hTmpaSVVVMTBVVkZhZFVkWVRuaG5NazlzUTFWdmNDOVlRWHB1YzA5dVVWcGpiWFp5VkdvMlprdHBOa3BTVjAxNmVHMEtRV0p6SzJndlNVOWxWMGhwZWsxeVpYTnhWRWMxUmxCNVdXOTFSbkpCYzBodGR6aFlSSFZITXpOMFpqRkNZMW9yVEZkWk9XUnpkeXRETmxodk5VMXdWQW94ZWpCaWRXazFhbXhxTUdneFpGZFdORXRzVUhWMFpqRlBMMjlZY1d4dU0zRTVhVm8wVUhsaU1GVlJTMFU1SzBrMlNURmhSVU13WmpORlVtaG1kVkF5Q25Jd1UxQXhOMFY1ZUZrcmVHTm9NMjAxUWpOcFdtMTVWMHRHVlZoalUxZHlWbWMxVEFvdExTMHRMVVZPUkNCRFJWSlVTVVpKUTBGVVJTMHRMUzB0Q2c9PScgLS1hcGlzZXJ2ZXItZW5kcG9pbnQgJ2h0dHBzOi8vMDIxRTgxRjMxODFGOTBCOUUwODA0NzQwMkI1M0Y2QzQuZ3I3LnVzLWVhc3QtMi5la3MuYW1hem9uYXdzLmNvbScgIC0ta3ViZWxldC1leHRyYS1hcmdzICItLW5vZGUtbGFiZWxzPWdlZWlxL25vZGUtdHlwZT1vcHMiICdnZWVpcS1wcm9kLWs4cycKCiMgQWxsb3cgdXNlciBzdXBwbGllZCB1c2VyZGF0YSBjb2RlCgo=" -> null
      - vpc_classic_link_security_groups = [] -> null

      - root_block_device {
          - delete_on_termination = true -> null
          - encrypted             = false -> null
          - iops                  = 0 -> null
          - throughput            = 0 -> null
          - volume_size           = 250 -> null
          - volume_type           = "gp2" -> null
        }
    }

  # module.eks.random_pet.workers[2] must be replaced
+/- resource "random_pet" "workers" {
      ~ id        = "proud-worm" -> (known after apply)
      ~ keepers   = {
          - "lc_name" = "geeiq-prod-k8s-ops-nodes20220409085841756100000003"
        } -> (known after apply) # forces replacement
        # (2 unchanged attributes hidden)
    }

  # module.eks.random_pet.workers[3] will be destroyed
  - resource "random_pet" "workers" {
      - id        = "vocal-monitor" -> null
      - keepers   = {
          - "lc_name" = "geeiq-prod-k8s-op-nodes20220409095305101600000002"
        } -> null
      - length    = 2 -> null
      - separator = "-" -> null
    }

Plan: 3 to add, 0 to change, 7 to destroy.

Additional context

New node group I want to keep: geeiq-prod-k8s-op-nodes20220409095313235600000003

Old node group I want to delete: geeiq-prod-k8s-ops-nodes20210422145135744400000006


I'm aware this is an old version of this module, but unfortunately we don't have the time or resources right now to upgrade to the latest version.

From what I can understand, Terraform thinks the new op-nodes group is an upgrade of the original ops-nodes group rather than a completely new node group. Is anyone able to offer advice on how I can separate them? (I assumed a different name would have been enough.)

module.eks.aws_launch_configuration.workers[0]: Refreshing state... [id=geeiq-prod-k8s-worker-nodes20220409085841756000000002]
module.eks.aws_launch_configuration.workers[2]: Refreshing state... [id=geeiq-prod-k8s-ops-nodes20220409085841756100000003]
module.eks.aws_launch_configuration.workers[1]: Refreshing state... [id=geeiq-prod-k8s-cron-nodes20220409085841755200000001]
module.eks.aws_launch_configuration.workers[3]: Refreshing state... [id=geeiq-prod-k8s-op-nodes20220409095305101600000002]
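
For reference, the same index-to-group mapping can be checked directly from state before touching anything. These commands are only an illustration; the resource addresses are taken from the plan output above and should be verified against your own state:

terraform state list | grep aws_autoscaling_group
terraform state show 'module.eks.aws_autoscaling_group.workers[2]'   # should be the old ops-nodes group
terraform state show 'module.eks.aws_autoscaling_group.workers[3]'   # should be the new op-nodes group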


kaykhancheckpoint changed the title from "Planned changes attempting to remove a newly created node group" to "Deleting an old node group is attempting to overwrite a newly created node group" on Apr 9, 2022
@bryantbiggs (Member)

Yes, unfortunately this is one of the drawbacks of versions prior to v18.x. The solution would be to upgrade to v18.x to avoid this disruptive behavior.

@kaykhancheckpoint (Author) commented Apr 9, 2022

@bryantbiggs Yeah, I thought that might be the case... unfortunately I can't upgrade right now.

I want to clean this up. Do you think it would be enough to just delete the autoscaling group manually in the AWS console and also remove the corresponding Terraform state?

geeiq-prod-k8s-ops-nodes20210422145135744400000006
terraform state rm module.eks.aws_autoscaling_group.workers[2]

And I guess the launch configuration too, right?

@bryantbiggs (Member)

You could try. In the end you need the order in state to match the order in your code (array index order, that is); otherwise all node groups after the affected index are at risk of re-creation.
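
A rough, untested sketch of what that state surgery could look like here, based purely on the resource addresses in the plan output above. Anything removed from state still exists in AWS and has to be cleaned up by hand, and a terraform plan afterwards should come back clean for the remaining groups if the indices line up:

# Illustration only — verify every address against your own state first.

# 1. Drop the old ops-nodes resources (index 2) from state:
terraform state rm 'module.eks.aws_autoscaling_group.workers[2]'
terraform state rm 'module.eks.aws_launch_configuration.workers[2]'
terraform state rm 'module.eks.aws_iam_instance_profile.workers[2]'
terraform state rm 'module.eks.random_pet.workers[2]'

# 2. Shift the new op-nodes resources (index 3) down so state order matches the code order:
terraform state mv 'module.eks.aws_autoscaling_group.workers[3]' 'module.eks.aws_autoscaling_group.workers[2]'
terraform state mv 'module.eks.aws_launch_configuration.workers[3]' 'module.eks.aws_launch_configuration.workers[2]'
terraform state mv 'module.eks.aws_iam_instance_profile.workers[3]' 'module.eks.aws_iam_instance_profile.workers[2]'
terraform state mv 'module.eks.random_pet.workers[3]' 'module.eks.random_pet.workers[2]'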

@kaykhancheckpoint (Author)

hmm ok

@bryantbiggs (Member)

Honestly, if you're going through all this, I would just upgrade and be done with this problem:

  1. Upgrade per the upgrade guide
  2. Set control plane settings to v17.x syntax to avoid replacement
  3. Remove node groups from Terraform state management (they still exist, just are no longer controlled by Terraform)
  4. Provision new node groups based on the existing settings
  5. Cordon and drain the old node groups (see the sketch below)
  6. Once the last pods are moved off the old node groups, manually delete them
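
For steps 3 and 5, a minimal sketch of the kind of commands involved. The node label comes from the kubelet_extra_args in the config above; the state address is a placeholder and the kubectl flags assume a reasonably recent kubectl:

terraform state rm 'module.eks.aws_autoscaling_group.workers[2]'                  # step 3: placeholder address for the old group
kubectl cordon -l geeiq/node-type=ops                                             # step 5: mark the old nodes unschedulable
kubectl drain -l geeiq/node-type=ops --ignore-daemonsets --delete-emptydir-data   # step 5: evict workloads off the old nodes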

@kaykhancheckpoint (Author)

As far as I remember, trying to upgrade from v15 -> v18 caused Terraform changes that would attempt to recreate the EKS cluster itself, so I stopped attempting that and just worked with what I have.

@bryantbiggs (Member) commented Apr 9, 2022

I don't know about coming from v15, but for most coming from v17 the following worked to avoid replacing the control plane:

prefix_separator                   = ""
iam_role_name                      = $CLUSTER_NAME
cluster_security_group_name        = $CLUSTER_NAME
cluster_security_group_description = "EKS cluster security group."

Ref: #1744 (comment)

@kaykhancheckpoint (Author) commented Apr 9, 2022

I don't know about coming from v15, but for most coming from v17 the following worked to avoid replacing the control plane:

prefix_separator                   = ""
iam_role_name                      = $CLUSTER_NAME
cluster_security_group_name        = $CLUSTER_NAME
cluster_security_group_description = "EKS cluster security group."

Ref: #1744 (comment)

Yeah, I am aware of that thread. I attempted that on a test environment and for me it was still attempting to recreate the EKS cluster itself. You can see I even made some comments on that thread and got stuck; it seems like v15 causes additional issues:

#1744 (comment)

@kaykhancheckpoint (Author) commented Apr 9, 2022

@bryantbiggs

Does this problem (in this issue) also occur in v17? If not, I might attempt a v15 -> v17 upgrade.

@bryantbiggs (Member)

Yes, there were a number of issues related to this in v17.x that we fixed in v18.x, such as #1105.

@kaykhancheckpoint (Author) commented Apr 9, 2022

Okay, I think the best option here is just to recreate the EKS cluster at some point and migrate my apps across.

The main problem with following the upgrade guides and the main upgrade help thread is that the Terraform state names people suggest manipulating do not correspond with what I have in my state file.

As a temporary solution, I've left the configuration for the existing/old ops-nodes group in place and set its min/desired node counts to 0.

@github-actions (bot)

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Nov 12, 2022