hardcoded dash in name-prefix forces replacement of node groups when upgrading to 18x #2153

Closed
1 task done
zhujik opened this issue Jul 7, 2022 · 4 comments

zhujik commented Jul 7, 2022

Description

Due to the hard-coded trailing dash in name_prefix at https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/modules/eks-managed-node-group/main.tf#L49 and https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/modules/self-managed-node-group/main.tf#L55, upgrading the module from v17.x to v18.x always forces replacement of any node group whose name did not previously end in a dash. We should be able to circumvent this by specifying name_prefix ourselves; it would be enough to remove the hard-coded dash altogether.

  • ✋ I have searched the open/closed issues and my issue is not listed.

Versions

  • Module version [Required]:

Terraform v1.1.7
on linux_amd64

  • provider registry.terraform.io/hashicorp/aws v3.74.3
  • provider registry.terraform.io/hashicorp/cloudinit v2.2.0
  • provider registry.terraform.io/hashicorp/external v2.2.2
  • provider registry.terraform.io/hashicorp/helm v2.4.1
  • provider registry.terraform.io/hashicorp/kubernetes v2.10.0
  • provider registry.terraform.io/hashicorp/local v2.1.0
  • provider registry.terraform.io/hashicorp/null v3.1.1
  • provider registry.terraform.io/hashicorp/random v3.1.3
  • provider registry.terraform.io/hashicorp/time v0.7.2
  • provider registry.terraform.io/hashicorp/tls v3.4.0
  • provider registry.terraform.io/jfrog/artifactory v2.6.21
  • provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Reproduction Code [Required]

Steps to reproduce the behavior:

  • Create a node group (self-managed or managed) with module v17.x, using a name that has no trailing dash.
  • Upgrade the module to v18.x.

Expected behavior

There should be a way to avoid node group replacement.

Actual behavior

  • No matter your configuration, the node group is replaced.

@bryantbiggs
Member

This change was intentional and was made as part of a breaking change. Users can remove the current node group from Terraform control (i.e., terraform state rm ...), provision a new node group, cordon and drain the old node group, and then remove it once the last pod has migrated. This process should gracefully migrate pods from the old node group to the new one without disruption.
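
The steps above could be sketched roughly as the following script. It is only an illustration, not a procedure from the module's documentation: `OLD_STATE_ADDRESS` and `OLD_NODE_SELECTOR` are hypothetical placeholders (the state address is left elided in the comment above, and the label that selects the old nodes depends on your setup), and the `run` helper only prints each command unless `DRY_RUN=0` is set. It does not cover any IAM role or security group changes that the upgrade may also involve.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the node group migration. Nothing is executed unless
# DRY_RUN=0 is exported; by default each command is only printed.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"
# Hypothetical placeholders: set these to values from your own environment.
OLD_STATE_ADDRESS="${OLD_STATE_ADDRESS:-HYPOTHETICAL_STATE_ADDRESS}"
OLD_NODE_SELECTOR="${OLD_NODE_SELECTOR:-HYPOTHETICAL_NODE_LABEL=value}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

migrate_node_group() {
  # 1. Stop tracking the old node group so Terraform will not destroy it.
  run terraform state rm "$OLD_STATE_ADDRESS"
  # 2. Provision the new node group (created under the new name prefix).
  run terraform apply
  # 3. Prevent new pods from being scheduled on the old nodes.
  run kubectl cordon -l "$OLD_NODE_SELECTOR"
  # 4. Evict pods from the old nodes so they reschedule onto the new ones.
  run kubectl drain -l "$OLD_NODE_SELECTOR" --ignore-daemonsets --delete-emptydir-data
  # 5. Once the last pod has migrated, delete the old node group manually;
  #    it is no longer in Terraform state.
}

migrate_node_group
```

Running it with the defaults prints the four commands in order, which makes it easy to review the plan before setting `DRY_RUN=0`.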

zhujik commented Jul 22, 2022

@bryantbiggs please elaborate. It does not work that simply: IAM roles change, and security groups are shuffled around and replaced. As soon as I do that, I end up with nodes in the old (self-managed) node group in the NotReady state, and all pods on them are unreachable.

Edit: OK, I will first try the things mentioned in #1744; I only just discovered that discussion.

ghost commented Sep 20, 2022

+1

github-actions bot commented Nov 8, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 8, 2022