Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Breaking change" in 15.2.0? #1382

Closed
endrec opened this issue May 21, 2021 · 9 comments
Closed

"Breaking change" in 15.2.0? #1382

endrec opened this issue May 21, 2021 · 9 comments

Comments

@endrec
Copy link

endrec commented May 21, 2021

Description

Prior 15.2.0, when manage_cluster_iam_resources was enabled, cluster_iam_role_name was completely ignored.
Now, 15.2.0 started to use cluster_iam_role_name as a name override in this scenario.

Prior 15.2.0, it was safe to pass null in cluster_iam_role_name, now that forces a a whole cluster replacement on upgrade, as the change in #1199 looks for an empty string and does not accepts null.

Okay, it's not a breaking change as such, but it would worth a mention in the changelog, I believe. :)

Versions

  • Terraform: v0.14.5
  • Provider(s):
    • provider registry.terraform.io/hashicorp/aws v3.42.0
    • provider registry.terraform.io/hashicorp/cloudinit v2.2.0
    • provider registry.terraform.io/hashicorp/kubernetes v2.2.0
    • provider registry.terraform.io/hashicorp/local v2.1.0
    • provider registry.terraform.io/hashicorp/null v3.1.0
    • provider registry.terraform.io/hashicorp/random v3.1.0
    • provider registry.terraform.io/hashicorp/template v2.2.0
  • Module: 15.1.0 --> 15.2.0

Reproduction

Steps to reproduce the behavior:

  1. Create a cluster using 15.1.0 (or earlier), and set
    manage_cluster_iam_resources = true
    cluster_iam_role_name = null
    
    Observe the cluster is created.
  2. After the cluster is up, upgrade module version to 15.2.0, but do not change the settings
  3. Execute terraform plan and observe that cluster is going to be replaced.

Expected behavior

After upgrading the module, the cluster should not be replaced (there are 2 minor changes):

Plan: 0 to add, 2 to change, 0 to destroy.

Actual behavior

After the upgrade, the whole cluster is replaced:

Plan: 8 to add, 1 to change, 8 to destroy.

@daroga0002
Copy link
Contributor

It looks that this introduced bug as this logic looks without sense as condition is same:

name_prefix = var.cluster_iam_role_name != "" ? null : var.cluster_name
name = var.cluster_iam_role_name != "" ? var.cluster_iam_role_name : null

so in your case probably trying to replace fixed by name_prefix role.

Can you paste here changes from terrafom plan for resource:
aws_eks_cluster.this
to confirm this?

@endrec
Copy link
Author

endrec commented May 24, 2021

@daroga0002 , thanks for looking into this.

aws_eks_cluster.this is kind of at the end of the chain of events, module.eks.aws_iam_role.cluster[0] gets replaced because of the name/prefix change, and the new role forces everything else to be replaced.

Here is my output after removing identifiable information and some irrelevant details:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement
+/- create replacement and then destroy

Terraform will perform the following actions:

  # module.eks.aws_eks_cluster.this[0] must be replaced
+/- resource "aws_eks_cluster" "this" {
      ~ arn                       = "XXXX" -> (known after apply)
      [...]
      ~ role_arn                  = "XXXX" -> (known after apply) # forces replacement
      ~ status                    = "ACTIVE" -> (known after apply)
      [...]
    }

  # module.eks.aws_iam_openid_connect_provider.oidc_provider[0] must be replaced
-/+ resource "aws_iam_openid_connect_provider" "oidc_provider" {
      ~ arn             = "XXXX" -> (known after apply)
      [...]
      ~ url             = "XXXX" -> (known after apply) # forces replacement
      [...]
    }

  # module.eks.aws_iam_policy.cluster_elb_sl_role_creation[0] will be updated in-place
  ~ resource "aws_iam_policy" "cluster_elb_sl_role_creation" {
      [...]
    }

  # module.eks.aws_iam_role.cluster[0] must be replaced
+/- resource "aws_iam_role" "cluster" {
      ~ arn                   = "XXXX" -> (known after apply)
      [...]
      ~ name                  = "XXXX" -> (known after apply)
      - name_prefix           = "XXXX" -> null # forces replacement
      [...]
    }

  # module.eks.aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy[0] must be replaced
+/- resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
      [...] # role forces replacement
    }

  # module.eks.aws_iam_role_policy_attachment.cluster_AmazonEKSServicePolicy[0] must be replaced
+/- resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSServicePolicy" {
      [...] # role forces replacement
    }

  # module.eks.aws_iam_role_policy_attachment.cluster_AmazonEKSVPCResourceControllerPolicy[0] must be replaced
+/- resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSVPCResourceControllerPolicy" {
      [...] # role forces replacement
    }

  # module.eks.aws_iam_role_policy_attachment.cluster_elb_sl_role_creation[0] must be replaced
-/+ resource "aws_iam_role_policy_attachment" "cluster_elb_sl_role_creation" {
      [...] # role forces replacement
    }

  # module.eks.local_file.kubeconfig[0] must be replaced
-/+ resource "local_file" "kubeconfig" {
      ~ content              = <<-EOT
      [...]
        EOT -> (known after apply) # forces replacement
      [...]
    }

Plan: 8 to add, 1 to change, 8 to destroy.

@daroga0002
Copy link
Contributor

I have opened PR #1417 which should solve it.

@barryib

@barryib
Copy link
Member

barryib commented May 31, 2021

To me #1199 put the same logic on both var.workers_role_name and var.cluster_iam_role_name.

The point here, is we don't test if var.cluster_iam_role_name or var.workers_role_name are not null, we only check if there are empty.

Prior 15.2.0, when manage_cluster_iam_resources was enabled, cluster_iam_role_name was completely ignored.
Now, 15.2.0 started to use cluster_iam_role_name as a name override in this scenario.
Prior 15.2.0, it was safe to pass null in cluster_iam_role_name, now that forces a a whole cluster replacement on upgrade, as the change in #1199 looks for an empty string and does not accepts null.

It wasn't safe, you were relying on incorrect behavior when the module manage IAM role.

To fix this, what I see is one of following :

  • Unset your cluster_iam_role_name (remove it from your configuration), because you don't need it.
  • Set the cluster_iam_role_name="". This is the default value.
  • Change de tests inside module to check if cluster_iam_role_name != "" || cluster_iam_role_name != null. But we go this way, we should change this in all string verification within this module.

@endrec
Copy link
Author

endrec commented May 31, 2021

Thanks @daroga0002 and @barryib.

I think I unintentionally opened a can of worms... 😃

In my opinion, no change of the tests are necessary, however the behaviour should be better documented.

I'm using the module in a wrapper module, which may or may not set cluster_iam_role_name. Previously I set in null when my module were not going to specify the name, which I had to change to "" now. Not a big deal, really, I was just complaining that I had to figure this out by digging into the source of this module, instead of it being clear from the change log. 😄

@barryib
Copy link
Member

barryib commented May 31, 2021

@endrec Not a big deal. please, see the variable documentation. I think It was mentioned there. But feel free to update to add precision if needed.

Thanks for opening this question, we'll see how we can improve our docs.

@barryib
Copy link
Member

barryib commented May 31, 2021

So can we close this issue ?

@endrec
Copy link
Author

endrec commented May 31, 2021

can we close this issue ?

Yeah, I think we can. :)

@endrec endrec closed this as completed May 31, 2021
@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants