
new karpenter policy fails #2306

Closed
1 task done
FernandoMiguel opened this issue Nov 22, 2022 · 12 comments · Fixed by #2308

Comments

@FernandoMiguel
Contributor

Description

Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration (see the examples/* directory for references that you can copy+paste and tailor to match your configs if you are unable to copy your exact configuration). The reproduction MUST be executable by running terraform init && terraform apply without any further changes.

If your request is for a new feature, please use the Feature request template.

  • ✋ I have searched the open/closed issues and my issue is not listed.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which is hopefully the best practice you are following): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

  • Module version [Required]:

  • Terraform version:

Terraform v1.4.0-alpha20221109
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.40.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/http v3.2.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.16.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/vault v3.11.0
  • Provider version(s):

Reproduction Code [Required]

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "~> 18.31"

  cluster_name              = var.eks_cluster_id
  queue_name                = "${var.eks_cluster_id}-${random_uuid.random_uuid.result}"
  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role         = false
  create_instance_profile = false
  create_irsa             = false
}
output "node_termination_queue" { value = module.node_termination_queue }
output "node_termination_queue_queue_name" { value = module.node_termination_queue.queue_name }

Steps to reproduce the behavior:

Expected behavior

Actual behavior

│ Error: Invalid for_each argument
│
│   on .terraform/modules/base_system.karpenter.node_termination_queue/modules/karpenter/main.tf line 324, in resource "aws_iam_role_policy_attachment" "this":
│  324:   for_each = { for k, v in toset(compact([
│  325:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  326:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  327:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  328:   ])) : k => v if local.create_iam_role }
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.create_iam_role is false
│     │ local.iam_role_policy_prefix is a string, known only after apply
│     │ var.iam_role_attach_cni_policy is true
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys
│ that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map
│ values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to
│ fully converge.

Terminal Output Screenshot(s)

Additional context
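The error's own suggestion, keeping the `for_each` keys static and pushing the apply-time ARNs into the values, can be sketched roughly as follows. This is a hypothetical refactor of the module's attachment resource for illustration, not the module's actual code:

```hcl
resource "aws_iam_role_policy_attachment" "this" {
  # Keys are static strings, so Terraform can enumerate the resource
  # instances at plan time even though the policy ARNs themselves are
  # only known after apply.
  for_each = local.create_iam_role ? merge(
    {
      AmazonEKSWorkerNodePolicy          = "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy"
      AmazonEC2ContainerRegistryReadOnly = "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly"
    },
    # var.iam_role_attach_cni_policy is a plan-time-known bool, so this
    # merge argument is decidable even when local.cni_policy is unknown.
    var.iam_role_attach_cni_policy ? { cni = local.cni_policy } : {}
  ) : {}

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}
```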

@o-sama

o-sama commented Nov 22, 2022

This is the current way this is done for other resources (Fargate in the example below):

resource "aws_iam_role_policy_attachment" "this" {
  for_each = local.create_iam_role ? toset(compact(distinct(concat([
    "${local.policy_arn_prefix}/AmazonEKSClusterPolicy",
    "${local.policy_arn_prefix}/AmazonEKSVPCResourceController",
  ], var.iam_role_additional_policies)))) : toset([])

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}

This is how it's done in the Karpenter module:

resource "aws_iam_role_policy_attachment" "this" {
  for_each = { for k, v in toset(compact([
    "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
    "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
    var.iam_role_attach_cni_policy ? local.cni_policy : "",
  ])) : k => v if local.create_iam_role }

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}

Since the first example above works, refactoring to match it seems like the way to go: it should fix the issue and keep uniformity across the codebase. Thoughts?

locals {
  # renaming this since it's just a selector
  cni_policy_select = var.cluster_ip_family == "ipv6" ? "${local.iam_role_policy_prefix}/AmazonEKS_CNI_IPv6_Policy" : "${local.iam_role_policy_prefix}/AmazonEKS_CNI_Policy"
  # this is now defined for us to add to the for_each value
  cni_policy             = var.iam_role_attach_cni_policy ? local.cni_policy_select : ""
}
# don't need distinct() here since toset should do that already
resource "aws_iam_role_policy_attachment" "this" {
  for_each = local.create_iam_role ? toset(compact([
    "${local.policy_arn_prefix}/AmazonEKSWorkerNodePolicy",
    "${local.policy_arn_prefix}/AmazonEC2ContainerRegistryReadOnly",
    local.cni_policy,
  ])) : toset([])

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}

It's a little more verbose but should work. I will try to test this out later today before making a PR.

@RafPe

RafPe commented Nov 22, 2022

I can add another example which I just came across. Trying with just one extra input, create, so I could conditionally create the resources.

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "v18.31.1"

  create                          = var.create
  cluster_name                    = module.eks.cluster_name
  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]
  create_iam_role                 = false
  iam_role_arn                    = module.sys_eks_managed_node_group.iam_role_arn
  tags                            = merge(local.tags, {})

}

results in

│ Error: Invalid index
│
│   on .terraform/modules/eks.karpenter/modules/karpenter/main.tf line 160, in resource "aws_iam_policy" "irsa":
│  160:   policy      = data.aws_iam_policy_document.irsa[0].json
│     ├────────────────
│     │ data.aws_iam_policy_document.irsa is empty tuple
│
│ The given key does not identify an element in this collection value: the collection has no elements.

This indicates that the conditionals in the module are not correct (or am I missing something here)?
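The `Invalid index` above points at the guard on the `aws_iam_policy` resource diverging from the guard on its data source when `create = false`. A hypothetical fix (resource and variable names assumed from the error output) would mirror the same condition on both:

```hcl
data "aws_iam_policy_document" "irsa" {
  count = var.create && var.create_irsa ? 1 : 0
  # ... policy statements ...
}

resource "aws_iam_policy" "irsa" {
  # Must use the same condition as the data source; otherwise this
  # resource is planned while data.aws_iam_policy_document.irsa is an
  # empty tuple and the [0] index fails.
  count = var.create && var.create_irsa ? 1 : 0

  name_prefix = "KarpenterIRSA-" # hypothetical name
  policy      = data.aws_iam_policy_document.irsa[0].json
}
```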

@bryantbiggs
Member

I'm not able to reproduce - this works:

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "18.31.0"

  cluster_name = module.eks.cluster_name

  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]

  # Since Karpenter is running on an EKS Managed Node group,
  # we can re-use the role that was created for the node group
  create_iam_role = false
  iam_role_arn    = module.eks.eks_managed_node_groups["initial"].iam_role_arn
}

@RafPe

RafPe commented Nov 22, 2022

@bryantbiggs please try with create = false to conditionally disable/enable the module

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "18.31.0"

  create = false 
  cluster_name = module.eks.cluster_name

  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]

  # Since Karpenter is running on an EKS Managed Node group,
  # we can re-use the role that was created for the node group
  create_iam_role = false
  iam_role_arn    = module.eks.eks_managed_node_groups["initial"].iam_role_arn
}

@FernandoMiguel
Contributor Author

@bryantbiggs we already have Karpenter in place, via Blueprints.
All we want to use from this module, for now, is NTH.
That's why that is all I've enabled.

Is this a supported use of this module?

@bryantbiggs
Member

@bryantbiggs we already have Karpenter in place, via Blueprints. All we want to use from this module, for now, is NTH. That's why that is all I've enabled.

Is this a supported use of this module?

absolutely - should be patched shortly with #2308

@antonbabenko
Member

This issue has been resolved in version 18.31.2 🎉

@FernandoMiguel
Contributor Author

Just tested this on 18.31.2 on a new cluster, and on the first apply it fails.

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = ">= 18.31.2, ~> 18.31"

  cluster_name              = var.eks_cluster_id
  queue_name                = var.eks_cluster_id
  # Queue name needs to match the cluster name due to IAM policies.
  # A queue name is case-sensitive and can have up to 80 characters; you can use alphanumeric characters, hyphens (-), and underscores (_).
  # Rule names cannot be longer than 64 characters, which limits cluster_name to 35 characters.
  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role            = false
  create_instance_profile    = false
  create_irsa                = false
  iam_role_attach_cni_policy = false
}
output "node_termination_queue" { value = module.node_termination_queue }
output "node_termination_queue_queue_name" { value = module.node_termination_queue.queue_name }
╷
│ Error: Invalid for_each argument
│
│   on .terraform/modules/base_system.karpenter.node_termination_queue/modules/karpenter/main.tf line 324, in resource "aws_iam_role_policy_attachment" "this":
│  324:   for_each = { for k, v in toset(compact([
│  325:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  326:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  327:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  328:   ])) : k => v if local.create_iam_role }
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.create_iam_role is false
│     │ local.iam_role_policy_prefix is a string, known only after apply
│     │ var.iam_role_attach_cni_policy is false
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to fully converge.

Please reopen this issue @antonbabenko

@antonbabenko antonbabenko reopened this Nov 26, 2022
@bryantbiggs
Member

closing since I am unable to reproduce:

Reproduction

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = ">= 18.31.2, ~> 18.31"

  cluster_name              = "example"
  queue_name                = "foo"
  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role            = false
  create_instance_profile    = false
  create_irsa                = false
  iam_role_attach_cni_policy = false
}

output "node_termination_queue" { 
  value = module.node_termination_queue 
}
output "node_termination_queue_queue_name" { 
  value = module.node_termination_queue.queue_name 
}

Output

Apply complete! Resources: 10 added, 0 changed, 0 destroyed.

Outputs:

node_termination_queue = {
  "event_rules" = {
    "health_event" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterHealthEvent-example"
      "description" = "Karpenter interrupt - AWS health event"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"AWS Health Event\"],\"source\":[\"aws.health\"]}"
      "id" = "KarpenterHealthEvent-example"
      "is_enabled" = true
      "name" = "KarpenterHealthEvent-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
    "instance_rebalance" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterInstanceRebalance-example"
      "description" = "Karpenter interrupt - EC2 instance rebalance recommendation"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"EC2 Instance Rebalance Recommendation\"],\"source\":[\"aws.ec2\"]}"
      "id" = "KarpenterInstanceRebalance-example"
      "is_enabled" = true
      "name" = "KarpenterInstanceRebalance-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
    "instance_state_change" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterInstanceStateChange-example"
      "description" = "Karpenter interrupt - EC2 instance state-change notification"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"EC2 Instance State-change Notification\"],\"source\":[\"aws.ec2\"]}"
      "id" = "KarpenterInstanceStateChange-example"
      "is_enabled" = true
      "name" = "KarpenterInstanceStateChange-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
    "spot_interupt" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterSpotInterrupt-example"
      "description" = "Karpenter interrupt - EC2 spot instance interruption warning"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"EC2 Spot Instance Interruption Warning\"],\"source\":[\"aws.ec2\"]}"
      "id" = "KarpenterSpotInterrupt-example"
      "is_enabled" = true
      "name" = "KarpenterSpotInterrupt-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
  }
  "instance_profile_arn" = null
  "instance_profile_id" = null
  "instance_profile_name" = null
  "instance_profile_unique" = null
  "irsa_arn" = null
  "irsa_name" = null
  "irsa_unique_id" = null
  "queue_arn" = "arn:aws:sqs:us-east-1:11111111111:foo"
  "queue_name" = "foo"
  "queue_url" = "https://sqs.us-east-1.amazonaws.com/660548353186/foo"
  "role_arn" = tostring(null)
  "role_name" = null
  "role_unique_id" = null
}
node_termination_queue_queue_name = "foo"

@FernandoMiguel
Contributor Author

I'll retest tomorrow.

@FernandoMiguel
Contributor Author

Just tested on a new cluster, using v19.4:

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "~> 19.4"

  cluster_name = var.eks_cluster_id
  # queue_name   = var.eks_cluster_id
  # Queue name needs to match the cluster name due to IAM policies.
  # A queue name is case-sensitive and can have up to 80 characters; you can use alphanumeric characters, hyphens (-), and underscores (_).
  # Rule names cannot be longer than 64 characters, which limits cluster_name to 35 characters.

  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role            = false
  create_instance_profile    = false
  create_irsa                = false
  iam_role_attach_cni_policy = false
}
output "node_termination_queue" { value = module.node_termination_queue }
output "node_termination_queue_queue_name" { value = module.node_termination_queue.queue_name }
│ Error: Invalid for_each argument
│
│   on .terraform/modules/base_system.karpenter.node_termination_queue/modules/karpenter/main.tf line 327, in resource "aws_iam_role_policy_attachment" "this":
│  327:   for_each = { for k, v in toset(compact([
│  328:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  329:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  330:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  331:   ])) : k => v if local.create_iam_role }
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.create_iam_role is false
│     │ local.iam_role_policy_prefix is a string, known only after apply
│     │ var.iam_role_attach_cni_policy is false
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to fully converge.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 23, 2023