
new karpenter policy fails #2306

Closed
1 task done
FernandoMiguel opened this issue Nov 22, 2022 · 12 comments · Fixed by #2308

Comments

@FernandoMiguel
Contributor

Description

Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration (see the examples/* directory for references that you can copy+paste and tailor to match your configs if you are unable to copy your exact configuration). The reproduction MUST be executable by running terraform init && terraform apply without any further changes.

If your request is for a new feature, please use the Feature request template.

  • ✋ I have searched the open/closed issues and my issue is not listed.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which is hopefully the best practice you are following): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

  • Module version [Required]:

  • Terraform version:

Terraform v1.4.0-alpha20221109
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.40.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/http v3.2.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.16.0
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/vault v3.11.0
  • Provider version(s):

Reproduction Code [Required]

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "~> 18.31"

  cluster_name              = var.eks_cluster_id
  queue_name                = "${var.eks_cluster_id}-${random_uuid.random_uuid.result}"
  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role         = false
  create_instance_profile = false
  create_irsa             = false
}
output "node_termination_queue" { value = module.node_termination_queue }
output "node_termination_queue_queue_name" { value = module.node_termination_queue.queue_name }

Steps to reproduce the behavior:

Expected behavior

Actual behavior

│ Error: Invalid for_each argument
│
│   on .terraform/modules/base_system.karpenter.node_termination_queue/modules/karpenter/main.tf line 324, in resource "aws_iam_role_policy_attachment" "this":
│  324:   for_each = { for k, v in toset(compact([
│  325:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  326:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  327:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  328:   ])) : k => v if local.create_iam_role }
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.create_iam_role is false
│     │ local.iam_role_policy_prefix is a string, known only after apply
│     │ var.iam_role_attach_cni_policy is true
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys
│ that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map
│ values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to
│ fully converge.

Terminal Output Screenshot(s)

Additional context
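The error's own suggestion, keeping the `for_each` keys static and pushing the apply-time ARNs into the values, can be sketched roughly as follows. This is a hypothetical refactor of the module's attachment resource for illustration, not the module's actual code:

```hcl
resource "aws_iam_role_policy_attachment" "this" {
  # Keys are static strings, so Terraform can enumerate the resource
  # instances at plan time even though the policy ARNs themselves are
  # only known after apply.
  for_each = local.create_iam_role ? merge(
    {
      AmazonEKSWorkerNodePolicy          = "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy"
      AmazonEC2ContainerRegistryReadOnly = "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly"
    },
    # var.iam_role_attach_cni_policy is a plan-time-known bool, so this
    # merge argument is decidable even when local.cni_policy is unknown.
    var.iam_role_attach_cni_policy ? { cni = local.cni_policy } : {}
  ) : {}

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}
```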

@o-sama

o-sama commented Nov 22, 2022

This is the current way this is done for other resources (Fargate in the example below):

resource "aws_iam_role_policy_attachment" "this" {
  for_each = local.create_iam_role ? toset(compact(distinct(concat([
    "${local.policy_arn_prefix}/AmazonEKSClusterPolicy",
    "${local.policy_arn_prefix}/AmazonEKSVPCResourceController",
  ], var.iam_role_additional_policies)))) : toset([])

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}

This is how it's done in the Karpenter module:

resource "aws_iam_role_policy_attachment" "this" {
  for_each = { for k, v in toset(compact([
    "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
    "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
    var.iam_role_attach_cni_policy ? local.cni_policy : "",
  ])) : k => v if local.create_iam_role }

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}

Since the first example above works, refactoring to match it seems like the way to go: it should fix the issue and keep uniformity across the codebase. Thoughts?

locals {
  # renaming this since it's just a selector
  cni_policy_select = var.cluster_ip_family == "ipv6" ? "${local.iam_role_policy_prefix}/AmazonEKS_CNI_IPv6_Policy" : "${local.iam_role_policy_prefix}/AmazonEKS_CNI_Policy"
  # this is now defined for us to add to the for_each value
  cni_policy             = var.iam_role_attach_cni_policy ? local.cni_policy_select : ""
}
# don't need distinct() here since toset should do that already
resource "aws_iam_role_policy_attachment" "this" {
  for_each = local.create_iam_role ? toset(compact([
    "${local.policy_arn_prefix}/AmazonEKSWorkerNodePolicy",
    "${local.policy_arn_prefix}/AmazonEC2ContainerRegistryReadOnly",
    local.cni_policy,
  ])) : toset([])

  policy_arn = each.value
  role       = aws_iam_role.this[0].name
}

It's a little more verbose but should work. I will try to test this out later today before making a PR.

@RafPe

RafPe commented Nov 22, 2022

I can add another example which I just came across. Trying with just one extra input, create, so I could conditionally create the resources.

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "v18.31.1"

  create                          = var.create
  cluster_name                    = module.eks.cluster_name
  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]
  create_iam_role                 = false
  iam_role_arn                    = module.sys_eks_managed_node_group.iam_role_arn
  tags                            = merge(local.tags, {})

}

results in

│ Error: Invalid index
│
│   on .terraform/modules/eks.karpenter/modules/karpenter/main.tf line 160, in resource "aws_iam_policy" "irsa":
│  160:   policy      = data.aws_iam_policy_document.irsa[0].json
│     ├────────────────
│     │ data.aws_iam_policy_document.irsa is empty tuple
│
│ The given key does not identify an element in this collection value: the collection has no elements.

This indicates that the conditionals in the module are not correct (or am I missing something here)?
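The `Invalid index` above points at the guard on the `aws_iam_policy` resource diverging from the guard on its data source when `create = false`. A hypothetical fix (resource and variable names assumed from the error output) would mirror the same condition on both:

```hcl
data "aws_iam_policy_document" "irsa" {
  count = var.create && var.create_irsa ? 1 : 0
  # ... policy statements ...
}

resource "aws_iam_policy" "irsa" {
  # Must use the same condition as the data source; otherwise this
  # resource is planned while data.aws_iam_policy_document.irsa is an
  # empty tuple and the [0] index fails.
  count = var.create && var.create_irsa ? 1 : 0

  name_prefix = "KarpenterIRSA-" # hypothetical name
  policy      = data.aws_iam_policy_document.irsa[0].json
}
```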

@bryantbiggs
Member

I'm not able to reproduce - this works:

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "18.31.0"

  cluster_name = module.eks.cluster_name

  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]

  # Since Karpenter is running on an EKS Managed Node group,
  # we can re-use the role that was created for the node group
  create_iam_role = false
  iam_role_arn    = module.eks.eks_managed_node_groups["initial"].iam_role_arn
}

@RafPe

RafPe commented Nov 22, 2022

@bryantbiggs please try with create = false to conditionally disable/enable the module

module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "18.31.0"

  create = false 
  cluster_name = module.eks.cluster_name

  irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  irsa_namespace_service_accounts = ["karpenter:karpenter"]

  # Since Karpenter is running on an EKS Managed Node group,
  # we can re-use the role that was created for the node group
  create_iam_role = false
  iam_role_arn    = module.eks.eks_managed_node_groups["initial"].iam_role_arn
}

@FernandoMiguel
Contributor Author

@bryantbiggs we already have Karpenter in place, via Blueprints.
All we want to use from this module, for now, is NTH.
That's why that is all I've enabled.

Is this a supported use of this module?

@bryantbiggs
Member

@bryantbiggs we already have Karpenter in place, via Blueprints. All we want to use from this module, for now, is NTH. That's why that is all I've enabled.

Is this a supported use of this module?

absolutely - should be patched shortly with #2308

@antonbabenko
Member

This issue has been resolved in version 18.31.2 🎉

@FernandoMiguel
Contributor Author

Just tested this on 18.31.2 on a new cluster, and on the first apply it fails.

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = ">= 18.31.2, ~> 18.31"

  cluster_name              = var.eks_cluster_id
  queue_name                = var.eks_cluster_id
  # Queue name needs to match the cluster name due to IAM policies.
  # A queue name is case-sensitive and can have up to 80 characters; you can use alphanumeric characters, hyphens (-), and underscores (_).
  # Rule names cannot be longer than 64 characters, which limits cluster_name to 35 characters.
  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role            = false
  create_instance_profile    = false
  create_irsa                = false
  iam_role_attach_cni_policy = false
}
output "node_termination_queue" { value = module.node_termination_queue }
output "node_termination_queue_queue_name" { value = module.node_termination_queue.queue_name }
╷
│ Error: Invalid for_each argument
│
│   on .terraform/modules/base_system.karpenter.node_termination_queue/modules/karpenter/main.tf line 324, in resource "aws_iam_role_policy_attachment" "this":
│  324:   for_each = { for k, v in toset(compact([
│  325:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  326:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  327:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  328:   ])) : k => v if local.create_iam_role }
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.create_iam_role is false
│     │ local.iam_role_policy_prefix is a string, known only after apply
│     │ var.iam_role_attach_cni_policy is false
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to fully converge.

Please reopen this issue @antonbabenko

@antonbabenko antonbabenko reopened this Nov 26, 2022
@bryantbiggs
Member

closing since I am unable to reproduce:

Reproduction

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = ">= 18.31.2, ~> 18.31"

  cluster_name              = "example"
  queue_name                = "foo"
  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role            = false
  create_instance_profile    = false
  create_irsa                = false
  iam_role_attach_cni_policy = false
}

output "node_termination_queue" { 
  value = module.node_termination_queue 
}
output "node_termination_queue_queue_name" { 
  value = module.node_termination_queue.queue_name 
}

Output

Apply complete! Resources: 10 added, 0 changed, 0 destroyed.

Outputs:

node_termination_queue = {
  "event_rules" = {
    "health_event" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterHealthEvent-example"
      "description" = "Karpenter interrupt - AWS health event"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"AWS Health Event\"],\"source\":[\"aws.health\"]}"
      "id" = "KarpenterHealthEvent-example"
      "is_enabled" = true
      "name" = "KarpenterHealthEvent-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
    "instance_rebalance" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterInstanceRebalance-example"
      "description" = "Karpenter interrupt - EC2 instance rebalance recommendation"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"EC2 Instance Rebalance Recommendation\"],\"source\":[\"aws.ec2\"]}"
      "id" = "KarpenterInstanceRebalance-example"
      "is_enabled" = true
      "name" = "KarpenterInstanceRebalance-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
    "instance_state_change" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterInstanceStateChange-example"
      "description" = "Karpenter interrupt - EC2 instance state-change notification"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"EC2 Instance State-change Notification\"],\"source\":[\"aws.ec2\"]}"
      "id" = "KarpenterInstanceStateChange-example"
      "is_enabled" = true
      "name" = "KarpenterInstanceStateChange-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
    "spot_interupt" = {
      "arn" = "arn:aws:events:us-east-1:11111111111:rule/KarpenterSpotInterrupt-example"
      "description" = "Karpenter interrupt - EC2 spot instance interruption warning"
      "event_bus_name" = "default"
      "event_pattern" = "{\"detail-type\":[\"EC2 Spot Instance Interruption Warning\"],\"source\":[\"aws.ec2\"]}"
      "id" = "KarpenterSpotInterrupt-example"
      "is_enabled" = true
      "name" = "KarpenterSpotInterrupt-example"
      "name_prefix" = ""
      "role_arn" = ""
      "schedule_expression" = ""
      "tags" = tomap(null) /* of string */
      "tags_all" = tomap({})
    }
  }
  "instance_profile_arn" = null
  "instance_profile_id" = null
  "instance_profile_name" = null
  "instance_profile_unique" = null
  "irsa_arn" = null
  "irsa_name" = null
  "irsa_unique_id" = null
  "queue_arn" = "arn:aws:sqs:us-east-1:11111111111:foo"
  "queue_name" = "foo"
  "queue_url" = "https://sqs.us-east-1.amazonaws.com/660548353186/foo"
  "role_arn" = tostring(null)
  "role_name" = null
  "role_unique_id" = null
}
node_termination_queue_queue_name = "foo"

@FernandoMiguel
Contributor Author

I'll retest tomorrow.

@FernandoMiguel
Contributor Author

Just tested on a new cluster, using v19.4:

module "node_termination_queue" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "~> 19.4"

  cluster_name = var.eks_cluster_id
  # queue_name   = var.eks_cluster_id
  # Queue name needs to match the cluster name due to IAM policies.
  # A queue name is case-sensitive and can have up to 80 characters; you can use alphanumeric characters, hyphens (-), and underscores (_).
  # Rule names cannot be longer than 64 characters, which limits cluster_name to 35 characters.

  enable_spot_termination   = true
  queue_managed_sse_enabled = true

  create_iam_role            = false
  create_instance_profile    = false
  create_irsa                = false
  iam_role_attach_cni_policy = false
}
output "node_termination_queue" { value = module.node_termination_queue }
output "node_termination_queue_queue_name" { value = module.node_termination_queue.queue_name }
│ Error: Invalid for_each argument
│
│   on .terraform/modules/base_system.karpenter.node_termination_queue/modules/karpenter/main.tf line 327, in resource "aws_iam_role_policy_attachment" "this":
│  327:   for_each = { for k, v in toset(compact([
│  328:     "${local.iam_role_policy_prefix}/AmazonEKSWorkerNodePolicy",
│  329:     "${local.iam_role_policy_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  330:     var.iam_role_attach_cni_policy ? local.cni_policy : "",
│  331:   ])) : k => v if local.create_iam_role }
│     ├────────────────
│     │ local.cni_policy is a string, known only after apply
│     │ local.create_iam_role is false
│     │ local.iam_role_policy_prefix is a string, known only after apply
│     │ var.iam_role_attach_cni_policy is false
│
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full set of keys that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in the map values.
│
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second time to fully converge.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 23, 2023