
Nodes fail to join cluster and CNI plugin not being added #1972

Closed
evenme opened this issue Mar 28, 2022 · 41 comments


@evenme

evenme commented Mar 28, 2022

Description

Using example/managed_node_group as the base, I'm creating a 2-node 1.18 cluster with a single managed node group. However, the node group is created without the CNI plugin, and the nodes are created but won't join the cluster due to:

Mar 28 17:36:43 ip-10-121-83-161.ec2.internal kubelet[3639]: W0328 17:36:43.574496    3639 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 28 17:36:45 ip-10-121-83-161.ec2.internal kubelet[3639]: E0328 17:36:45.126854    3639 kubelet.go:2217] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

The nodes are left NotReady, and the terraform error is:

Error: error waiting for EKS Node Group (pixlee-staging-eks:pixlee-staging-eks-od-nodegroup-20220322160025474300000001) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred: * i-05da0ff8c2fbc0400, i-0d4c5acd73dfbd105: NodeCreationFailure: Unhealthy nodes in the kubernetes cluster
with module.eks.module.eks_managed_node_group["ondemand"].aws_eks_node_group.this[0]
on .terraform/modules/eks/modules/eks-managed-node-group/main.tf line 269, in resource "aws_eks_node_group" "this":
resource "aws_eks_node_group" "this" {

Versions

  • Module version [Required]: v18

  • Terraform version:
    ❯ terraform providers -version
    Terraform v1.1.5
    on darwin_amd64

  • provider registry.terraform.io/gavinbunney/kubectl v1.13.1
  • provider registry.terraform.io/hashicorp/aws v3.75.0
  • provider registry.terraform.io/hashicorp/cloudinit v2.2.0
  • provider registry.terraform.io/hashicorp/kubernetes v1.13.4
  • provider registry.terraform.io/hashicorp/null v3.1.1
  • provider registry.terraform.io/hashicorp/tfe v0.25.3
  • provider registry.terraform.io/hashicorp/tls v2.2.0
  • provider registry.terraform.io/hashicorp/vault v2.24.1
  • provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Reproduction Code [Required]

module "eks" {
  source                          = "terraform-aws-modules/eks/aws"
  cluster_name                    = local.cluster_name
  cluster_version                 = var.cluster_version
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true
  vpc_id                          = local.vpc_id
  subnet_ids                      = local.private_subnets
  tags                            = var.tags

  cluster_addons = {
    coredns = {
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {}
    vpc-cni = {
      resolve_conflicts        = "OVERWRITE"
      service_account_role_arn = module.vpc_cni_irsa.iam_role_arn
    }
  }

  eks_managed_node_group_defaults = {
    ami_type                 = "AL2_x86_64"
    disk_size                = 100
    instance_types           = ["t3.medium", "t3.large", "t3.2xlarge"]
    force_update_version     = true
    capacity_type            = "ON_DEMAND"
    enable_monitoring        = true
    create_iam_role          = true
    iam_role_name            = "${local.cluster_name}-managed-node-group-role"
    iam_role_use_name_prefix = false
    create_launch_template   = false
    launch_template_name     = ""
    remote_access = {
      ec2_ssh_key               = local.key_name
      source_security_group_ids = [aws_security_group.remote_access.id]
    }
    iam_role_attach_cni_policy = true
  }

  eks_managed_node_groups = {
    ondemand = {
      name           = "${local.cluster_name}-nodegroup"
      subnet_ids     = local.private_subnets
      desired_size   = var.ondemand_min_instances
      min_size       = var.ondemand_min_instances
      max_size       = var.ondemand_max_instances
      instance_types = var.ondemand_instance_types
      update_config = {
        max_unavailable_percentage = 50 # or set `max_unavailable`
      }
    }
  }
}

module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.0"

  role_name_prefix      = "VPC-CNI-IRSA"
  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv4   = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }

  tags = var.tags
}

Steps to reproduce the behavior:

Using terraform cloud to plan and run it.

Expected behavior

Nodegroup created with coredns, kube-proxy and vpc-cni add-ons active + nodes joining the cluster successfully.

Actual behavior

No vpc-cni add-on is added, and the nodes fail with NodeCreationFailure: Unhealthy nodes in the kubernetes cluster.

@mghantous

We are experiencing this too for the first time today. The Ready: False message is:

container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

@mghantous

I've found a workaround that is a bit tedious and not really ideal.

First comment out eks_managed_node_groups and any cluster_addons besides vpc-cni and kube-proxy, then terraform init, terraform apply. Once the vpc-cni add-on is active, uncomment everything, terraform init, terraform apply again. The node groups will then create successfully.
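
For reference, the same two-phase workaround can be driven by a flag instead of commenting code in and out. This is only a sketch: the enable_node_groups variable, the placeholder cluster values, and the node group sizes are illustrative, not something the module provides.

variable "enable_node_groups" {
  type    = bool
  default = false
}

locals {
  # everything we eventually want; coredns and the node groups are held back
  # until the flag is flipped
  all_addons = {
    coredns    = { resolve_conflicts = "OVERWRITE" }
    kube-proxy = {}
    vpc-cni    = { resolve_conflicts = "OVERWRITE" }
  }
  all_node_groups = {
    default = {
      min_size     = 1
      max_size     = 3
      desired_size = 1
    }
  }
}

module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name    = "example"                      # placeholder
  cluster_version = "1.21"                         # placeholder
  vpc_id          = "vpc-0123456789abcdef0"        # placeholder
  subnet_ids      = ["subnet-aaaa", "subnet-bbbb"] # placeholders

  # phase 1 (flag false): only vpc-cni and kube-proxy, no node groups
  # phase 2 (flag true): coredns and the node groups are added
  cluster_addons          = { for k, v in local.all_addons : k => v if var.enable_node_groups || k != "coredns" }
  eks_managed_node_groups = { for k, v in local.all_node_groups : k => v if var.enable_node_groups }
}

Apply once with -var=enable_node_groups=false, wait for the vpc-cni add-on to show as active, then apply again with -var=enable_node_groups=true.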

@jam01

jam01 commented Mar 29, 2022

@mghantous we're facing a similar issue, and indeed commenting out the CoreDNS add-on and managed node group allowed the cluster to finally provision correctly. Adding back CoreDNS and the node group still failed for us, though.

Are you also provisioning a private (no NATs) cluster? If so, are there any considerations for the private endpoints, maybe?

@mghantous

We are provisioning private subnets, so I am not sure why it failed for you. Maybe check in the aws console to see if you have any new error messages under Node Conditions or any of the Add-ons? I am guessing it is probably not the "cni config uninitialized" error message I was seeing if your vpc-cni addon is active and ready.

Navigate to node conditions:
Configuration -> Compute -> <Node Group Name> -> Nodes -> <Node name> -> Conditions

Navigate to add-ons
Configuration -> Add-ons -> coredns -> Health Issues
Configuration -> Add-ons -> vpc-cni -> Health Issues

@evenme
Author

evenme commented Mar 29, 2022

We're provisioning to private subnets too, and worse, even if I manually add a managed node group to the provisioned cluster with a manually added vpc-cni, the nodes won't show up in the node group (though they are now healthy). There's something missing that I'm not picking up here.

@bryantbiggs
Member

what do you mean by

however the node group is created without the CNI plugin

By default, all EKS clusters are provisioned with CoreDNS, VPC CNI, and kube-proxy pods in order to bootstrap the cluster properly. Even if you do not enable any addons, these services are scheduled to run once nodes are provisioned

Reference: aws/containers-roadmap#923

@evenme
Author

evenme commented Mar 29, 2022

By default, all EKS clusters are provisioned with CoreDNS, VPC CNI, and kube-proxy pods in order to bootstrap the cluster properly. Even if you do not enable any addons, these services are scheduled to run once nodes are provisioned

Reference: aws/containers-roadmap#923

After the cluster is created, the nodes should be added but they aren't; they end up with an unhealthy status and won't join the node group. Then, when checking the add-ons, kube-proxy and coredns are there but vpc-cni is not. Manually adding the vpc-cni and re-running terraform fails saying the vpc-cni add-on is already there (Error: error creating EKS Add-On (pixlee-staging-eks:vpc-cni): ResourceInUseException: Addon already exists.).

@bryantbiggs
Member

Have you tried deploying example/managed_node_group as is? With the variables in your reproduction, it's hard to deploy what you currently have to take a deeper look.

@mghantous

mghantous commented Mar 29, 2022

what do you mean by

however the node group is created without the CNI plugin

By default, all EKS clusters are provisioned with CoreDNS, VPC CNI, and kube-proxy pods in order to bootstrap the cluster properly. Even if you do not enable any addons, these services are scheduled to run once nodes are provisioned

Reference: aws/containers-roadmap#923

Is it possible that this is something that broke with eks.5 (released March 10)? It seems at least the vpc-cni add-on is required now although I don't know of any docs that support that theory, and I don't see a way to set cluster_platform_version = "eks.4" in terraform or the aws console to check this.

https://docs.aws.amazon.com/eks/latest/userguide/platform-versions.html

@mghantous

I meant to mention, it seems you can repro this outside of terraform, just in the aws console, by deleting the vpc-cni add-on if you have one and then trying to add a nodegroup.

@evenme
Author

evenme commented Mar 31, 2022

Have you tried deploying example/managed_node_group as is? With the variables in your reproduction, it's hard to deploy what you currently have to take a deeper look.

Just did this now; I only used the 'complete' node group (removed the others), with a few optional portions commented out but pretty much everything else as is. Same issue:

Error: error waiting for EKS Node Group (***-eks:complete-eks-mng-20220331183923046300000001) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred:
│       * i-*****21a, i-*****b7c: NodeCreationFailure: Instances failed to join the kubernetes cluster

│   with module.eks.module.eks_managed_node_group["complete"].aws_eks_node_group.this[0],
│   on .terraform/modules/eks/modules/eks-managed-node-group/main.tf line 269, in resource "aws_eks_node_group" "this":
│  269: resource "aws_eks_node_group" "this" {

Same thing: the cluster is created and the node group is created, but the instances won't join the cluster. Also, NO add-ons were added to the cluster.

@bhuisgen

Same problem here with a public+private v1.21 cluster without this TF module: the nodes won't join. In fact, the docker images are not pulled for an unknown reason (DNS resolution check ok, VPC endpoint to ECR ok, cluster endpoint ok, role ok).

But with a private-only cluster, the nodes joined.

@anbotero

anbotero commented Apr 1, 2022

Just hit this. I can also replicate it without TF. For testing, I gave the node role AdministratorAccess and after a moment the node registered. I had tried giving it just ec2:DescribeNetworkInterfaces since I saw it in an error, but that didn't work; only AdministratorAccess did.

I can safely say it's not just that zone:
[screenshot: the same failure appearing across multiple availability zones in the region]

Still, it works just fine as soon as I slap the role with AdministratorAccess. Logs don't show any other required permission except ec2:DescribeNetworkInterfaces.

@anbotero

anbotero commented Apr 1, 2022

I've found a workaround that is a bit tedious and not really ideal.

First comment out eks_managed_node_groups and any cluster_addons besides vpc-cni and kube-proxy, then terraform init, terraform apply. Once the vpc-cni add-on is active, uncomment everything, terraform init, terraform apply again. The node groups will then create successfully.

Following this (although it feels bad doing sequential steps in templates) worked for me, thanks!

Steps:

  • In your templates, comment coredns Addon definition, and any node(group) creation
  • Run your plan and apply it
  • Add coredns Addon, add nodegroup definition, and enable iam_role_attach_cni_policy = true in the Node group definition
  • Run your plan and apply it
  • Set iam_role_attach_cni_policy = false and then add your VPC CNI IRSA module:
module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"

  role_name             = "vpc-cni"
  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv4   = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }

  tags = local.tags
}

@farrukh90

This is still an issue. Did anyone find a workaround besides the one shown above?

@bryantbiggs
Member

I am not able to reproduce with the examples we have here in this project. The information I am seeing is mixed and sporadic - hard to piece together a reproduction.

Also, the screenshot above by @anbotero shows an issue in the region which is not related to the module

@farrukh90

farrukh90 commented Apr 1, 2022

Here, maybe this code can help you reproduce


variable "vpc_config" {
  type = map(any)
  default = {
    region          = "region"
    cluster_version = "1.19"
    cluster_name    = "name"
    instance_type   = "t2.large"
    asg_max_size  = 10
    asg_min_size  = 3
    asg_desired_capacity = 3
    vpc_id   = "vpc-"
    subnet1  = "subnet-"
    subnet2  = "subnet-"
    subnet3  = "subnet-"
  }
}
data "aws_eks_cluster" "cluster" {
  name = module.my-cluster.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.my-cluster.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
  version                = "~> 1.9"
}

module "my-cluster" {
  source          = "terraform-aws-modules/eks/aws"
  version         = "17.1.0"
  cluster_name    = "${var.vpc_config["cluster_name"]}"
  cluster_version = var.vpc_config["cluster_version"]
  subnets         = [var.vpc_config["subnet1"], var.vpc_config["subnet2"], var.vpc_config["subnet3"]]
  vpc_id          = var.vpc_config["vpc_id"]
  worker_additional_security_group_ids			= [security_group_id]


  worker_groups = [
    {
      instance_type = var.vpc_config["instance_type"]
      asg_max_size  = var.vpc_config["asg_max_size"]
      asg_min_size  = var.vpc_config["asg_min_size"]
      asg_desired_capacity  = var.vpc_config["asg_desired_capacity"]
      key_name        = aws_key_pair.generated_key.key_name
      create_launch_template = true

    }
  ]
}

It creates the instance group and the EKS cluster, but the instance group does not join EKS.

@bryantbiggs
Member

@farrukh90 that's v17 though - we're on v18

@mghantous

@bryantbiggs if you delete the vpc-cni add-on manually and then try to add a nodegroup, you do not have any issue with the nodes joining? I know some of the reports are sporadic, but that is what I am seeing. Maybe the order of operations matters even though those are outside of our control. For example if your vpc-cni add-on applies before the nodegroup applies, you are ok, but if it doesn't you have the issue?

@bryantbiggs
Member

@bryantbiggs if you delete the vpc-cni add-on manually and then try to add a nodegroup, you do not have any issue with the nodes joining? I know some of the reports are sporadic, but that is what I am seeing. Maybe the order of operations matters even though those are outside of our control. For example if your vpc-cni add-on applies before the nodegroup applies, you are ok, but if it doesn't you have the issue?

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

@evenme
Author

evenme commented Apr 1, 2022

This is the code I used, very little changed from the example/managed_nodegroup (only the complete part, as I don't need/use bottlerocket or containerd or the custom_ami):

module "eks" {
  create                          = true
  source                          = "terraform-aws-modules/eks/aws"
  cluster_name                    = local.cluster_name
  cluster_version                 = var.cluster_version
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true
  vpc_id                          = local.vpc_id
  subnet_ids                      = local.private_subnets
  tags                            = var.tags


  # IPV4
  cluster_ip_family = "ipv4"

  cluster_addons = {
    coredns = {
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {}
    vpc-cni = {
      resolve_conflicts        = "OVERWRITE"
      service_account_role_arn = module.vpc_cni_irsa.iam_role_arn
    }
  }

  cluster_encryption_config = [{
    provider_key_arn = aws_kms_key.eks.arn
    resources        = ["secrets"]
  }]

  # # Extend cluster security group rules
  cluster_security_group_additional_rules = {
    egress_nodes_ephemeral_ports_tcp = {
      description                = "To node 1025-65535"
      protocol                   = "tcp"
      from_port                  = 1025
      to_port                    = 65535
      type                       = "egress"
      source_node_security_group = true
    }
  }

  # Extend node-to-node security group rules
  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
    egress_all = {
      description      = "Node all egress"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      type             = "egress"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
  }

  eks_managed_node_group_defaults = {
    ami_type       = "AL2_x86_64"
    disk_size      = 50
    instance_types = var.ondemand_instance_types

    # We are using the IRSA created below for permissions
    # However, we have to deploy with the policy attached FIRST (when creating a fresh cluster)
    # and then turn this off after the cluster/node group is created. Without this initial policy,
    # the VPC CNI fails to assign IPs and nodes cannot join the cluster
    # See https://github.com/aws/containers-roadmap/issues/1666 for more context
    iam_role_attach_cni_policy = true
  }

  eks_managed_node_groups = {
    # Complete
    complete = {
      name            = "complete-eks-mng"
      use_name_prefix = true

      subnet_ids = local.private_subnets

      min_size     = var.ondemand_min_instances
      max_size     = var.ondemand_max_instances
      desired_size = var.ondemand_min_instances

      ami_id = data.aws_ami.eks_default.image_id

      capacity_type        = "ON_DEMAND"
      force_update_version = true
      labels               = local.k8s_labels

      update_config = {
        max_unavailable_percentage = 50 # or set `max_unavailable`
      }

      description = "EKS managed node group example launch template"

      ebs_optimized           = true
      vpc_security_group_ids  = [aws_security_group.additional.id]
      disable_api_termination = false
      enable_monitoring       = true

      create_iam_role          = true
      iam_role_name            = "${local.cluster_name}-managed-node-group-complete-example"
      iam_role_use_name_prefix = false
      iam_role_description     = "${local.cluster_name}-EKS managed node group complete example role"
      iam_role_additional_policies = [
        "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
      ]

      create_security_group          = true
      security_group_name            = "${local.cluster_name}-eks-managed-node-group-complete-example"
      security_group_use_name_prefix = false
      security_group_description     = "${local.cluster_name}-EKS managed node group complete example security group"
      security_group_rules = {
        phoneOut = {
          description = "Hello CloudFlare"
          protocol    = "udp"
          from_port   = 53
          to_port     = 53
          type        = "egress"
          cidr_blocks = ["1.1.1.1/32"]
        }
        phoneHome = {
          description                   = "Hello cluster"
          protocol                      = "udp"
          from_port                     = 53
          to_port                       = 53
          type                          = "egress"
          source_cluster_security_group = true # bit of reflection lookup
        }
      }
      tags = local.node_tags
    }
  }
}

resource "aws_iam_role_policy_attachment" "additional" {
  for_each = module.eks.eks_managed_node_groups

  policy_arn = aws_iam_policy.node_additional.arn
  role       = each.value.iam_role_name
}

resource "aws_iam_policy" "node_additional" {
  name        = "${local.cluster_name}-additional"
  description = "Example usage of node additional policy"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "ec2:Describe*",
        ]
        Effect   = "Allow"
        Resource = "*"
      },
    ]
  })

  tags = var.tags
}

locals {
  kubeconfig = yamlencode({
    apiVersion      = "v1"
    kind            = "Config"
    current-context = "terraform"
    clusters = [{
      name = module.eks.cluster_id
      cluster = {
        certificate-authority-data = module.eks.cluster_certificate_authority_data
        server                     = module.eks.cluster_endpoint
      }
    }]
    contexts = [{
      name = "terraform"
      context = {
        cluster = module.eks.cluster_id
        user    = "terraform"
      }
    }]
    users = [{
      name = "terraform"
      user = {
        token = data.aws_eks_cluster_auth.cluster.token
      }
    }]
  })
}

resource "null_resource" "patch" {
  triggers = {
    kubeconfig = base64encode(local.kubeconfig)
    cmd_patch  = "kubectl patch configmap/aws-auth --patch \"${module.eks.aws_auth_configmap_yaml}\" -n kube-system --kubeconfig <(echo $KUBECONFIG | base64 --decode)"
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    environment = {
      KUBECONFIG = self.triggers.kubeconfig
    }
    command = self.triggers.cmd_patch
  }
}

module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.12"

  role_name_prefix      = "VPC-CNI-IRSA"
  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv6   = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }

  tags = var.tags
}


resource "aws_security_group" "remote_access" {
  name_prefix = "${local.cluster_name}-remote-access"
  description = "Allow remote SSH access"
  vpc_id      = local.vpc_id

  ingress {
    description = "SSH access"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = var.tags
}



resource "aws_kms_key" "eks" {
  description             = "EKS Secret Encryption Key"
  deletion_window_in_days = 7
  enable_key_rotation     = true

  tags = var.tags
}

resource "aws_kms_key" "ebs" {
  description             = "Customer managed key to encrypt EKS managed node group volumes"
  deletion_window_in_days = 7
  policy                  = data.aws_iam_policy_document.ebs.json
}

# This policy is required for the KMS key used for EKS root volumes, so the cluster is allowed to enc/dec/attach encrypted EBS volumes
data "aws_iam_policy_document" "ebs" {
  # Copy of default KMS policy that lets you manage it
  statement {
    sid       = "Enable IAM User Permissions"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }

  # Required for EKS
  statement {
    sid = "Allow service-linked role use of the CMK"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", # required for the ASG to manage encrypted volumes for nodes
        module.eks.cluster_iam_role_arn,                                                                                                            # required for the cluster / persistentvolume-controller to create encrypted PVCs
      ]
    }
  }

  statement {
    sid       = "Allow attachment of persistent resources"
    actions   = ["kms:CreateGrant"]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", # required for the ASG to manage encrypted volumes for nodes
        module.eks.cluster_iam_role_arn,                                                                                                            # required for the cluster / persistentvolume-controller to create encrypted PVCs
      ]
    }

    condition {
      test     = "Bool"
      variable = "kms:GrantIsForAWSResource"
      values   = ["true"]
    }
  }
}


resource "aws_security_group" "additional" {
    name_prefix = "${local.cluster_name}-additional"
    vpc_id      = local.vpc_id
  
    ingress {
      from_port = 22
      to_port   = 22
      protocol  = "tcp"
      cidr_blocks = [
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
      ]
    }
  
    tags = var.tags
}

This created the cluster but no add-ons came up. Commenting out the eks_managed_node_groups portion plus coredns made vpc-cni and kube-proxy get deployed, but running again with eks_managed_node_groups and coredns created the node group while the instances still failed to join the cluster. I was using version 1.18 (deployed this yesterday).

@mghantous

@bryantbiggs if you delete the vpc-cni add-on manually and then try to add a nodegroup, you do not have any issue with the nodes joining? I know some of the reports are sporadic, but that is what I am seeing. Maybe the order of operations matters even though those are outside of our control. For example if your vpc-cni add-on applies before the nodegroup applies, you are ok, but if it doesn't you have the issue?

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

So how do I ensure terraform will apply the vpc-cni addon first before the nodegroup is created? It is not doing that, so applying the plan fails.

@bryantbiggs
Member

@bryantbiggs if you delete the vpc-cni add-on manually and then try to add a nodegroup, you do not have any issue with the nodes joining? I know some of the reports are sporadic, but that is what I am seeing. Maybe the order of operations matters even though those are outside of our control. For example if your vpc-cni add-on applies before the nodegroup applies, you are ok, but if it doesn't you have the issue?

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

So how do I ensure terraform will apply the vpc-cni addon first before the nodegroup is created? It is not doing that, so applying the plan fails.

I don't follow. By default on every EKS cluster, the vpc-cni is provisioned out of the box. When you opt in to using the EKS addon, you are just taking control over its configuration/management - basically "adopting" the vpc-cni config that is on the cluster so you can make changes or remove if you want. If you don't have any addons, the vpc-cni still will be running in the cluster.

@mghantous

mghantous commented Apr 1, 2022

If you don't have any addons, the vpc-cni still will be running in the cluster.

So I think that is my problem: for some reason it is not running without the addon.

@evenme
Author

evenme commented Apr 1, 2022

@bryantbiggs if you delete the vpc-cni add-on manually and then try to add a nodegroup, you do not have any issue with the nodes joining? I know some of the reports are sporadic, but that is what I am seeing. Maybe the order of operations matters even though those are outside of our control. For example if your vpc-cni add-on applies before the nodegroup applies, you are ok, but if it doesn't you have the issue?

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

So how do I ensure terraform will apply the vpc-cni addon first before the nodegroup is created? It is not doing that, so applying the plan fails.

I don't follow. By default on every EKS cluster, the vpc-cni is provisioned out of the box. When you opt in to using the EKS addon, you are just taking control over its configuration/management - basically "adopting" the vpc-cni config that is on the cluster so you can make changes or remove if you want. If you don't have any addons, the vpc-cni still will be running in the cluster.

While trying this for over 2 weeks, I can say that for this issue, the vpc-cni is NOT being created before creating the node_groups. But even after the vpc-cni gets created somehow (either manually or running terraform without the nodegroup, with just vpc-cni and kube-proxy), the nodes are not joining the cluster.

@bryantbiggs
Member

This is the code I used, very little changed from the example/managed_nodegroup (only the complete part, as I don't need/use bottlerocket or containerd or the custom_ami):

I can't repro because I don't know what the variables are

@bryantbiggs
Member

@bryantbiggs if you delete the vpc-cni add-on manually and then try to add a nodegroup, you do not have any issue with the nodes joining? I know some of the reports are sporadic, but that is what I am seeing. Maybe the order of operations matters even though those are outside of our control. For example if your vpc-cni add-on applies before the nodegroup applies, you are ok, but if it doesn't you have the issue?

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

So how do I ensure terraform will apply the vpc-cni addon first before the nodegroup is created? It is not doing that, so applying the plan fails.

I don't follow. By default on every EKS cluster, the vpc-cni is provisioned out of the box. When you opt in to using the EKS addon, you are just taking control over its configuration/management - basically "adopting" the vpc-cni config that is on the cluster so you can make changes or remove if you want. If you don't have any addons, the vpc-cni still will be running in the cluster.

While trying this for over 2 weeks, I can say that for this issue, the vpc-cni is NOT being created before creating the node_groups. But even after the vpc-cni gets created somehow (either manually or running terraform without the nodegroup, with just vpc-cni and kube-proxy), the nodes are not joining the cluster.

I'm sorry - I really don't follow what you are saying here. It would be REALLY helpful to have

  1. Here is a fully deployable code repro (no variables, no locals, etc.)
  2. First I did this (i.e. - terraform init)
  3. Then I did this (i.e. - terraform apply)
  4. ...

@evenme
Author

evenme commented Apr 1, 2022

This is the code I used, very little changed from the example/managed_nodegroup (only the complete part, as I don't need/use bottlerocket or containerd or the custom_ami):

I can't repro because I don't know what the variables are

I'm using an existing vpc (basically getting from data "terraform_remote_state"), one that already has a cluster working. Can I give you the code without variables, except for the vpc/subnets portion?

@bryantbiggs
Member

This is the code I used, very little changed from the example/managed_nodegroup (only the complete part, as I don't need/use bottlerocket or containerd or the custom_ami):

I can't repro because I don't know what the variables are

I'm using an existing vpc (basically getting from data "terraform_remote_state"), one that already has a cluster working. Can I give you the code without variables, except for the vpc/subnets portion?

Not really, because VPC networking can have a big effect on this, and not knowing how your VPC is set up won't help much. You can take one of the examples - copy+paste it somewhere, modify it to match your setup, deploy it and ensure you are seeing the same issue, then paste it here.

@mghantous

mghantous commented Apr 1, 2022

Sorry @bryantbiggs I am trying to understand these two statements

If you don't have any addons, the vpc-cni still will be running in the cluster.

Ok. It is not for me for some reason, but it makes sense that it is supposed to work like that.

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

That sounds contradictory because you are saying the addon is only for "taking control over its configuration/management". So if I delete it, shouldn't nodes still work? Or once it's added I can no longer go back to deleting it?

Maybe it is because I do not have a CNI policy attached to the nodegroup role? Only the vpc-cni addon service account.
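
If that is the case, one way to check is to attach the managed CNI policy directly to the node group role, which is what the module does when iam_role_attach_cni_policy = true. A sketch, reusing the module output the same way the reproduction above attaches its additional policy:

resource "aws_iam_role_policy_attachment" "cni" {
  for_each = module.eks.eks_managed_node_groups

  # AWS managed policy the VPC CNI needs when it runs under the node role
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = each.value.iam_role_name
}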

@bryantbiggs
Member

Sorry @bryantbiggs I am trying to understand these two statements

If you don't have any addons, the vpc-cni still will be running in the cluster.

Ok. It is not for me for some reason, but it makes sense that it is supposed to work like that.

if you delete the vpc-cni, nodes won't join because the pod networking is gone and pods/nodes won't be able to connect with the control plane. You need a network plugin running for nodes to register with the control plane

That sounds contradictory because you are saying the addon is only for "taking control over its configuration/management". So if I delete it, shouldn't nodes still work? Or once it's added I can no longer go back to deleting it?

Maybe it is because I do not have a CNI policy attached to the nodegroup role? Only the vpc-cni addon service account.

apologies but I am going to have to refer back to here #1972 (comment)

@bryantbiggs
Member

@evenme did you deploy that config and verify it reproduces the issue?

@bryantbiggs
Member

You are specifying an AMI which means you now need to provide the bootstrap user data. You can:

  1. Not supply the AMI ID
  2. Have the module add the bootstrap user data with enable_bootstrap_user_data = true reference
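
A sketch of option 2 against the "complete" node group from the reproduction above (assuming its data.aws_ami.eks_default data source); only ami_id and enable_bootstrap_user_data matter here, the sizes are illustrative:

  eks_managed_node_groups = {
    complete = {
      # a pinned/custom AMI means EKS no longer merges in the join user data,
      # so ask the module to render the bootstrap script itself
      ami_id                     = data.aws_ami.eks_default.image_id
      enable_bootstrap_user_data = true

      min_size     = 2
      max_size     = 5
      desired_size = 2
    }
  }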

@evenme
Author

evenme commented Apr 1, 2022

@evenme did you deploy that config and verify it reproduces the issue?

You are specifying an AMI which means you now need to provide the bootstrap user data. You can:

  1. Not supply the AMI ID
  2. Have the module add the bootstrap user data with enable_bootstrap_user_data = true reference

Doesn't the data filter get the right AMI, like in the example? At least the deployed code does create the instances with the right AMI; they just hang there without joining the cluster.

@bryantbiggs
Member

@evenme did you deploy that config and verify it reproduces the issue?

You are specifying an AMI which means you now need to provide the bootstrap user data. You can:

  1. Not supply the AMI ID
  2. Have the module add the bootstrap user data with enable_bootstrap_user_data = true reference

Doesn't the data filter get the right AMI, like in the example? At least the deployed code does create the instances with the right AMI; they just hang there without joining the cluster.

The examples are for demonstrating all the different ways you can use the module as well as for testing changes. Unless you need to use a specific AMI, you don't need to tell EKS managed node groups which specific AMI to use. Instead, you can specify the AMI type which will pull the proper AMI based on the type selected https://docs.aws.amazon.com/eks/latest/APIReference/API_Nodegroup.html#AmazonEKS-Type-Nodegroup-amiType

https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-custom-ami
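
A sketch of option 1, dropping ami_id entirely so EKS resolves the AMI from the node group's AMI type (sizes illustrative):

  eks_managed_node_groups = {
    complete = {
      ami_type = "AL2_x86_64" # EKS picks the matching AMI for the cluster version

      min_size     = 2
      max_size     = 5
      desired_size = 2
    }
  }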

@evenme
Author

evenme commented Apr 1, 2022

As @bryantbiggs asked, I just deployed this code:

locals {
  cluster_name = "staging2-eks"
  tags = {
    terraform   = "true"
    environment = "stag2"
    usage       = "eks"
  }
  k8s_labels = {
    environment = "stag2"
    region      = "us-east-1"
  }
  node_tags = {
    "k8s.io/cluster-autoscaler/enabled"      = "true"
    "k8s.io/cluster-autoscaler/staging2-eks" = "owned"
  }
}

data "aws_availability_zones" "available" {}

data "aws_caller_identity" "current" {}

data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.1"

  name = "${local.cluster_name}-vpc"
  cidr = "10.124.0.0/16"
  private_subnets = [
    "10.124.0.0/19",
    "10.124.32.0/19",
    "10.124.64.0/19"
    ## 10.124.96.0/19	as spare
  ]
  public_subnets = [
    "10.124.128.0/19",
    "10.124.160.0/19",
    "10.124.192.0/19"
    ## 10.124.224.0/19 as spare
  ]
  azs                  = data.aws_availability_zones.available.names
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true
  public_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                      = "1"
  }
  private_subnet_tags = {
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = "1"
  }
}


module "eks" {
  create                          = true
  source                          = "terraform-aws-modules/eks/aws"
  cluster_name                    = local.cluster_name
  cluster_version                 = "1.18"
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true
  vpc_id                          = module.vpc.vpc_id
  subnet_ids                      = module.vpc.private_subnets
  tags                            = local.tags


  # IPV4
  cluster_ip_family = "ipv4"

  cluster_addons = {
    coredns = {
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {}
    vpc-cni = {
      resolve_conflicts        = "OVERWRITE"
      service_account_role_arn = module.vpc_cni_irsa.iam_role_arn
    }
  }

  cluster_encryption_config = [{
    provider_key_arn = aws_kms_key.eks.arn
    resources        = ["secrets"]
  }]

  # # Extend cluster security group rules
  cluster_security_group_additional_rules = {
    egress_nodes_ephemeral_ports_tcp = {
      description                = "To node 1025-65535"
      protocol                   = "tcp"
      from_port                  = 1025
      to_port                    = 65535
      type                       = "egress"
      source_node_security_group = true
    }
  }

  # Extend node-to-node security group rules
  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
    egress_all = {
      description      = "Node all egress"
      protocol         = "-1"
      from_port        = 0
      to_port          = 0
      type             = "egress"
      cidr_blocks      = ["0.0.0.0/0"]
      ipv6_cidr_blocks = ["::/0"]
    }
  }

  eks_managed_node_group_defaults = {
    ami_type       = "AL2_x86_64"
    disk_size      = 50
    instance_types = ["t3.2xlarge", "t3.large", "t3.medium"]

    # We are using the IRSA created below for permissions
    # However, we have to deploy with the policy attached FIRST (when creating a fresh cluster)
    # and then turn this off after the cluster/node group is created. Without this initial policy,
    # the VPC CNI fails to assign IPs and nodes cannot join the cluster
    # See https://github.com/aws/containers-roadmap/issues/1666 for more context
    iam_role_attach_cni_policy = true
  }

  eks_managed_node_groups = {
    # Complete
    complete = {
      name            = "complete-eks-mng"
      use_name_prefix = true

      subnet_ids = module.vpc.private_subnets

      min_size     = 2
      max_size     = 5
      desired_size = 2

      capacity_type        = "ON_DEMAND"
      force_update_version = true
      labels               = local.k8s_labels

      update_config = {
        max_unavailable_percentage = 50 # or set `max_unavailable`
      }

      description = "EKS managed node group example launch template"

      ebs_optimized            = true
      vpc_security_group_ids   = [aws_security_group.additional.id]
      disable_api_termination  = false
      enable_monitoring        = true
      create_iam_role          = true
      iam_role_name            = "${local.cluster_name}-managed-node-group-complete-example"
      iam_role_use_name_prefix = false
      iam_role_description     = "${local.cluster_name} EKS managed node group complete example role"
      # iam_role_tags = {
      #   Purpose = "Protector of the kubelet"
      # }
      iam_role_additional_policies = [
        "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
      ]

      create_security_group          = true
      security_group_name            = "${local.cluster_name}-managed-node-group-complete-example"
      security_group_use_name_prefix = false
      security_group_description     = "${local.cluster_name} EKS managed node group complete example security group"
      security_group_rules = {
        phoneOut = {
          description = "Hello CloudFlare"
          protocol    = "udp"
          from_port   = 53
          to_port     = 53
          type        = "egress"
          cidr_blocks = ["1.1.1.1/32"]
        }
        phoneHome = {
          description                   = "Hello cluster"
          protocol                      = "udp"
          from_port                     = 53
          to_port                       = 53
          type                          = "egress"
          source_cluster_security_group = true # bit of reflection lookup
        }
      }
      # security_group_tags = {
      #   Purpose = "Protector of the kubelet"
      # }
      # remote_access = {
      #   ec2_ssh_key               = local.key_name
      #   source_security_group_ids = [aws_security_group.remote_access.id]
      # }
      tags = local.node_tags
    }
  }
}

resource "aws_iam_role_policy_attachment" "additional" {
  for_each = module.eks.eks_managed_node_groups

  policy_arn = aws_iam_policy.node_additional.arn
  role       = each.value.iam_role_name
}

resource "aws_iam_policy" "node_additional" {
  name        = "${local.cluster_name}-additional"
  description = "Example usage of node additional policy"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "ec2:Describe*",
        ]
        Effect   = "Allow"
        Resource = "*"
      },
    ]
  })

  tags = local.tags
}

locals {
  kubeconfig = yamlencode({
    apiVersion      = "v1"
    kind            = "Config"
    current-context = "terraform"
    clusters = [{
      name = module.eks.cluster_id
      cluster = {
        certificate-authority-data = module.eks.cluster_certificate_authority_data
        server                     = module.eks.cluster_endpoint
      }
    }]
    contexts = [{
      name = "terraform"
      context = {
        cluster = module.eks.cluster_id
        user    = "terraform"
      }
    }]
    users = [{
      name = "terraform"
      user = {
        token = data.aws_eks_cluster_auth.cluster.token
      }
    }]
  })
}

resource "null_resource" "patch" {
  triggers = {
    kubeconfig = base64encode(local.kubeconfig)
    cmd_patch  = "kubectl patch configmap/aws-auth --patch \"${module.eks.aws_auth_configmap_yaml}\" -n kube-system --kubeconfig <(echo $KUBECONFIG | base64 --decode)"
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    environment = {
      KUBECONFIG = self.triggers.kubeconfig
    }
    command = self.triggers.cmd_patch
  }
}

module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.12"

  role_name_prefix      = "VPC-CNI-IRSA"
  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv6   = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }

  tags = local.tags
}


resource "aws_security_group" "remote_access" {
  name_prefix = "${local.cluster_name}-remote-access"
  description = "Allow remote SSH access"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "SSH access"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = local.tags
}



resource "aws_kms_key" "eks" {
  description             = "EKS Secret Encryption Key"
  deletion_window_in_days = 7
  enable_key_rotation     = true

  tags = local.tags
}

resource "aws_kms_key" "ebs" {
  description             = "Customer managed key to encrypt EKS managed node group volumes"
  deletion_window_in_days = 7
  policy                  = data.aws_iam_policy_document.ebs.json
}

# This policy is required for the KMS key used for EKS root volumes, so the cluster is allowed to enc/dec/attach encrypted EBS volumes
data "aws_iam_policy_document" "ebs" {
  # Copy of default KMS policy that lets you manage it
  statement {
    sid       = "Enable IAM User Permissions"
    actions   = ["kms:*"]
    resources = ["*"]

    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }

  # Required for EKS
  statement {
    sid = "Allow service-linked role use of the CMK"
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:ReEncrypt*",
      "kms:GenerateDataKey*",
      "kms:DescribeKey"
    ]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", # required for the ASG to manage encrypted volumes for nodes
        module.eks.cluster_iam_role_arn,                                                                                                            # required for the cluster / persistentvolume-controller to create encrypted PVCs
      ]
    }
  }

  statement {
    sid       = "Allow attachment of persistent resources"
    actions   = ["kms:CreateGrant"]
    resources = ["*"]

    principals {
      type = "AWS"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/aws-service-role/autoscaling.amazonaws.com/AWSServiceRoleForAutoScaling", # required for the ASG to manage encrypted volumes for nodes
        module.eks.cluster_iam_role_arn,                                                                                                            # required for the cluster / persistentvolume-controller to create encrypted PVCs
      ]
    }

    condition {
      test     = "Bool"
      variable = "kms:GrantIsForAWSResource"
      values   = ["true"]
    }
  }
}

resource "aws_security_group" "additional" {
  name_prefix = "${local.cluster_name}-additional"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port = 22
    to_port   = 22
    protocol  = "tcp"
    cidr_blocks = [
      "10.0.0.0/8",
      "172.16.0.0/12",
      "192.168.0.0/16",
    ]
  }

  tags = local.tags
}

After 24 min, we got the failure:

│ Error: error waiting for EKS Node Group (staging2-eks:complete-eks-mng-2022040120183164910000000f) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred:
│       * i-0a3549866ff59f63c, i-0c88796e07310cb10: NodeCreationFailure: Instances failed to join the kubernetes cluster
│
│   with module.eks.module.eks_managed_node_group["complete"].aws_eks_node_group.this[0],
│   on .terraform/modules/eks/modules/eks-managed-node-group/main.tf line 269, in resource "aws_eks_node_group" "this":
│  269: resource "aws_eks_node_group" "this" {

The cluster got created, and the node group as well as the node instances got created; however, the node instances did not join the cluster (they are listed in the EC2 instances when filtering for cluster-name=staging2-eks). None of the add-ons were deployed.

@bryantbiggs
Member

Actually, @Zvikan spotted the issue - in your vpc-cni IRSA role, @evenme, you need to change vpc_cni_enable_ipv6 = true to vpc_cni_enable_ipv4 = true.
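
For reference, the corrected IRSA block for an IPv4 cluster would look like this (same module call as in the reproduction, with only that flag changed):

module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.12"

  role_name_prefix      = "VPC-CNI-IRSA"
  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv4   = true # was vpc_cni_enable_ipv6 = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }

  tags = local.tags
}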

@bryantbiggs
Member

closing out for now - please see above and #1910 (comment)

@bpar476

bpar476 commented Aug 15, 2022

Sorry to resurrect a dead issue, but shouldn't it be possible to create the cluster without setting iam_role_attach_cni_policy = true as long as the VPC CNI addon service account has the appropriate IAM role? I tried this and the creation failed because the node groups never became ready.

It looks like the VPC CNI addon installation depends on the node groups being created - is this required? I'm not quite sure how the nodes can join the cluster in the first place if the VPC CNI addon isn't installed.

Is it a case that for initial provisioning the node has to have the VPC CNI policies attached and then you can migrate to IRSA for the addon?

@bryantbiggs
Member

correct - https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/eks_managed_node_group/main.tf#L46-L50
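
That is the pattern the linked example carries in its comments; as a sketch, inside the module block:

  eks_managed_node_group_defaults = {
    # fresh cluster, first apply: attach the CNI policy to the node role so the
    # VPC CNI can assign IPs and the nodes can join
    iam_role_attach_cni_policy = true

    # once the cluster/node group exist and the vpc-cni addon is running under
    # its IRSA role, this can be flipped to false on a later apply
    # iam_role_attach_cni_policy = false
  }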

@github-actions

github-actions bot commented Nov 9, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 9, 2022