Setting a value for cluster_endpoint_public_access_cidrs makes the nodegroup not join the cluster #1867
Comments
I also face the same issue |
Can you provide a full reproduction, please? |
@bryantbiggs I am trying to create an EKS cluster with a managed node group. All I do is supply a list of external IPs, and I get: Error: error waiting for EKS Node Group (abc- default_node_group-20220210235029685300000001) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred: |
module "eks" { cluster_name = local.name |
If I comment out cluster_endpoint_public_access_cidrs or give it a value of 0.0.0.0/0, the code succeeds |
I'm using the example under examples/eks_managed_node_group |
I have the same issue here. Basically, when cluster_endpoint_public_access_cidrs is limited to some CIDRs, the node groups can't join the cluster and time out. I have cluster_endpoint_private_access = true. Terraform v0.13.7 |
Mine is working with provider version |
Maybe this will help? #1889 (comment) |
@bryantbiggs I get this same issue, here is a pretty minimal example showing the issue:
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 3.27"
}
}
required_version = ">= 0.14.9"
}
provider "aws" {
# Must match the profile name in your ~/.okta_aws_login_config file
profile = "<profile>"
region = "us-east-1"
}
locals {
name = "test-1"
cluster_version = "1.20"
region = "us-east-1"
}
data "aws_caller_identity" "current" {}
################################################################################
# EKS Module
################################################################################
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "18.8.1"
cluster_name = local.name
cluster_version = local.cluster_version
cluster_endpoint_private_access = true
cluster_endpoint_public_access = true
cluster_endpoint_public_access_cidrs = [<cidrs redacted>]
vpc_id = module.vpc.vpc_id
subnet_ids = concat(
module.vpc.private_subnets,
module.vpc.public_subnets,
)
eks_managed_node_group_defaults = {
disk_size = 50
instance_types = ["m5.large"]
# iam_role_attach_cni_policy = true
}
eks_managed_node_groups = {
# Default node group - as provided by AWS EKS
default_node_group = {
}
}
}
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "3.12.0"
name = local.name
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
one_nat_gateway_per_az = false
}
resource "tls_private_key" "this" {
algorithm = "RSA"
}
resource "aws_key_pair" "this" {
key_name_prefix = local.name
public_key = tls_private_key.this.public_key_openssh
} |
@bryantbiggs With regards to the example above, setting the public access CIDRs to
cluster_endpoint_public_access_cidrs = concat(
  [<redacted cidrs>],
  ["${module.vpc.nat_public_ips[0]}/32"],
)
makes it work.
EDIT: Alternatively, it seems that adding the following to the VPC config sometimes makes it work too, but this seems to be inconsistent in how long the nodegroup takes to connect, varying between ~2m and ~9m:
enable_dns_support   = true
enable_dns_hostnames = true |
@tculp Would you please help clarify the reference to NAT public IPs?
|
Adding the NAT addresses of the VPC the cluster is in does solve the issue. I have enable_dns_support and enable_dns_hostnames set, but neither solved the problem. I also don't have a private endpoint exposed - only a public one. |
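For context on why the NAT addresses matter: worker nodes in private subnets reach a restricted public endpoint through the VPC's NAT gateway, so the endpoint sees the NAT gateway's public IP as the source address, and that IP has to be in the allowlist (or the nodes have to use the private endpoint instead). Below is a minimal sketch of the allowlist pattern, assuming the terraform-aws-vpc module's nat_public_ips output and a hypothetical var.allowed_cidrs; other required module inputs are omitted:
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.0"

  # ... cluster_name, cluster_version, vpc_id, subnet_ids, etc.

  cluster_endpoint_public_access = true

  # Allow your own CIDRs plus every NAT gateway public IP so worker nodes
  # egressing through the NAT gateway(s) can still reach the public endpoint.
  cluster_endpoint_public_access_cidrs = concat(
    var.allowed_cidrs,                                  # hypothetical: office/VPN CIDRs
    [for ip in module.vpc.nat_public_ips : "${ip}/32"], # covers single or per-AZ NAT gateways
  )
}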
This issue has been resolved in version 18.19.0 🎉 |
How did you guys solve the issue?
This is my configuration:
|
@pen-pal https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/network_connectivity.md#public-endpoint-w-restricted-cidrs There is no fix required by the module; it's up to users to ensure VPC network connectivity is set up properly |
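On the VPC side, the connectivity requirement the linked doc describes comes down to the cluster's private endpoint being resolvable from inside the VPC, which needs DNS support and DNS hostnames enabled. A minimal sketch of the relevant VPC flags, assuming cluster_endpoint_private_access = true on the EKS module; the name and CIDRs are placeholders:
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = "example" # placeholder
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = true

  # Both settings are required for instances in the VPC to resolve the cluster
  # endpoint to its private IPs when private endpoint access is enabled.
  enable_dns_support   = true
  enable_dns_hostnames = true
}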
In my case both private and public access are set to true, yet it still fails. To add to this, the same configuration works and the node joins the cluster if I spin up a completely new EKS cluster. PS: I am upgrading from v17.x to v18.x and using the latest tag on my module |
if you have a full reproduction I can take a look - but I would look at the examples as these all work as intended |
what should I share with you to reproduce the issue? |
a deployable reproduction |
Here is the configuration as asked |
You don't need to provide the AMI ID unless you are using a custom AMI. Also, if you are just figuring things out, I suggest starting with the minimal amount of configuration and only tweaking/modifying when it's necessary:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 3.0"
name = "cluster01"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/cluster01" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/cluster01" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 18.0"
cluster_name = "cluster01"
cluster_version = "1.21"
cluster_endpoint_private_access = true
cluster_security_group_name = "cluster01-security-group"
cluster_security_group_description = "EKS cluster security group."
iam_role_name = "cluster01-iam-role"
enable_irsa = true
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_addons = {
coredns = {
resolve_conflicts = "OVERWRITE"
}
kube-proxy = {
resolve_conflicts = "OVERWRITE"
}
}
eks_managed_node_groups = {
complete = {
name = "nodegroup01"
min_size = 1
max_size = 3
desired_size = 1
force_update_version = true
instance_types = ["m5.large"]
update_config = {
max_unavailable_percentage = 50 # or set `max_unavailable`
}
metadata_options = {
http_endpoint = "enabled"
http_tokens = "required"
http_put_response_hop_limit = 2
instance_metadata_tags = "disabled"
}
}
}
} |
The reason for using ami_id was to make sure the post_bootstrap_user_data gets executed, as the user data was not being applied otherwise. |
does it have to execute after the node joins the cluster? you can use the |
Also, your config does not restrict the cluster's public endpoint, so I don't see how this is relevant to the original issue above |
@bryantbiggs yes. |
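The reply above is cut off, but the user data point can be illustrated without pinning ami_id. Here is a sketch of the two hooks, assuming the eks_managed_node_group inputs pre_bootstrap_user_data, enable_bootstrap_user_data, and post_bootstrap_user_data available in 18.x (verify against the submodule docs for your exact version); the scripts and AMI ID are placeholders:
eks_managed_node_groups = {
  default = {
    # Default EKS AMI: this snippet is merged in ahead of the EKS-supplied
    # bootstrap, so no custom AMI is needed.
    pre_bootstrap_user_data = <<-EOT
      echo "placeholder pre-bootstrap step"
    EOT

    # Custom AMI: take over bootstrapping so a post-bootstrap hook can run
    # after the node has been configured.
    # ami_id                     = "ami-0123456789abcdef0" # placeholder
    # enable_bootstrap_user_data = true
    # post_bootstrap_user_data   = <<-EOT
    #   echo "placeholder post-bootstrap step"
    # EOT
  }
}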
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Description
While creating an EKS cluster, if I give addresses in cluster_endpoint_public_access_cidrs, the nodegroup isn't able to join the cluster. Cluster creation reports the following error:
module.eks.module.eks_managed_node_group["default_node_group"].aws_eks_node_group.this[0]: Still creating... [26m41s elapsed]
╷
│ Error: error waiting for EKS Node Group (abc- default_node_group-20220210235029685300000001) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred:
│ * i-053f38656f5925c75: NodeCreationFailure: Instances failed to join the kubernetes cluster
If I change cluster_endpoint_public_access_cidrs (from 0.0.0.0/0 to any other IP) after the cluster creation, it works fine
Before you submit an issue, please perform the following first:
- Remove the .terraform directory (! ONLY if state is stored remotely, which hopefully you are following that best practice!): rm -rf .terraform/
- Re-run terraform init
Versions
Reproduction
Steps to reproduce the behavior:
Create a cluster using the managed nodegroup example with a value set for cluster_endpoint_public_access_cidrs
Code Snippet to Reproduce
module "eks" {
#source = "../.."
source = "terraform-aws-modules/eks/aws"
cluster_name = local.name
cluster_version = local.cluster_version
cluster_endpoint_private_access = true
cluster_endpoint_public_access = true
cluster_endpoint_public_access_cidrs = var.cluster_endpoint_public_access_cidrs
Expected behavior
Cluster creation should be successful and nodegroups should join the cluster
Actual behavior
I can see the IP addresses in the cluster's public access source allowlist, but I don't see the nodegroups under it, as Terraform errors out stating NodeCreationFailure: Instances failed to join the kubernetes cluster
Terminal Output Screenshot(s)
Error: error waiting for EKS Node Group (eks:default_node_group-20220210235029685300000001) to create: unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 1 error occurred:
│ * i-053f38656f5925c75: NodeCreationFailure: Instances failed to join the kubernetes cluster
Additional context
My requirement is to have a nodegroup created in a private subnet (SD-WAN connected) and have the nodes talk to the EKS cluster, which has both private and public endpoints. On the public endpoint, I want to restrict the IP addresses that can connect to it.
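For this requirement, the configuration shape the thread converges on is roughly the following; a sketch only, with placeholder names and CIDRs, assuming the VPC has DNS support/hostnames enabled as shown earlier and, if you rely solely on the public endpoint, that the NAT gateway IPs are included in the allowed CIDRs:
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.0"

  cluster_name    = "example" # placeholder
  cluster_version = "1.21"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets # node group lands in private subnets

  # Nodes talk to the control plane over the private endpoint...
  cluster_endpoint_private_access = true

  # ...while the public endpoint stays reachable only from the allowed CIDRs.
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = ["203.0.113.0/24"] # placeholder CIDR

  eks_managed_node_groups = {
    default_node_group = {}
  }
}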