Commit e3bf48b

feat: Replace the local-exec script with a http datasource for waiting cluster (#1339)
NOTES: Using the [terraform-aws-modules/http](https://registry.terraform.io/providers/terraform-aws-modules/http/latest) provider is a more platform-agnostic way to wait for cluster availability than a `local-exec` provisioner. With this change we're able to provision EKS clusters and manage the `aws-auth` configmap while still using the `hashicorp/tfc-agent` Docker image.
1 parent 781f673 commit e3bf48b
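
For callers upgrading across this commit, the practical effect is that the two `wait_for_cluster_*` inputs no longer exist. A minimal sketch of a calling configuration follows. The module label `eks`, the cluster name, the Kubernetes version, and the VPC/subnet IDs are hypothetical placeholders, not values taken from this repository.

```hcl
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  # Hypothetical placeholder values, for illustration only.
  cluster_name    = "example-cluster"
  cluster_version = "1.19"
  vpc_id          = "vpc-0123456789abcdef0"
  subnets         = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]

  # Default value. While true, the module manages the aws-auth configmap
  # and waits on data.http.wait_for_cluster before creating it.
  manage_aws_auth = true

  # Removed by this commit. Delete these from existing configurations:
  # wait_for_cluster_cmd         = "..."
  # wait_for_cluster_interpreter = ["/bin/sh", "-c"]
}
```

Configurations that still pass either removed input should be expected to fail validation with an unsupported-argument error until those lines are deleted.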

7 files changed: +35 -64 lines changed

README.md

+9-15
Original file line number | Diff line number | Diff line change
@@ -25,11 +25,7 @@ You also need to ensure your applications and add ons are updated, or workloads
2525

2626
An example of harming update was the removal of several commonly used, but deprecated APIs, in Kubernetes 1.16. More information on the API removals, see the [Kubernetes blog post](https://kubernetes.io/blog/2019/07/18/api-deprecations-in-1-16/).
2727

28-
By default, this module manages the `aws-auth` configmap for you (`manage_aws_auth=true`). To avoid the following [issue](https://github.com/aws/containers-roadmap/issues/654) where the EKS creation is `ACTIVE` but not ready, we implemented a retry logic with an `local-exec` provisioner and `wget` (by default) with failover to `curl`.
29-
30-
**If you want to manage your `aws-auth` configmap, ensure you have `wget` (or `curl`) and `/bin/sh` installed where you're running Terraform or set `wait_for_cluster_cmd` and `wait_for_cluster_interpreter` to match your needs.**
31-
32-
For windows users, please read the following [doc](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#deploying-from-windows-binsh-file-does-not-exist).
28+
By default, this module manages the `aws-auth` configmap for you (`manage_aws_auth=true`). To avoid the following [issue](https://github.com/aws/containers-roadmap/issues/654) where the EKS creation is `ACTIVE` but not ready. We implemented a "retry" logic with a fork of the http provider https://github.com/terraform-aws-modules/terraform-provider-http. This fork adds the support of a self-signed CA certificate. The original PR can be found at https://github.com/hashicorp/terraform-provider-http/pull/29.
3329

3430
## Usage example
3531

@@ -145,21 +141,21 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
145141
| Name | Version |
146142
|------|---------|
147143
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.13.1 |
148-
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.35.0 |
144+
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.37.0 |
145+
| <a name="requirement_http"></a> [http](#requirement\_http) | >= 2.2.0 |
149146
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 1.11.1 |
150147
| <a name="requirement_local"></a> [local](#requirement\_local) | >= 1.4 |
151-
| <a name="requirement_null"></a> [null](#requirement\_null) | >= 2.1 |
152148
| <a name="requirement_random"></a> [random](#requirement\_random) | >= 2.1 |
153149
| <a name="requirement_template"></a> [template](#requirement\_template) | >= 2.1 |
154150

155151
## Providers
156152

157153
| Name | Version |
158154
|------|---------|
159-
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.35.0 |
155+
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.37.0 |
156+
| <a name="provider_http"></a> [http](#provider\_http) | >= 2.2.0 |
160157
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | >= 1.11.1 |
161158
| <a name="provider_local"></a> [local](#provider\_local) | >= 1.4 |
162-
| <a name="provider_null"></a> [null](#provider\_null) | >= 2.1 |
163159
| <a name="provider_random"></a> [random](#provider\_random) | >= 2.1 |
164160
| <a name="provider_template"></a> [template](#provider\_template) | >= 2.1 |
165161

@@ -208,7 +204,6 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
208204
| [aws_security_group_rule.workers_ingress_self](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group_rule) | resource |
209205
| [kubernetes_config_map.aws_auth](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/config_map) | resource |
210206
| [local_file.kubeconfig](https://registry.terraform.io/providers/hashicorp/local/latest/docs/resources/file) | resource |
211-
| [null_resource.wait_for_cluster](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource |
212207
| [random_pet.workers](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/pet) | resource |
213208
| [random_pet.workers_launch_template](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/pet) | resource |
214209
| [aws_ami.eks_worker](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ami) | data source |
@@ -221,6 +216,7 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
221216
| [aws_iam_policy_document.workers_assume_role_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
222217
| [aws_iam_role.custom_cluster_iam_role](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_role) | data source |
223218
| [aws_partition.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/partition) | data source |
219+
| [http_http.wait_for_cluster](https://registry.terraform.io/providers/terraform-aws-modules/http/latest/docs/data-sources/http) | data source |
224220
| [template_file.launch_template_userdata](https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/file) | data source |
225221
| [template_file.userdata](https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/file) | data source |
226222

@@ -273,8 +269,6 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
273269
| <a name="input_subnets"></a> [subnets](#input\_subnets) | A list of subnets to place the EKS cluster and workers within. | `list(string)` | n/a | yes |
274270
| <a name="input_tags"></a> [tags](#input\_tags) | A map of tags to add to all resources. Tags added to launch configuration or templates override these values for ASG Tags only. | `map(string)` | `{}` | no |
275271
| <a name="input_vpc_id"></a> [vpc\_id](#input\_vpc\_id) | VPC where the cluster and workers will be deployed. | `string` | n/a | yes |
276-
| <a name="input_wait_for_cluster_cmd"></a> [wait\_for\_cluster\_cmd](#input\_wait\_for\_cluster\_cmd) | Custom local-exec command to execute for determining if the eks cluster is healthy. Cluster endpoint will be available as an environment variable called ENDPOINT | `string` | `"for i in `seq 1 60`; do if `command -v wget > /dev/null`; then wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; else curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true;fi; sleep 5; done; echo TIMEOUT && exit 1"` | no |
277-
| <a name="input_wait_for_cluster_interpreter"></a> [wait\_for\_cluster\_interpreter](#input\_wait\_for\_cluster\_interpreter) | Custom local-exec command line interpreter for the command to determining if the eks cluster is healthy. | `list(string)` | <pre>[<br> "/bin/sh",<br> "-c"<br>]</pre> | no |
278272
| <a name="input_worker_additional_security_group_ids"></a> [worker\_additional\_security\_group\_ids](#input\_worker\_additional\_security\_group\_ids) | A list of additional security group ids to attach to worker instances | `list(string)` | `[]` | no |
279273
| <a name="input_worker_ami_name_filter"></a> [worker\_ami\_name\_filter](#input\_worker\_ami\_name\_filter) | Name filter for AWS EKS worker AMI. If not provided, the latest official AMI for the specified 'cluster\_version' is used. | `string` | `""` | no |
280274
| <a name="input_worker_ami_name_filter_windows"></a> [worker\_ami\_name\_filter\_windows](#input\_worker\_ami\_name\_filter\_windows) | Name filter for AWS EKS Windows worker AMI. If not provided, the latest official AMI for the specified 'cluster\_version' is used. | `string` | `""` | no |
@@ -304,7 +298,7 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
304298
| <a name="output_cluster_endpoint"></a> [cluster\_endpoint](#output\_cluster\_endpoint) | The endpoint for your EKS Kubernetes API. |
305299
| <a name="output_cluster_iam_role_arn"></a> [cluster\_iam\_role\_arn](#output\_cluster\_iam\_role\_arn) | IAM role ARN of the EKS cluster. |
306300
| <a name="output_cluster_iam_role_name"></a> [cluster\_iam\_role\_name](#output\_cluster\_iam\_role\_name) | IAM role name of the EKS cluster. |
307-
| <a name="output_cluster_id"></a> [cluster\_id](#output\_cluster\_id) | The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready |
301+
| <a name="output_cluster_id"></a> [cluster\_id](#output\_cluster\_id) | The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready. |
308302
| <a name="output_cluster_oidc_issuer_url"></a> [cluster\_oidc\_issuer\_url](#output\_cluster\_oidc\_issuer\_url) | The URL on the EKS cluster OIDC Issuer |
309303
| <a name="output_cluster_primary_security_group_id"></a> [cluster\_primary\_security\_group\_id](#output\_cluster\_primary\_security\_group\_id) | The cluster primary security group ID created by the EKS cluster on 1.14 or later. Referred to as 'Cluster security group' in the EKS console. |
310304
| <a name="output_cluster_security_group_id"></a> [cluster\_security\_group\_id](#output\_cluster\_security\_group\_id) | Security group ID attached to the EKS cluster. On 1.14 or later, this is the 'Additional security groups' in the EKS console. |
@@ -314,8 +308,8 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
314308
| <a name="output_fargate_iam_role_name"></a> [fargate\_iam\_role\_name](#output\_fargate\_iam\_role\_name) | IAM role name for EKS Fargate pods |
315309
| <a name="output_fargate_profile_arns"></a> [fargate\_profile\_arns](#output\_fargate\_profile\_arns) | Amazon Resource Name (ARN) of the EKS Fargate Profiles. |
316310
| <a name="output_fargate_profile_ids"></a> [fargate\_profile\_ids](#output\_fargate\_profile\_ids) | EKS Cluster name and EKS Fargate Profile names separated by a colon (:). |
317-
| <a name="output_kubeconfig"></a> [kubeconfig](#output\_kubeconfig) | kubectl config file contents for this EKS cluster. |
318-
| <a name="output_kubeconfig_filename"></a> [kubeconfig\_filename](#output\_kubeconfig\_filename) | The filename of the generated kubectl config. |
311+
| <a name="output_kubeconfig"></a> [kubeconfig](#output\_kubeconfig) | kubectl config file contents for this EKS cluster. Will block on cluster creation until the cluster is really ready. |
312+
| <a name="output_kubeconfig_filename"></a> [kubeconfig\_filename](#output\_kubeconfig\_filename) | The filename of the generated kubectl config. Will block on cluster creation until the cluster is really ready. |
319313
| <a name="output_node_groups"></a> [node\_groups](#output\_node\_groups) | Outputs from EKS node groups. Map of maps, keyed by var.node\_groups keys |
320314
| <a name="output_oidc_provider_arn"></a> [oidc\_provider\_arn](#output\_oidc\_provider\_arn) | The ARN of the OIDC Provider if `enable_irsa = true`. |
321315
| <a name="output_security_group_rule_cluster_https_worker_ingress"></a> [security\_group\_rule\_cluster\_https\_worker\_ingress](#output\_security\_group\_rule\_cluster\_https\_worker\_ingress) | Security group rule responsible for allowing pods to communicate with the EKS cluster API. |

aws_auth.tf

+2-2
@@ -64,15 +64,15 @@ locals {
6464

6565
resource "kubernetes_config_map" "aws_auth" {
6666
count = var.create_eks && var.manage_aws_auth ? 1 : 0
67-
depends_on = [null_resource.wait_for_cluster[0]]
67+
depends_on = [data.http.wait_for_cluster[0]]
6868

6969
metadata {
7070
name = "aws-auth"
7171
namespace = "kube-system"
7272
labels = merge(
7373
{
7474
"app.kubernetes.io/managed-by" = "Terraform"
75-
# / are replaced by . because label validator fails in this lib
75+
# / are replaced by . because label validator fails in this lib
7676
# https://github.com/kubernetes/apimachinery/blob/1bdd76d09076d4dc0362456e59c8f551f5f24a72/pkg/util/validation/validation.go#L166
7777
"terraform.io/module" = "terraform-aws-modules.eks.aws"
7878
},

cluster.tf

+4-15
@@ -64,21 +64,10 @@ resource "aws_security_group_rule" "cluster_private_access" {
6464
}
6565

6666

67-
resource "null_resource" "wait_for_cluster" {
68-
count = var.create_eks && var.manage_aws_auth ? 1 : 0
69-
70-
depends_on = [
71-
aws_eks_cluster.this,
72-
aws_security_group_rule.cluster_private_access,
73-
]
74-
75-
provisioner "local-exec" {
76-
command = var.wait_for_cluster_cmd
77-
interpreter = var.wait_for_cluster_interpreter
78-
environment = {
79-
ENDPOINT = aws_eks_cluster.this[0].endpoint
80-
}
81-
}
67+
data "http" "wait_for_cluster" {
68+
count = var.create_eks && var.manage_aws_auth ? 1 : 0
69+
url = format("%s/healthz", aws_eks_cluster.this[0].endpoint)
70+
ca_certificate = base64decode(coalescelist(aws_eks_cluster.this[*].certificate_authority[0].data, [""])[0])
8271
}
8372

8473
resource "aws_security_group" "cluster" {

docs/faq.md

+1-13
@@ -107,7 +107,7 @@ You do not need to do anything extra since v12.1.0 of the module as long as the
107107
- `manage_aws_auth = true` on the module (default)
108108
- the kubernetes provider is correctly configured like in the [Usage Example](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/README.md#usage-example). Primarily the module's `cluster_id` output is used as input to the `aws_eks_cluster*` data sources.
109109

110-
The `cluster_id` depends on a `null_resource` that polls the EKS cluster's endpoint until it is alive. This blocks initialisation of the kubernetes provider.
110+
The `cluster_id` depends on a `data.http.wait_for_cluster` that polls the EKS cluster's endpoint until it is alive. This blocks initialisation of the kubernetes provider.
111111

112112
## `aws_auth.tf: At 2:14: Unknown token: 2:14 IDENT`
113113

@@ -170,18 +170,6 @@ worker_groups = [
170170

171171
4. With `kubectl get nodes` you can see cluster with mixed (Linux/Windows) nodes support.
172172

173-
## Deploying from Windows: `/bin/sh` file does not exist
174-
175-
The module is almost pure Terraform apart from the `wait_for_cluster` `null_resource` that runs a local provisioner. The module has a default configuration for Unix-like systems. In order to run the provisioner on Windows systems you must set the interpreter to a valid value. [PR #795 (comment)](https://github.com/terraform-aws-modules/terraform-aws-eks/pull/795#issuecomment-599191029) suggests the following value:
176-
```hcl
177-
module "eks" {
178-
# ...
179-
wait_for_cluster_interpreter = ["c:/git/bin/sh.exe", "-c"]
180-
}
181-
```
182-
183-
Alternatively, you can disable the `null_resource` by disabling creation of the `aws-auth` ConfigMap via setting `manage_aws_auth = false` on the module. The ConfigMap will then need creating via a different method.
184-
185173
## Worker nodes with labels do not join a 1.16+ cluster
186174

187175
Kubelet restricts the allowed list of labels in the `kubernetes.io` namespace that can be applied to nodes starting in 1.16.

outputs.tf

+15-6
@@ -1,9 +1,10 @@
11
output "cluster_id" {
2-
description = "The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready"
2+
description = "The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready."
33
value = element(concat(aws_eks_cluster.this.*.id, [""]), 0)
4-
# So that calling plans wait for the cluster to be available before attempting
5-
# to use it. They will not need to duplicate this null_resource
6-
depends_on = [null_resource.wait_for_cluster]
4+
5+
# So that calling plans wait for the cluster to be available before attempting to use it.
6+
# There is no need to duplicate this datasource
7+
depends_on = [data.http.wait_for_cluster]
78
}
89

910
output "cluster_arn" {
@@ -67,13 +68,21 @@ output "cloudwatch_log_group_arn" {
6768
}
6869

6970
output "kubeconfig" {
70-
description = "kubectl config file contents for this EKS cluster."
71+
description = "kubectl config file contents for this EKS cluster. Will block on cluster creation until the cluster is really ready."
7172
value = local.kubeconfig
73+
74+
# So that calling plans wait for the cluster to be available before attempting to use it.
75+
# There is no need to duplicate this datasource
76+
depends_on = [data.http.wait_for_cluster]
7277
}
7378

7479
output "kubeconfig_filename" {
75-
description = "The filename of the generated kubectl config."
80+
description = "The filename of the generated kubectl config. Will block on cluster creation until the cluster is really ready."
7681
value = concat(local_file.kubeconfig.*.filename, [""])[0]
82+
83+
# So that calling plans wait for the cluster to be available before attempting to use it.
84+
# There is no need to duplicate this datasource
85+
depends_on = [data.http.wait_for_cluster]
7786
}
7887

7988
output "oidc_provider_arn" {
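
Because `cluster_id` depends on `data.http.wait_for_cluster`, a caller that derives its kubernetes provider settings from that output waits for the health check implicitly and does not need its own polling step. The following is a sketch along the lines of the README usage example, assuming the module is instantiated as `module.eks` (a hypothetical label).

```hcl
# Both data sources take the module's cluster_id output, so they are only
# read once the module's /healthz check has succeeded.
data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token

  # Note: 1.x releases of the kubernetes provider also accept
  # `load_config_file = false`; that argument was removed in 2.0,
  # so it is left out of this sketch.
}
```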

variables.tf

-12
@@ -205,18 +205,6 @@ variable "cluster_delete_timeout" {
205205
default = "15m"
206206
}
207207

208-
variable "wait_for_cluster_cmd" {
209-
description = "Custom local-exec command to execute for determining if the eks cluster is healthy. Cluster endpoint will be available as an environment variable called ENDPOINT"
210-
type = string
211-
default = "for i in `seq 1 60`; do if `command -v wget > /dev/null`; then wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; else curl -k -s $ENDPOINT/healthz >/dev/null && exit 0 || true;fi; sleep 5; done; echo TIMEOUT && exit 1"
212-
}
213-
214-
variable "wait_for_cluster_interpreter" {
215-
description = "Custom local-exec command line interpreter for the command to determining if the eks cluster is healthy."
216-
type = list(string)
217-
default = ["/bin/sh", "-c"]
218-
}
219-
220208
variable "cluster_create_security_group" {
221209
description = "Whether to create a security group for the cluster or attach the cluster to `cluster_security_group_id`."
222210
type = bool

versions.tf

+4-1
@@ -4,9 +4,12 @@ terraform {
44
required_providers {
55
aws = ">= 3.37.0"
66
local = ">= 1.4"
7-
null = ">= 2.1"
87
template = ">= 2.1"
98
random = ">= 2.1"
109
kubernetes = ">= 1.11.1"
10+
http = {
11+
source = "terraform-aws-modules/http"
12+
version = ">= 2.2.0"
13+
}
1114
}
1215
}
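
The module now declares its own requirement on the forked http provider, so a calling configuration does not have to repeat it. A root module that prefers to pin providers explicitly might mirror the constraints like this. This is a sketch using the full object form for every entry, and the version constraints are simply the ones shown in this diff.

```hcl
terraform {
  required_version = ">= 0.13.1"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.37.0"
    }

    # The fork used by the module, not hashicorp/http.
    http = {
      source  = "terraform-aws-modules/http"
      version = ">= 2.2.0"
    }
  }
}
```

Either way, `terraform init` has to be re-run after upgrading so that the new provider is installed.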
