MalformedPolicyDocumentException #2523

JPurcellHCP opened this issue Mar 16, 2023 · 5 comments


JPurcellHCP commented Mar 16, 2023


I'm using /examples/eks_managed_node_group, cloned as of today. Running terraform init and terraform plan works fine and shows the expected output, but terraform apply fails while creating the KMS key.

The error code provided is
╷
│ Error: creating KMS Key: MalformedPolicyDocumentException: Policy contains a statement with one or more invalid principals.
│
│   with module.ebs_kms_key.aws_kms_key.this[0],
│   on .terraform/modules/ebs_kms_key/ line 8, in resource "aws_kms_key" "this":
│    8: resource "aws_kms_key" "this" {
│
╵
Operation failed: failed running terraform apply (exit 1)

  • ✋ I have searched the open/closed issues and my issue is not listed.


  • Module version [Required]: Latest

  • Terraform version:

  • Provider version(s):
    AWS >= 4.47
    Kubernetes >= 2.10
    Terraform >=1.0

Reproduction Code [Required]

provider "aws" {
  region = local.region
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = ""
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
  }
}

data "aws_caller_identity" "current" {}
data "aws_availability_zones" "available" {}

locals {
  name            = "ex-${replace(basename(path.cwd), "_", "-")}"
  cluster_version = "1.24"
  region          = "eu-west-1"

  vpc_cidr = ""
  azs      = slice(data.aws_availability_zones.available.names, 0, 3)

  tags = {
    Example    =
    GithubRepo = "terraform-aws-eks"
    GithubOrg  = "terraform-aws-modules"
  }
}

# EKS Module

module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name                   =
  cluster_version                = local.cluster_version
  cluster_endpoint_public_access = true

  # IPV6
  cluster_ip_family = "ipv6"

  # We are using the IRSA created below for permissions
  # However, we have to deploy with the policy attached FIRST (when creating a fresh cluster)
  # and then turn this off after the cluster/node group is created. Without this initial policy,
  # the VPC CNI fails to assign IPs and nodes cannot join the cluster
  # See for more context
  # TODO - remove this policy once AWS releases a managed version similar to AmazonEKS_CNI_Policy (IPv4)
  create_cni_ipv6_iam_policy = true

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent              = true
      before_compute           = true
      service_account_role_arn = module.vpc_cni_irsa.iam_role_arn
      configuration_values = jsonencode({
        env = {
          # Reference docs
          WARM_PREFIX_TARGET       = "1"
        }
      })
    }
  }

  vpc_id                   = module.vpc.vpc_id
  subnet_ids               = module.vpc.private_subnets
  control_plane_subnet_ids = module.vpc.intra_subnets

  manage_aws_auth_configmap = true

  eks_managed_node_group_defaults = {
    ami_type       = "AL2_x86_64"
    instance_types = ["m6i.large", "m5.large", "m5n.large", "m5zn.large"]

    # We are using the IRSA created below for permissions
    # However, we have to deploy with the policy attached FIRST (when creating a fresh cluster)
    # and then turn this off after the cluster/node group is created. Without this initial policy,
    # the VPC CNI fails to assign IPs and nodes cannot join the cluster
    # See for more context
    iam_role_attach_cni_policy = true
  }

  eks_managed_node_groups = {
    # Default node group - as provided by AWS EKS
    default_node_group = {
      # By default, the module creates a launch template to ensure tags are propagated to instances, etc.,
      # so we need to disable it to use the default template provided by the AWS EKS managed node group service
      use_custom_launch_template = false

      disk_size = 50

      # Remote access cannot be specified with a launch template
      remote_access = {
        ec2_ssh_key               = module.key_pair.key_pair_name
        source_security_group_ids = []
      }
    }

    # Default node group - as provided by AWS EKS using Bottlerocket
    bottlerocket_default = {
      # By default, the module creates a launch template to ensure tags are propagated to instances, etc.,
      # so we need to disable it to use the default template provided by the AWS EKS managed node group service
      use_custom_launch_template = false

      ami_type = "BOTTLEROCKET_x86_64"
      platform = "bottlerocket"
    }

    # Adds to the AWS provided user data
    bottlerocket_add = {
      ami_type = "BOTTLEROCKET_x86_64"
      platform = "bottlerocket"

      # This will get added to what AWS provides
      bootstrap_extra_args = <<-EOT
        # extra args added
        lockdown = "integrity"
      EOT
    }

    # Custom AMI, using module provided bootstrap data
    bottlerocket_custom = {
      # Current bottlerocket AMI
      ami_id   = data.aws_ami.eks_default_bottlerocket.image_id
      platform = "bottlerocket"

      # Use module user data template to bootstrap
      enable_bootstrap_user_data = true
      # This will get added to the template
      bootstrap_extra_args = <<-EOT
        # The admin host container provides SSH access and runs with "superpowers".
        # It is disabled by default, but can be disabled explicitly.
        enabled = false

        # The control host container provides out-of-band access via SSM.
        # It is enabled by default, and can be disabled if you do not expect to use SSM.
        # This could leave you with no way to access the API and change settings on an existing node!
        enabled = true

        # extra args added
        lockdown = "integrity"

        label1 = "foo"
        label2 = "bar"

        dedicated = "experimental:PreferNoSchedule"
        special = "true:NoSchedule"
      EOT
    }

    # Use a custom AMI
    custom_ami = {
      ami_type = "AL2_ARM_64"
      # Current default AMI used by managed node groups - pseudo "custom"
      ami_id = data.aws_ami.eks_default_arm.image_id

      # This will ensure the bootstrap user data is used to join the node
      # By default, EKS managed node groups will not append bootstrap script;
      # this adds it back in using the default template provided by the module
      # Note: this assumes the AMI provided is an EKS optimized AMI derivative
      enable_bootstrap_user_data = true

      instance_types = ["t4g.medium"]
    }

    # Complete
    complete = {
      name            = "complete-eks-mng"
      use_name_prefix = true

      subnet_ids = module.vpc.private_subnets

      min_size     = 1
      max_size     = 7
      desired_size = 1

      ami_id                     = data.aws_ami.eks_default.image_id
      enable_bootstrap_user_data = true

      pre_bootstrap_user_data = <<-EOT
        export FOO=bar
      EOT

      post_bootstrap_user_data = <<-EOT
        echo "you are free little kubelet!"
      EOT

      capacity_type        = "SPOT"
      force_update_version = true
      instance_types       = ["m6i.large", "m5.large", "m5n.large", "m5zn.large"]
      labels = {
        GithubRepo = "terraform-aws-eks"
        GithubOrg  = "terraform-aws-modules"
      }

      taints = [
        {
          key    = "dedicated"
          value  = "gpuGroup"
          effect = "NO_SCHEDULE"
        }
      ]

      update_config = {
        max_unavailable_percentage = 33 # or set `max_unavailable`
      }

      description = "EKS managed node group example launch template"

      ebs_optimized           = true
      disable_api_termination = false
      enable_monitoring       = true

      block_device_mappings = {
        xvda = {
          device_name = "/dev/xvda"
          ebs = {
            volume_size           = 75
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 150
            encrypted             = true
            kms_key_id            = module.ebs_kms_key.key_arn
            delete_on_termination = true
          }
        }
      }

      metadata_options = {
        http_endpoint               = "enabled"
        http_tokens                 = "required"
        http_put_response_hop_limit = 2
        instance_metadata_tags      = "disabled"
      }

      create_iam_role          = true
      iam_role_name            = "eks-managed-node-group-complete-example"
      iam_role_use_name_prefix = false
      iam_role_description     = "EKS managed node group complete example role"
      iam_role_tags = {
        Purpose = "Protector of the kubelet"
      }
      iam_role_additional_policies = {
        AmazonEC2ContainerRegistryReadOnly = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
        additional                         = aws_iam_policy.node_additional.arn
      }

      tags = {
        ExtraTag = "EKS managed node group complete example"
      }
    }
  }

  tags = local.tags
}

# Supporting Resources

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name =
  cidr = local.vpc_cidr

  azs             = local.azs
  private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
  public_subnets  = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]
  intra_subnets   = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 52)]

  enable_ipv6                     = true
  assign_ipv6_address_on_creation = true
  create_egress_only_igw          = true

  public_subnet_ipv6_prefixes  = [0, 1, 2]
  private_subnet_ipv6_prefixes = [3, 4, 5]
  intra_subnet_ipv6_prefixes   = [6, 7, 8]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  enable_flow_log                      = true
  create_flow_log_cloudwatch_iam_role  = true
  create_flow_log_cloudwatch_log_group = true

  public_subnet_tags = {
    "" = 1
  }

  private_subnet_tags = {
    "" = 1
  }

  tags = local.tags
}

module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 5.0"

  role_name_prefix      = "VPC-CNI-IRSA"
  attach_vpc_cni_policy = true
  vpc_cni_enable_ipv6   = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-node"]
    }
  }

  tags = local.tags
}

module "ebs_kms_key" {
  source  = "terraform-aws-modules/kms/aws"
  version = " 1.5.0"

  description = "Customer managed key to encrypt EKS managed node group volumes"

  # Policy
  key_administrators = [
  ]

  key_service_roles_for_autoscaling = [
    # required for the ASG to manage encrypted volumes for nodes
    # required for the cluster / persistentvolume-controller to create encrypted PVCs
  ]

  # Aliases
  aliases = ["eks/${}/ebs"]

  tags = local.tags
}

module "key_pair" {
  source  = "terraform-aws-modules/key-pair/aws"
  version = "~> 2.0"

  key_name_prefix    =
  create_private_key = true

  tags = local.tags
}

resource "aws_security_group" "remote_access" {
  name_prefix = "${}-remote-access"
  description = "Allow remote SSH access"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "SSH access"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [""]
  }

  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = [""]
    ipv6_cidr_blocks = ["::/0"]
  }

  tags = merge(local.tags, { Name = "${}-remote" })
}

resource "aws_iam_policy" "node_additional" {
  name        = "${}-additional"
  description = "Example usage of node additional policy"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
        ]
        Effect   = "Allow"
        Resource = "*"
      },
    ]
  })

  tags = local.tags
}

data "aws_ami" "eks_default" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amazon-eks-node-${local.cluster_version}-v*"]
  }
}

data "aws_ami" "eks_default_arm" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amazon-eks-arm64-node-${local.cluster_version}-v*"]
  }
}

data "aws_ami" "eks_default_bottlerocket" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["bottlerocket-aws-k8s-${local.cluster_version}-x86_64-*"]
  }
}

Steps to reproduce the behavior:

Clone the repository, cd to /examples/eks_managed_node_group/, create a CLI-driven workspace in TFCB, copy and paste the workspace's cloud stanza into the configuration, push AWS credentials to the workspace, then run terraform init, terraform plan, and terraform apply.

Expected behavior

The workspace successfully finishes applying and an EKS cluster is created

Actual behavior

After attempting to build module.ebs_kms_key.aws_kms_key.this[0] for roughly 1m50s, it returns the following error:

╷
│ Error: creating KMS Key: MalformedPolicyDocumentException: Policy contains a statement with one or more invalid principals.
│
│   with module.ebs_kms_key.aws_kms_key.this[0],
│   on .terraform/modules/ebs_kms_key/ line 8, in resource "aws_kms_key" "this":
│    8: resource "aws_kms_key" "this" {
│
╵
Operation failed: failed running terraform apply (exit 1)

This stops the apply, so anything that depends on module.ebs_kms_key is not created either.

Additional context

The apply stops after 73 resources have been created. key_pair, vpc, and some parts of the eks module are all built, but the kms_key and everything that depends on it are blocked by this failure.

Note that you also can't destroy this once the apply has failed. The EKS module is partially built, but line 242 expects module.ebs_kms_key.key_arn. Because the kms_key failed to build and did not provide any outputs, the terraform init part of the destroy command errors out with:

module.ebs_kms_key is object with 5 attributes
This object does not have an attribute named "key_arn"

Commenting out line 242 allows you to destroy it. Adding a depends_on to the module might also work around this.

alexmeise commented Mar 26, 2023


If this is a fresh new AWS account that never had an ASG, you might be missing the role:


Try to run this command before doing the apply:

aws iam create-service-linked-role --aws-service-name

The KMS key policy lists this role as a principal (the ASG needs to be able to use this KMS key to decrypt the EBS volumes of the managed node group), but the role might not exist in the account if you have never created it or an ASG before. Creating it manually should solve the issue.
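
If you would rather keep this in Terraform than run the CLI command by hand, here is a minimal sketch of the same idea. It assumes the missing principal is the EC2 Auto Scaling service-linked role (service name `autoscaling.amazonaws.com`, role name `AWSServiceRoleForAutoScaling`). Neither name is spelled out in this thread, so check them against the principals in your own key policy first. The resource label `autoscaling` is only illustrative.

```hcl
# Assumption: the invalid principal is the EC2 Auto Scaling service-linked role.
# This resource only succeeds if the role does not already exist in the
# account; if it does, IAM rejects the request because the name is taken.
resource "aws_iam_service_linked_role" "autoscaling" {
  aws_service_name = "autoscaling.amazonaws.com"
}

# Then, inside the existing module "ebs_kms_key" block, add:
#
#   depends_on = [aws_iam_service_linked_role.autoscaling]
#
# so the key policy is only submitted after the role has been created.
```

A newly created IAM role can take a short while to become usable as a policy principal, so the first apply may still need a retry. In accounts that have already used an ASG the role will normally exist, and this resource should be left out.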

