mirror of https://github.com/terraform-aws-modules/terraform-aws-eks.git synced 2025-09-09 19:32:58 +08:00

feat!: Upgrade min AWS provider and Terraform versions to 6.0 and 1.5.7 respectively (#3412)

* feat!: Upgrade min AWS provider and Terraform versions to `6.0` and `1.5.7` respectively

* fix: Remove deprecated arguments in AWS v6.0 provider, upgrade Helm provider to v3.0, bump VPC module to v6.0

* fix: Remove `aws-auth` sub-module

* fix: Remove `platform` and `cluster_service_ipv4_cidr` variables from `user-data` sub-module

* fix: Resolve all marked `todos` that have been accumulated

* fix: Set default `http_put_response_hop_limit` to `1`

* fix: Remove IRSA support from Karpenter sub-module

* fix: Avoid making GET requests from data sources unless absolutely necessary

* feat: Add variable optional attribute definitions

* feat: Bump KMS key module version to latest, add remaining variable attribute definitions

* fix: Remove `cluster_` prefix from variable names to better match the underlying API

* fix: Move all EFA logic to the nodegroup itself

* fix: Remove arguments that do not make sense in EKS

* fix: Updates from plan validation

* fix: Remove more self-managed node group attributes that are commonly not used in EKS clusters

* fix: Remove data plane compute `*_defaults` variables that do not work with variable optional attributes

* fix: Ignore changes to `bootstrap_self_managed_addons` to aid in upgrade

* feat: Add support for `region` argument on relevant resources

* feat: Initial pass on upgrade guide

* fix: Updates from testing and validating EKS managed node group

* fix: Updates from testing and validating self-managed node group

* docs: Ensure addon usage documented is aligned

* feat: Switch to dualstack OIDC issuer URL

* feat: Allow sourcing over overriding the Karpenter assume role policy

* fix: Use `Bool` instead of `StringEquals` for DenyHTTP queue policy

* fix: Correct use of `nullable` and default value propagation
This commit is contained in:
Bryant Biggs
2025-07-23 15:11:01 -05:00
committed by GitHub
parent 8a0efdbbc8
commit 416515a0da
84 changed files with 4111 additions and 3339 deletions
+1
View File
@@ -11,3 +11,4 @@
- [Upgrade to v18.x](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-18.0.md)
- [Upgrade to v19.x](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-19.0.md)
- [Upgrade to v20.x](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-20.0.md)
- [Upgrade to v21.x](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-21.0.md)
+328
View File
@@ -0,0 +1,328 @@
# Upgrade from v20.x to v21.x
If you have any questions regarding this upgrade process, please consult the [`examples`](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples) directory.
If you find a bug, please open an issue with supporting configuration to reproduce.
## List of backwards incompatible changes
- Terraform `v1.5.7` is now the minimum supported version
- AWS provider `v6.0.0` is now the minimum supported version
- TLS provider `v4.0.0` is now the minimum supported version
- The `aws-auth` sub-module has been removed. Users who wish to utilize its functionality can continue to do so by pinning a `v20.x` version, or a `~> 20.0` version constraint, in their module source.
- `bootstrap_self_managed_addons` is now hardcoded to `false`. This is a legacy setting; users should instead utilize the EKS addons API, which is what this module does by default. In conjunction with this change, changes to `bootstrap_self_managed_addons` are now ignored by the module to aid in upgrading without disruption (otherwise the change would require cluster re-creation).
- When enabling `enable_efa_support` or creating placement groups within a node group, users must now specify the correct `subnet_ids`; the module no longer tries to automatically select a suitable subnet.
- EKS managed node group:
  - IMDS now defaults to a hop limit of 1 (previously 2)
  - `ami_type` now defaults to `AL2023_x86_64_STANDARD`
  - `enable_monitoring` is now set to `false` by default
  - `enable_efa_only` is now set to `true` by default
  - `use_latest_ami_release_version` is now set to `true` by default
  - Support for autoscaling group schedules has been removed
- Self-managed node group:
  - IMDS now defaults to a hop limit of 1 (previously 2)
  - `ami_type` now defaults to `AL2023_x86_64_STANDARD`
  - `enable_monitoring` is now set to `false` by default
  - `enable_efa_only` is now set to `true` by default
  - Support for autoscaling group schedules has been removed
- Karpenter:
  - Native support for IAM roles for service accounts (IRSA) has been removed; EKS Pod Identity is now enabled by default
  - The Karpenter controller policies for versions prior to Karpenter `v1` (i.e. `v0.33`) have been removed; the `v1` policy is now used by default
  - `create_pod_identity_association` is now set to `true` by default
- `addons.resolve_conflicts_on_create` is now set to `"NONE"` by default (was `"OVERWRITE"`).
- `addons.most_recent` is now set to `true` by default (was `false`).
- `cluster_identity_providers.issuer_url` is now required to be set by users; the prior incorrect default has been removed. See https://github.com/terraform-aws-modules/terraform-aws-eks/pull/3055 and https://github.com/kubernetes/kubernetes/pull/123561 for more details.
- The OIDC issuer URL for IAM roles for service accounts (IRSA) has been changed to use the new dual stack `oidc-eks` endpoint instead of `oidc.eks`. This is to align with https://github.com/aws/containers-roadmap/issues/2038#issuecomment-2278450601
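
As a rough sketch of how the changes above might surface in a root module, the following pins the new minimum versions and explicitly sets a few of the values whose defaults changed, for users who want to keep the `v20.x` behavior while upgrading. The cluster name, node group name, and `module.vpc` references are hypothetical, and the `metadata_options` attribute name is an assumption about the node group inputs; check the module's variable definitions before relying on it.

```hcl
terraform {
  required_version = ">= 1.5.7"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 6.0"
    }
    tls = {
      source  = "hashicorp/tls"
      version = ">= 4.0"
    }
  }
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 21.0"

  name               = "example" # hypothetical cluster name
  kubernetes_version = "1.33"

  vpc_id     = module.vpc.vpc_id          # hypothetical VPC module
  subnet_ids = module.vpc.private_subnets # hypothetical VPC module

  addons = {
    coredns = {
      # v21.x defaults: most_recent = true, resolve_conflicts_on_create = "NONE"
      most_recent                 = false
      resolve_conflicts_on_create = "OVERWRITE"
    }
  }

  eks_managed_node_groups = {
    example = {
      # v21.x defaults: enable_monitoring = false, use_latest_ami_release_version = true
      enable_monitoring              = true
      use_latest_ami_release_version = false

      # Assumed attribute name; v21.x defaults the IMDS hop limit to 1 (previously 2)
      metadata_options = {
        http_put_response_hop_limit = 2
      }
    }
  }
}
```

Overriding the new defaults is optional; the sketch is only meant to show where each changed default could be set back if the new behavior is not yet wanted.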
## Additional changes
### Added
- Support for `region` parameter to specify the AWS region for the resources created if different from the provider region.
- Both the EKS managed and self-managed node groups now support creating their own security groups (again). This is primarily motivated by the changes for EFA support; previously, users needed to specify `enable_efa_support` both at the cluster level (to add the appropriate security group rules to the shared node security group) and at the node group level. However, it's not always desirable to have these rules across ALL node groups when they are only required on the node group where EFA is utilized. Similarly, for other use cases, users can create custom rules for a specific node group instead of applying them across ALL node groups.
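
The two additions above could look roughly like the following sketch. The region value, node group name, rule key, and CIDR are hypothetical, and the attribute names inside `security_group_ingress_rules` are an assumption modeled on the arguments of the AWS provider's `aws_vpc_security_group_ingress_rule` resource; consult the sub-module's variable definitions for the exact shape.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 21.0"

  # Truncated for brevity ...

  # Only needed when resources should be created in a region
  # other than the one configured on the provider
  region = "eu-west-1" # hypothetical

  eks_managed_node_groups = {
    efa = {
      enable_efa_support = true

      # Security group scoped to this node group only
      create_security_group      = true
      security_group_description = "Rules that apply only to this node group"

      # Assumed rule shape (hypothetical rule key and CIDR)
      security_group_ingress_rules = {
        vpc_https = {
          description = "HTTPS from within the VPC"
          cidr_ipv4   = "10.0.0.0/16"
          ip_protocol = "tcp"
          from_port   = 443
          to_port     = 443
        }
      }
    }
  }
}
```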
### Modified
- Variable definitions now contain detailed `object` types in place of the previously used `any` type.
- The embedded KMS key module definition has been updated to `v4.0` to support the same version requirements as well as the new `region` argument.
### Variable and output changes
1. Removed variables:
    - `enable_efa_support` - users only need to set this within the node group configuration, as the module no longer manages EFA support at the cluster level.
    - `enable_security_groups_for_pods` - users can instead attach the `arn:aws:iam::aws:policy/AmazonEKSVPCResourceController` policy via `iam_role_additional_policies` if using security groups for pods.
    - `eks-managed-node-group` sub-module
      - `cluster_service_ipv4_cidr` - users should use `cluster_service_cidr` instead (for either IPv4 or IPv6).
      - `elastic_gpu_specifications`
      - `elastic_inference_accelerator`
      - `platform` - this is superseded by `ami_type`
      - `placement_group_strategy` - set to `cluster` by the module
      - `placement_group_az` - users will need to specify the correct subnet in `subnet_ids`
      - `create_schedule`
      - `schedules`
    - `self-managed-node-group` sub-module
      - `elastic_gpu_specifications`
      - `elastic_inference_accelerator`
      - `platform` - this is superseded by `ami_type`
      - `create_schedule`
      - `schedules`
      - `placement_group_az` - users will need to specify the correct subnet in `subnet_ids`
      - `hibernation_options` - not valid in EKS
      - `min_elb_capacity` - not valid in EKS
      - `wait_for_elb_capacity` - not valid in EKS
      - `wait_for_capacity_timeout` - not valid in EKS
      - `default_cooldown` - not valid in EKS
      - `target_group_arns` - not valid in EKS
      - `service_linked_role_arn` - not valid in EKS
      - `warm_pool` - not valid in EKS
    - `fargate-profile` sub-module
      - None
    - `karpenter` sub-module
      - `enable_v1_permissions` - v1 permissions are now the default
      - `enable_irsa`
      - `irsa_oidc_provider_arn`
      - `irsa_namespace_service_accounts`
      - `irsa_assume_role_condition_test`
2. Renamed variables:
    - Variables prefixed with `cluster_*` have been stripped of the prefix to better match the underlying API:
      - `cluster_name` -> `name`
      - `cluster_version` -> `kubernetes_version`
      - `cluster_enabled_log_types` -> `enabled_log_types`
      - `cluster_force_update_version` -> `force_update_version`
      - `cluster_compute_config` -> `compute_config`
      - `cluster_upgrade_policy` -> `upgrade_policy`
      - `cluster_remote_network_config` -> `remote_network_config`
      - `cluster_zonal_shift_config` -> `zonal_shift_config`
      - `cluster_additional_security_group_ids` -> `additional_security_group_ids`
      - `cluster_endpoint_private_access` -> `endpoint_private_access`
      - `cluster_endpoint_public_access` -> `endpoint_public_access`
      - `cluster_endpoint_public_access_cidrs` -> `endpoint_public_access_cidrs`
      - `cluster_ip_family` -> `ip_family`
      - `cluster_service_ipv4_cidr` -> `service_ipv4_cidr`
      - `cluster_service_ipv6_cidr` -> `service_ipv6_cidr`
      - `cluster_encryption_config` -> `encryption_config`
      - `create_cluster_primary_security_group_tags` -> `create_primary_security_group_tags`
      - `cluster_timeouts` -> `timeouts`
      - `create_cluster_security_group` -> `create_security_group`
      - `cluster_security_group_id` -> `security_group_id`
      - `cluster_security_group_name` -> `security_group_name`
      - `cluster_security_group_use_name_prefix` -> `security_group_use_name_prefix`
      - `cluster_security_group_description` -> `security_group_description`
      - `cluster_security_group_additional_rules` -> `security_group_additional_rules`
      - `cluster_security_group_tags` -> `security_group_tags`
      - `cluster_encryption_policy_use_name_prefix` -> `encryption_policy_use_name_prefix`
      - `cluster_encryption_policy_name` -> `encryption_policy_name`
      - `cluster_encryption_policy_description` -> `encryption_policy_description`
      - `cluster_encryption_policy_path` -> `encryption_policy_path`
      - `cluster_encryption_policy_tags` -> `encryption_policy_tags`
      - `cluster_addons` -> `addons`
      - `cluster_addons_timeouts` -> `addons_timeouts`
      - `cluster_identity_providers` -> `identity_providers`
    - `eks-managed-node-group` sub-module
      - `cluster_version` -> `kubernetes_version`
    - `self-managed-node-group` sub-module
      - `cluster_version` -> `kubernetes_version`
      - `delete_timeout` -> `timeouts`
    - `fargate-profile` sub-module
      - None
    - `karpenter` sub-module
      - None
3. Added variables:
    - `region`
    - `eks-managed-node-group` sub-module
      - `region`
      - `partition` - added to reduce number of `GET` requests from data sources when possible
      - `account_id` - added to reduce number of `GET` requests from data sources when possible
      - `create_security_group`
      - `security_group_name`
      - `security_group_use_name_prefix`
      - `security_group_description`
      - `security_group_ingress_rules`
      - `security_group_egress_rules`
      - `security_group_tags`
    - `self-managed-node-group` sub-module
      - `region`
      - `partition` - added to reduce number of `GET` requests from data sources when possible
      - `account_id` - added to reduce number of `GET` requests from data sources when possible
      - `create_security_group`
      - `security_group_name`
      - `security_group_use_name_prefix`
      - `security_group_description`
      - `security_group_ingress_rules`
      - `security_group_egress_rules`
      - `security_group_tags`
    - `fargate-profile` sub-module
      - `region`
      - `partition` - added to reduce number of `GET` requests from data sources when possible
      - `account_id` - added to reduce number of `GET` requests from data sources when possible
    - `karpenter` sub-module
      - `region`
4. Removed outputs:
    - `eks-managed-node-group` sub-module
      - `platform` - this is superseded by `ami_type`
      - `autoscaling_group_schedule_arns`
    - `self-managed-node-group` sub-module
      - `platform` - this is superseded by `ami_type`
      - `autoscaling_group_schedule_arns`
    - `fargate-profile` sub-module
      - None
    - `karpenter` sub-module
      - None
5. Renamed outputs:
    - `eks-managed-node-group` sub-module
      - None
    - `self-managed-node-group` sub-module
      - None
    - `fargate-profile` sub-module
      - None
    - `karpenter` sub-module
      - None
6. Added outputs:
    - `eks-managed-node-group` sub-module
      - `security_group_arn`
      - `security_group_id`
    - `self-managed-node-group` sub-module
      - `security_group_arn`
      - `security_group_id`
    - `fargate-profile` sub-module
      - None
    - `karpenter` sub-module
      - None
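
To make the renamed variables listed above concrete, here is a minimal sketch of a `v21.x` module call with the previous `v20.x` names noted in comments. The values and the `module.vpc` references are hypothetical; only the variable names are taken from the rename list.

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 21.0"

  name               = "example" # was `cluster_name` (hypothetical value)
  kubernetes_version = "1.33"    # was `cluster_version`

  endpoint_public_access = true             # was `cluster_endpoint_public_access`
  enabled_log_types      = ["api", "audit"] # was `cluster_enabled_log_types`

  # was `cluster_addons`
  addons = {
    coredns    = {}
    kube-proxy = {}
    vpc-cni    = {}
  }

  vpc_id     = module.vpc.vpc_id          # hypothetical VPC module
  subnet_ids = module.vpc.private_subnets # hypothetical VPC module
}
```

Running `terraform plan` after the renames is a reasonable way to check that any remaining differences come from the changed defaults rather than from the renames themselves.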
## Upgrade Migrations
### Before 20.x Example
```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  # Truncated for brevity ...
  # Renamed variables are not shown here, please refer to the full list above.

  enable_efa_support = true

  eks_managed_node_group_defaults = {
    iam_role_additional_policies = {
      AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
    }
  }

  eks_managed_node_groups = {
    efa = {
      ami_type       = "AL2023_x86_64_NVIDIA"
      instance_types = ["p5e.48xlarge"]

      enable_efa_support = true
      enable_efa_only    = true
    }
  }

  self_managed_node_groups = {
    example = {
      use_mixed_instances_policy = true
      mixed_instances_policy = {
        instances_distribution = {
          on_demand_base_capacity                  = 0
          on_demand_percentage_above_base_capacity = 0
          on_demand_allocation_strategy            = "lowest-price"
          spot_allocation_strategy                 = "price-capacity-optimized"
        }

        # ASG configuration
        override = [
          {
            instance_requirements = {
              cpu_manufacturers                           = ["intel"]
              instance_generations                        = ["current", "previous"]
              spot_max_price_percentage_over_lowest_price = 100

              vcpu_count = {
                min = 1
              }

              allowed_instance_types = ["t*", "m*"]
            }
          }
        ]
      }
    }
  }
}
```
### After 21.x Example
```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 21.0"

  # Truncated for brevity ...
  # Renamed variables are not shown here, please refer to the full list above.

  eks_managed_node_groups = {
    efa = {
      ami_type       = "AL2023_x86_64_NVIDIA"
      instance_types = ["p5e.48xlarge"]

      iam_role_additional_policies = {
        AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
      }

      enable_efa_support = true

      # The module no longer selects a subnet automatically;
      # `subnet_ids` expects a list, so the single subnet is wrapped in one
      subnet_ids = [element(module.vpc.private_subnets, 0)]
    }
  }

  self_managed_node_groups = {
    example = {
      use_mixed_instances_policy = true
      mixed_instances_policy = {
        instances_distribution = {
          on_demand_base_capacity                  = 0
          on_demand_percentage_above_base_capacity = 0
          on_demand_allocation_strategy            = "lowest-price"
          spot_allocation_strategy                 = "price-capacity-optimized"
        }

        # ASG configuration
        # Need to wrap in `launch_template` now
        launch_template = {
          override = [
            {
              instance_requirements = {
                cpu_manufacturers                           = ["intel"]
                instance_generations                        = ["current", "previous"]
                spot_max_price_percentage_over_lowest_price = 100

                vcpu_count = {
                  min = 1
                }

                allowed_instance_types = ["t*", "m*"]
              }
            }
          ]
        }
      }
    }
  }
}
```
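
For the Karpenter sub-module changes listed earlier, a migration might look roughly like the sketch below. It assumes the cluster is defined as `module "eks"` and that the removed arguments were set in the existing `v20.x` configuration; the commented-out values are hypothetical placeholders for whatever was configured previously.

```hcl
module "karpenter" {
  source  = "terraform-aws-modules/eks/aws//modules/karpenter"
  version = "~> 21.0"

  cluster_name = module.eks.cluster_name

  # Removed in v21.x; delete these from existing configurations:
  # enable_v1_permissions           = true
  # enable_irsa                     = true
  # irsa_oidc_provider_arn          = module.eks.oidc_provider_arn
  # irsa_namespace_service_accounts = ["kube-system:karpenter"]

  # Now `true` by default; shown explicitly for clarity
  create_pod_identity_association = true
}
```

EKS Pod Identity relies on the EKS Pod Identity Agent add-on running in the cluster, so it is worth confirming that the add-on is enabled before removing the IRSA configuration.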
### State Changes
No state changes required.
+24 -51
View File
@@ -56,22 +56,34 @@ Refer to the [EKS Managed Node Group documentation](https://docs.aws.amazon.com/
```hcl
eks_managed_node_groups = {
custom_ami = {
ami_id = "ami-0caf35bc73450c396"
ami_id = "ami-0caf35bc73450c396"
ami_type = "AL2023_x86_64_STANDARD"
# By default, EKS managed node groups will not append bootstrap script;
# this adds it back in using the default template provided by the module
# Note: this assumes the AMI provided is an EKS optimized AMI derivative
enable_bootstrap_user_data = true
pre_bootstrap_user_data = <<-EOT
export FOO=bar
EOT
cloudinit_pre_nodeadm = [{
content = <<-EOT
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
kubelet:
config:
shutdownGracePeriod: 30s
EOT
content_type = "application/node.eks.aws"
}]
# Because we have full control over the user data supplied, we can also run additional
# scripts/configuration changes after the bootstrap script has been run
post_bootstrap_user_data = <<-EOT
echo "you are free little kubelet!"
EOT
# This is only possible when `ami_id` is specified, indicating a custom AMI
cloudinit_post_nodeadm = [{
content = <<-EOT
echo "All done"
EOT
content_type = "text/x-shellscript; charset=\"us-ascii\""
}]
}
}
```
@@ -113,9 +125,9 @@ Refer to the [Self Managed Node Group documentation](https://docs.aws.amazon.com
1. The `self-managed-node-group` uses the latest AWS EKS Optimized AMI (Linux) for the given Kubernetes version by default:
```hcl
cluster_version = "1.33"
kubernetes_version = "1.33"
# This self managed node group will use the latest AWS EKS Optimized AMI for Kubernetes 1.27
# This self managed node group will use the latest AWS EKS Optimized AMI for Kubernetes 1.33
self_managed_node_groups = {
default = {}
}
@@ -124,7 +136,7 @@ Refer to the [Self Managed Node Group documentation](https://docs.aws.amazon.com
2. To use Bottlerocket, specify the `ami_type` as one of the respective `BOTTLEROCKET_*` types and supply a Bottlerocket OS AMI:
```hcl
cluster_version = "1.33"
kubernetes_version = "1.33"
self_managed_node_groups = {
bottlerocket = {
@@ -139,42 +151,3 @@ See the [`examples/self-managed-node-group/` example](https://github.com/terrafo
### Fargate Profiles
Fargate profiles are straightforward to use and therefore no further details are provided here. See the [`tests/fargate-profile/` tests](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/tests/fargate-profile) for a working example of various configurations.
### Default Configurations
Each type of compute resource (EKS managed node group, self managed node group, or Fargate profile) provides the option for users to specify a default configuration. These default configurations can be overridden from within the compute resource's individual definition. The order of precedence for configurations (from highest to least precedence):
- Compute resource individual configuration
- Compute resource family default configuration (`eks_managed_node_group_defaults`, `self_managed_node_group_defaults`, `fargate_profile_defaults`)
- Module default configuration (see `variables.tf` and `node_groups.tf`)
For example, the following creates 4 AWS EKS Managed Node Groups:
```hcl
eks_managed_node_group_defaults = {
ami_type = "AL2_x86_64"
disk_size = 50
instance_types = ["m6i.large", "m5.large", "m5n.large", "m5zn.large"]
}
eks_managed_node_groups = {
# Uses module default configurations overridden by configuration above
default = {}
# This further overrides the instance types used
compute = {
instance_types = ["c5.large", "c6i.large", "c6d.large"]
}
# This further overrides the instance types and disk size used
persistent = {
disk_size = 1024
instance_types = ["r5.xlarge", "r6i.xlarge", "r5b.xlarge"]
}
# This overrides the OS used
bottlerocket = {
ami_type = "BOTTLEROCKET_x86_64"
}
}
```
+33 -10
View File
@@ -12,23 +12,44 @@
`disk_size`, and `remote_access` can only be set when using the EKS managed node group default launch template. This module defaults to providing a custom launch template to allow for custom security groups, tag propagation, etc. If you wish to forgo the custom launch template route, you can set `use_custom_launch_template = false` and then you can set `disk_size` and `remote_access`.
### I received an error: `expect exactly one securityGroup tagged with kubernetes.io/cluster/<NAME> ...`
### I received an error: `expect exactly one securityGroup tagged with kubernetes.io/cluster/<CLUSTER_NAME> ...`
⚠️ `<CLUSTER_NAME>` would be the name of your cluster
By default, EKS creates a cluster primary security group that is created outside of the module and the EKS service adds the tag `{ "kubernetes.io/cluster/<CLUSTER_NAME>" = "owned" }`. This on its own does not cause any conflicts for addons such as the AWS Load Balancer Controller until users decide to attach both the cluster primary security group and the shared node security group created by the module (by setting `attach_cluster_primary_security_group = true`). The issue is not with having multiple security groups in your account with this tag key:value combination, but having multiple security groups with this tag key:value combination attached to nodes in the same cluster. There are a few ways to resolve this depending on your use case/intentions:
⚠️ `<CLUSTER_NAME>` below needs to be replaced with the name of your cluster
1. If you want to use the cluster primary security group, you can disable the creation of the shared node security group with:
```hcl
create_node_security_group = false # default is true
attach_cluster_primary_security_group = true # default is false
create_node_security_group = false # default is true
eks_managed_node_groups = {
  example = {
    attach_cluster_primary_security_group = true # default is false
  }
}
# Or for self-managed
self_managed_node_groups = {
  example = {
    attach_cluster_primary_security_group = true # default is false
  }
}
```
2. If you do not want to use the cluster primary security group, leave it unattached. The cluster primary security group has quite broad access; the module instead provides a security group with the minimum access needed to launch an empty EKS cluster successfully, and users are encouraged to open up access as necessary to support their workloads.
```hcl
attach_cluster_primary_security_group = false # this is the default for the module
eks_managed_node_groups = {
  example = {
    attach_cluster_primary_security_group = false # this is the default for the module
  }
}
# Or for self-managed
self_managed_node_groups = {
  example = {
    attach_cluster_primary_security_group = false # this is the default for the module
  }
}
```
In theory, if you are attaching the cluster primary security group, you shouldn't need to use the shared node security group created by the module. However, this is left up to users to decide for their requirements and use case.
@@ -58,6 +79,8 @@ If you require a public endpoint, setting up both (public and private) and restr
The module is configured to ignore this value. Unfortunately, Terraform does not support variables within the `lifecycle` block. The setting is ignored to allow autoscaling via controllers such as cluster autoscaler or Karpenter to work properly and without interference by Terraform. Changing the desired count must be handled outside of Terraform once the node group is created.
:info: See [this](https://github.com/bryantbiggs/eks-desired-size-hack) for a workaround to this limitation.
### How do I access compute resource attributes?
Examples of accessing the attributes of the compute resource(s) created by the root module are shown below. Note - the assumption is that your cluster module definition is named `eks` as in `module "eks" { ... }`:
@@ -90,6 +113,10 @@ aws eks describe-addon-versions --query 'addons[*].addonName'
### What configuration values are available for an add-on?
> [!NOTE]
> The available configuration values will vary between add-on versions,
> typically more configuration values will be added in later versions as functionality is enabled by EKS.
You can retrieve the configuration value schema for a given addon using the following command:
```sh
@@ -286,7 +313,3 @@ Returns (at the time of writing):
}
}
```
> [!NOTE]
> The available configuration values will vary between add-on versions,
> typically more configuration values will be added in later versions as functionality is enabled by EKS.
+1 -1
View File
@@ -27,7 +27,7 @@ See the example snippet below which adds additional security group rules to the
```hcl
...
# Extend cluster security group rules
cluster_security_group_additional_rules = {
security_group_additional_rules = {
egress_nodes_ephemeral_ports_tcp = {
description = "To node 1025-65535"
protocol = "tcp"
+20 -4
View File
@@ -10,7 +10,8 @@ Users can see the various methods of using and providing user data through the [
- AMI types of `BOTTLEROCKET_*`, user data must be in TOML format
- AMI types of `WINDOWS_*`, user data must be in powershell/PS1 script format
- Self Managed Node Groups
- `AL2_x86_64` AMI type (default) -> the user data template (bash/shell script) provided by the module is used as the default; users are able to provide their own user data template
- `AL2_*` AMI types -> the user data template (bash/shell script) provided by the module is used as the default; users are able to provide their own user data template
- `AL2023_*` AMI types -> the user data template (MIME multipart format) provided by the module is used as the default; users are able to provide their own user data template
- `BOTTLEROCKET_*` AMI types -> the user data template (TOML file) provided by the module is used as the default; users are able to provide their own user data template
- `WINDOWS_*` AMI types -> the user data template (powershell/PS1 script) provided by the module is used as the default; users are able to provide their own user data template
@@ -24,9 +25,24 @@ When using an EKS managed node group, users have 2 primary routes for interactin
- Users can use the following variables to facilitate this process:
```hcl
pre_bootstrap_user_data = "..."
```
For `AL2_*`, `BOTTLEROCKET_*`, and `WINDOWS_*`:
```hcl
pre_bootstrap_user_data = "..."
```
For `AL2023_*`:
```hcl
cloudinit_pre_nodeadm = [{
content = <<-EOT
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
...
EOT
content_type = "application/node.eks.aws"
}]
```
2. If a custom AMI is used, then per the [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-custom-ami), users will need to supply the necessary user data to bootstrap and register nodes with the cluster when launched. There are two routes to facilitate this bootstrapping process:
- If the AMI used is a derivative of the [AWS EKS Optimized AMI](https://github.com/awslabs/amazon-eks-ami), users can opt in to using a template provided by the module that provides the minimum necessary configuration to bootstrap the node when launched: