Cost optimization in EKS Auto Mode
EKS Auto Mode continuously optimizes your cluster’s compute costs through consolidation, bin-packing, and right-sizing. However, certain workload configurations can prevent these optimizations. This topic explains how cost optimization works, what can block it, and how to configure your cluster to maintain cost efficiency.
How EKS Auto Mode optimizes cost
EKS Auto Mode reduces compute costs through the following mechanisms:
- Bin-packing: When scheduling pods onto nodes, EKS Auto Mode selects instance types that closely match the aggregate resource requests, minimizing unused capacity.
- Consolidation: EKS Auto Mode periodically evaluates running nodes and replaces or removes them when workloads can run on fewer or less expensive instances.
- Right-sizing: As workloads scale down, EKS Auto Mode consolidates pods onto smaller nodes and terminates underutilized instances.
These optimizations run continuously without manual intervention. However, certain pod annotations and NodePool configurations can prevent consolidation from taking effect.
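Because bin-packing works from aggregate resource requests, it helps when workloads declare requests that reflect what they need. The following is a minimal sketch of a Deployment that does so. The name, labels, image, and request values are hypothetical placeholders, not values from this guide.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example.com/web:1.0   # hypothetical image
          resources:
            requests:
              cpu: 500m          # 3 replicas x 500m = 1.5 vCPU requested in total
              memory: 512Mi      # 3 replicas x 512Mi = 1.5 GiB requested in total
```

With requests like these, the aggregate for the three replicas is 1.5 vCPU and 1.5 GiB of memory, which is the figure that instance selection has to fit.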
Built-in node pools and cost guardrails
The built-in general-purpose and system node pools already enforce several cost-protective defaults:
- Instance families restricted to C, M, and R: No accelerated (P, G, Inf, Trn) or exotic instance types are permitted.
- On-demand capacity only: No Spot instances, which avoids interruption-driven churn but also means no Spot savings.
- Generation 5 or newer: Older, less cost-efficient instance generations are excluded.
If you are using only the built-in node pools, you already benefit from these guardrails. The guidance in this topic on excluding instance families and constraining instance sizes is most relevant when you create custom NodePools, which do not inherit these restrictions.
However, even with built-in node pools, the following sections still apply to you:
- What blocks consolidation: The do-not-disrupt annotation and restrictive PDBs block consolidation regardless of which NodePool provisioned the node.
- Use NodePool limits as a cost ceiling: The built-in node pools do not have resource limits configured. If your workloads can scale significantly, consider creating a custom NodePool with limits rather than relying on the unbounded built-in pools.
- Node lifecycle and cost: Node replacement overlap applies to all nodes, including those provisioned by built-in pools.
| Guardrail | Built-in node pools | Custom NodePools |
|---|---|---|
| Accelerated instance exclusion | Enforced | You must configure |
| Instance size limits | Not set | You must configure |
| Resource limits | Not set | You must configure |
| On-demand only | Enforced | You choose (Spot/On-Demand) |
| Consolidation protection (do-not-disrupt, PDBs) | Your responsibility | Your responsibility |
What blocks consolidation
Consolidation is blocked when EKS Auto Mode determines that disrupting a node would violate a workload’s availability requirements. The following configurations prevent consolidation:
The do-not-disrupt annotation
The karpenter.sh/do-not-disrupt annotation instructs EKS Auto Mode to preserve a node as long as the annotated pod is running on it. This prevents the node from being consolidated, replaced, or terminated, even if the node is underutilized.

    metadata:
      annotations:
        karpenter.sh/do-not-disrupt: "true"

Important
Cost implication: When a pod carries the do-not-disrupt annotation, the node it runs on is exempt from consolidation. This means:
- The node continues to run at its current instance size regardless of actual utilization.
- The vCPU and memory provisioned on that node, and therefore billed, stay the same even as the workload’s demand decreases.
- If many pods across many nodes carry this annotation, cluster-wide consolidation is significantly reduced, leading to sustained higher costs.
The do-not-disrupt annotation is an availability mechanism. It does not account for cost. Use it only for workloads where mid-execution disruption causes data loss or significant rework — for example, long-running batch jobs or stateful processes without checkpointing.
Alternatives to consider:
- Pod Disruption Budgets (PDBs): Use PDBs to control the rate of disruption rather than blocking it entirely. PDBs allow consolidation to proceed while ensuring a minimum number of replicas remain available.
- Shorter-lived workloads: For CI/CD runners and build agents, allow disruption and rely on your CI system’s built-in retry logic rather than using do-not-disrupt.
- Time-limited annotations: Apply do-not-disrupt only for the duration of a critical operation, then remove it programmatically when the operation completes.
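For a workload that does warrant the annotation, such as the long-running batch job mentioned earlier, one way to keep the protection bounded is to set the annotation on the Job's pod template rather than on a long-lived Deployment. The following is a minimal sketch. The Job name, image, and request values are hypothetical, and it assumes, as described above, that the node is preserved only while the annotated pod is running on it.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report            # hypothetical Job name
spec:
  template:
    metadata:
      annotations:
        # The annotation belongs on the pod template, not on the Job's own metadata.
        karpenter.sh/do-not-disrupt: "true"
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: example.com/nightly-report:latest   # hypothetical image
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
```

Under that assumption, the node becomes eligible for consolidation again once the Job's pod finishes, without a separate step to remove the annotation.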
Pod Disruption Budgets (PDBs)
A PDB that sets maxUnavailable: 0, or sets minAvailable equal to the current replica count, allows no voluntary evictions. As a result, it blocks consolidation of every node that runs one of the affected pods. Review your PDBs to ensure that they permit at least one pod to be disrupted at a time.
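As an illustration, the following sketch shows a PDB that permits one pod to be disrupted at a time. The PDB name and the app: web label are hypothetical. Match the selector to your own workload's labels.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                  # hypothetical name
spec:
  # Allow at most one matching pod to be evicted at any time,
  # so nodes can still be drained one pod at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: web                   # hypothetical label
```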
Use NodePool limits as a cost ceiling
NodePool limits set a hard ceiling on the total compute resources that a NodePool can provision. When the limit is reached, EKS Auto Mode stops launching new nodes for that NodePool. This happens even if pods are pending.
Use limits as a cost guardrail, particularly for NodePools that serve non-production, test, or bursty workloads where unbounded scaling is not appropriate.

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: ci-runners
    spec:
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: default
          requirements:
            - key: "eks.amazonaws.com/instance-category"
              operator: In
              values: ["c", "m"]
      limits:
        cpu: "500"
        memory: 1000Gi

In this example, the ci-runners NodePool cannot exceed 500 vCPUs or 1000 GiB of memory in total across all nodes it provisions. Pods that would need capacity beyond this limit remain in the Pending state until capacity is freed.
Tip
Set limits based on your expected maximum burst size plus a buffer for node replacement. Review your NodePool utilization regularly and adjust limits as workload patterns change.
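The following sketch works through one way to arrive at the limit values used in the ci-runners example. The runner count, per-runner requests, and buffer percentage are hypothetical inputs chosen for illustration. Substitute figures from your own workloads.

```yaml
# Hypothetical sizing inputs (assumptions, not measurements):
#   peak concurrent runners:              100
#   requests per runner:                  4 vCPU, 8 GiB
#   buffer for node replacement overlap:  25%
#
# cpu:    100 x 4 vCPU = 400 vCPU   ->  400 x 1.25 = 500 vCPU
# memory: 100 x 8 GiB  = 800 GiB    ->  800 x 1.25 = 1000 GiB
spec:
  limits:
    cpu: "500"
    memory: 1000Gi
```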
Exclude instance families for cost control
By default, EKS Auto Mode selects from a broad range of instance types to maximize scheduling flexibility. For workloads that do not require specialized hardware, restrict the instance families to prevent expensive instance types from being launched.
Exclude accelerated instances
If your workloads do not request GPU or accelerator resources, exclude accelerated instance families from your NodePool. This prevents scenarios where accelerated instances are selected during capacity constraints.

    spec:
      template:
        spec:
          requirements:
            - key: "eks.amazonaws.com/instance-category"
              operator: In
              values: ["c", "m", "r"]

By specifying only compute-optimized, general-purpose, and memory-optimized categories, you exclude accelerated (P, G, Inf, Trn) and other specialized instance families from selection.
How instance selection interacts with capacity constraints
EKS Auto Mode deprioritizes accelerated and exotic instance types during normal instance selection. However, when sustained launch failures occur, EKS Auto Mode will launch from any remaining available instance type to prioritize workload availability. For example, this happens when EC2 service quotas are temporarily exhausted for all preferred instance types.
To prevent this fallback behavior, explicitly constrain your NodePool requirements to only the instance categories your workload needs. When preferred types are unavailable and no other types are permitted by your NodePool configuration, pods remain in Pending state rather than being scheduled onto expensive instances.
Constrain instance sizes
In addition to restricting instance families, you can limit the maximum instance size within your NodePool. Constraining instance sizes limits the cost exposure from any single node that cannot be consolidated. For example, a node blocked by a do-not-disrupt annotation cannot shrink even if its workload is small.
Use the eks.amazonaws.com/instance-cpu label to limit maximum instance sizes in your NodePool requirements:

    requirements:
      - key: "eks.amazonaws.com/instance-cpu"
        operator: Lte
        values: ["32"]

This configuration prevents EKS Auto Mode from launching instances larger than 32 vCPUs in this NodePool.
To identify optimization opportunities in an existing cluster, review your largest running instances. A large node that is blocked from consolidation can hold more idle capacity than a small one, so each blocked large node costs proportionally more.
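If you also want a lower bound, for example to keep instances within a moderate range such as 4 to 32 vCPUs, one possible approach is to add a second requirement on the same label. The following is a sketch, not a verified configuration. The lower bound of 4 vCPUs is a hypothetical choice, and the sketch assumes that your NodePool accepts two requirements on the same key and supports the Gt operator alongside the Lte operator shown above.

```yaml
requirements:
  # Assumed lower bound: Gt "3" admits instances with 4 or more vCPUs.
  - key: "eks.amazonaws.com/instance-cpu"
    operator: Gt
    values: ["3"]
  # Upper bound, copied from the example above: at most 32 vCPUs.
  - key: "eks.amazonaws.com/instance-cpu"
    operator: Lte
    values: ["32"]
```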
Recommended patterns for bursty workloads
CI/CD pipelines, batch jobs, and ephemeral runners create burst-and-idle patterns that require specific configuration to maintain cost efficiency.
Recommended defaults for bursty workloads
| Configuration | Recommendation |
|---|---|
| do-not-disrupt annotation | Do not use for CI/CD runners. Rely on your CI system’s retry and queue mechanisms instead. |
| NodePool limits | Set a CPU/memory ceiling based on your maximum expected concurrency plus a buffer for node replacement overlap. |
| Instance categories | Restrict to the categories your workloads need (for example, c, m, and r). |
| Instance sizes | Consider constraining to moderate sizes (for example, 4–32 vCPUs) to limit cost exposure from any single node that gets blocked from consolidation. |
| Consolidation timing | Use the default consolidationPolicy (WhenEmptyOrUnderutilized). |
| Capacity type | Use Spot instances for fault-tolerant runners. Combine with On-Demand for build agents that hold state during execution. |
Example: CI runner NodePool

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: ci-runners
    spec:
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: default
          requirements:
            - key: "eks.amazonaws.com/instance-category"
              operator: In
              values: ["c", "m"]
            - key: "eks.amazonaws.com/instance-cpu"
              operator: Lte
              values: ["32"]
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["spot", "on-demand"]
      limits:
        cpu: "500"
        memory: 1000Gi
      disruption:
        consolidationPolicy: WhenEmptyOrUnderutilized
        consolidateAfter: 30s

This configuration:
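- Restricts selection to the compute-optimized (c) and general-purpose (m) instance categories, with at most 32 vCPUs per instance
- Caps total NodePool capacity at 500 vCPUs and 1000 GiB of memory
- Allows aggressive consolidation (30 seconds after pods are removed)
- Permits both Spot and On-Demand capacity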
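To place a runner on this NodePool, a pod can select it by label. The following is a minimal sketch. The pod name, image, and request values are hypothetical, and it assumes that nodes provisioned by the NodePool carry the standard karpenter.sh/nodepool label. Note that the pod does not carry the do-not-disrupt annotation, in line with the recommendation in the preceding table.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ci-runner-example        # hypothetical pod name
spec:
  # Assumption: nodes from the NodePool are labeled karpenter.sh/nodepool=<name>.
  nodeSelector:
    karpenter.sh/nodepool: ci-runners
  restartPolicy: Never
  containers:
    - name: runner
      image: example.com/ci-runner:latest   # hypothetical image
      resources:
        requests:
          cpu: "4"
          memory: 8Gi
```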
Node lifecycle and cost
EKS Auto Mode replaces nodes through graceful disruption when they drift from their desired specification (for example, after the release of a new Auto Mode AMI) or when the node’s lifetime approaches expiry. During graceful replacement:
- A new replacement node is launched and becomes ready.
- Pods are drained from the old node, respecting Pod Disruption Budgets.
- For a brief period, both the old and replacement nodes are running simultaneously.
For clusters with large or many nodes, this overlap can create periodic cost increases. To minimize the impact:
- Review disruption budgets: Ensure your disruption budgets permit timely draining. Restrictive budgets extend the overlap period during which both old and new nodes are running.
- Right-size instances: Smaller instances reduce the absolute cost of the overlap window.
- Reduce maximum node lifetime: Shorter expiry values (for example, 7 days) create more frequent but smaller replacement events. This spreads the cost more evenly over time rather than concentrating it.
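As a sketch of the last item, the following NodePool fragment sets a 7-day node lifetime. It assumes that node expiry is controlled through the expireAfter field of the karpenter.sh/v1 NodePool template, which this topic does not show elsewhere. Confirm the field and its permitted range for your cluster before applying it.

```yaml
spec:
  template:
    spec:
      # 7 days x 24 hours = 168 hours
      expireAfter: 168h
```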
For more information about node lifecycle, see Learn about Amazon EKS Auto Mode Managed instances.