Idempotency and retries¶
Replay and retry can each run the same operation more than once. An operation with a side effect repeats that side effect on every run. An operation is idempotent when running it many times has the same effect as running it once.
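As a minimal illustration of the difference, the sketch below runs two hypothetical operations three times each. The function names and the in-memory dictionaries are assumptions made for the example and are not part of any SDK.

```python
def set_status(store: dict, order_id: str, status: str) -> None:
    """Idempotent: running it again leaves the store in the same state."""
    store[order_id] = status


def add_credit(balances: dict, account: str, amount: int) -> None:
    """Not idempotent: every run adds the amount again."""
    balances[account] = balances.get(account, 0) + amount


store: dict = {}
balances: dict = {}

# Simulate the same operation running three times (for example, after replays).
for _ in range(3):
    set_status(store, "order-1", "shipped")
    add_credit(balances, "acct-1", 10)

print(store)     # {'order-1': 'shipped'}
print(balances)  # {'acct-1': 30}
```

The assignment converges to the same state no matter how often it runs, while the increment accumulates once per run.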
At-most-once vs at-least-once¶
A step has an execution semantic that controls what happens when an attempt fails to complete, for example when the Lambda sandbox dies, the network drops, or the invocation times out partway through the step.
- At-least-once per retry (default). The SDK initiates a START checkpoint and then immediately runs the code inside the step without waiting for the checkpoint response. If the attempt does not complete due to an interruption, the step runs again on replay. This is safe only for idempotent operations.
- At-most-once per retry. The SDK waits for confirmation that the START checkpoint committed before running the code inside the step. If the attempt does not complete due to an interruption, the SDK marks the step as interrupted on replay and raises a StepInterrupted error instead of re-executing.
Both semantics are per retry attempt. Neither guarantees the step runs exactly once across the entire workflow. A retry strategy that retries on failure will run the step again even under at-most-once per retry. To limit a step to a single execution attempt end-to-end, combine at-most-once with a no-retry strategy.
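The two semantics can be pictured with a toy model. The sketch below is not the SDK: `run_step`, `Crash`, the in-memory `log`, and the local `StepInterrupted` class are all hypothetical stand-ins, and the toy log commits synchronously, which glosses over the checkpoint-confirmation difference described above. It only shows what happens on replay when an attempt started but never finished.

```python
class StepInterrupted(Exception):
    """Toy stand-in for the interruption error; not the SDK's class."""


class Crash(Exception):
    """Simulates the sandbox dying partway through an attempt."""


def run_step(log: dict, name: str, fn, at_most_once: bool):
    """Run fn under a toy checkpoint log that survives crashes."""
    entry = log.get(name)
    if isinstance(entry, tuple):
        return entry[1]  # completed earlier: replay returns the stored result
    if entry == "STARTED" and at_most_once:
        # A previous attempt started but never finished: do not re-execute.
        raise StepInterrupted(name)
    log[name] = "STARTED"
    result = fn()
    log[name] = ("DONE", result)
    return result


def simulate(at_most_once: bool):
    log, calls = {}, []

    def charge():
        calls.append("charge")  # the external side effect
        if len(calls) == 1:
            raise Crash()       # interruption after the side effect happened
        return "receipt"

    try:
        run_step(log, "charge", charge, at_most_once)
    except Crash:
        pass  # the invocation died; the log still holds the START entry
    try:
        outcome = run_step(log, "charge", charge, at_most_once)  # replay
    except StepInterrupted:
        outcome = "interrupted"
    return outcome, len(calls)


print(simulate(at_most_once=False))  # ('receipt', 2)
print(simulate(at_most_once=True))   # ('interrupted', 1)
```

In the at-least-once run the side effect happens twice, which is harmless only if it is idempotent. In the at-most-once run it happens once and the workflow must handle the interruption explicitly.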
Tip
At-least-once is the default. Any interruption re-runs the step on replay, so the code inside must be safe to run more than once.
Match the semantic with the side effect¶
- At-least-once is for idempotent operations, such as reads that do not mutate state, writes to an upsert-capable store, calls that accept an idempotency key, and anything else that is safe to re-run.
- At-most-once is for operations with external side effects such as charging a payment card, sending a one-shot SMS or a POST to a non-idempotent API.
TypeScript
import { StepSemantics } from "@aws/durable-execution-sdk-js";

// At-least-once (default) for a retryable idempotent write.
await context.step("upsert-user", async () => {
  return userStore.upsert(event.user);
});

// At-most-once for a side-effecting call, with retries disabled.
await context.step(
  "charge-payment",
  async () => paymentService.charge(event.amount, event.cardToken),
  {
    semantics: StepSemantics.AtMostOncePerRetry,
    retryStrategy: () => ({ shouldRetry: false }),
  },
);
Python
from aws_durable_execution_sdk_python.config import StepConfig, StepSemantics
from aws_durable_execution_sdk_python.retries import RetryPresets

# At-least-once (default) for a retryable idempotent write.
context.step(upsert_user(event["user"]), name="upsert-user")

# At-most-once for a side-effecting call, with retries disabled.
context.step(
    charge_payment(event["amount"], event["card_token"]),
    name="charge-payment",
    config=StepConfig(
        step_semantics=StepSemantics.AT_MOST_ONCE_PER_RETRY,
        retry_strategy=RetryPresets.none(),
    ),
)
Java
import software.amazon.lambda.durable.config.StepConfig;
import software.amazon.lambda.durable.config.StepSemantics;
import software.amazon.lambda.durable.retry.RetryStrategies;

// At-least-once (default).
context.step("upsert-user", User.class,
    ctx -> userStore.upsert(input.user()));

// At-most-once with retries disabled.
StepConfig critical = StepConfig.builder()
    .semantics(StepSemantics.AT_MOST_ONCE_PER_RETRY)
    .retryStrategy(RetryStrategies.Presets.NO_RETRY)
    .build();
context.step("charge-payment", Receipt.class,
    ctx -> paymentService.charge(input.amount(), input.cardToken()),
    critical);
Warning
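At-most-once applies per attempt, not per workflow. Combine it with a no-retry strategy to limit the step to a single execution attempt end-to-end. This guarantees the step runs at most once, not exactly once: an interrupted attempt is not re-executed.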
Idempotency tokens¶
For external services that support an idempotency key, such as most modern payment APIs, generate a key inside a step once and pass it to every attempt of the side-effecting step. The external service deduplicates repeated requests with the same key, so even at-least-once retries are safe.
Warning
Generate the key inside a step. A key generated outside a step changes on replay, so the external service sees each retry as a new request and can apply the side effect twice.
Python
import uuid

@durable_step
def generate_key(ctx: StepContext) -> str:
    return str(uuid.uuid4())

@durable_step
def charge_with_key(ctx: StepContext, amount: float, card_token: str, key: str) -> dict:
    return payment_service.charge(amount=amount, card_token=card_token, idempotency_key=key)

key = context.step(generate_key(), name="idempotency-key")
receipt = context.step(
    charge_with_key(event["amount"], event["card_token"], key),
    name="charge",
)
The same pattern applies to any operation that writes to an external store. Include the key in the write and rely on the store's deduplication.
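To see why the key has to come from a step, the following sketch simulates a replay with a toy checkpoint store. `replayable_step` and `workflow` are hypothetical helpers written for this illustration, not SDK functions; the only assumption carried over from the text is that a completed step returns its recorded result on replay.

```python
import uuid


def replayable_step(checkpoints: dict, name: str, fn):
    """Return the checkpointed result if present, otherwise run fn and record it."""
    if name not in checkpoints:
        checkpoints[name] = fn()
    return checkpoints[name]


def workflow(checkpoints: dict) -> tuple[str, str]:
    # Generated outside a step: a fresh value on every run of the workflow code.
    outside_key = str(uuid.uuid4())
    # Generated inside a step: recorded once, then reused on replay.
    inside_key = replayable_step(
        checkpoints, "idempotency-key", lambda: str(uuid.uuid4())
    )
    return outside_key, inside_key


checkpoints: dict = {}
first = workflow(checkpoints)
second = workflow(checkpoints)  # simulated replay against the same checkpoints

print(first[0] == second[0])  # False: the outside key changed on replay
print(first[1] == second[1])  # True: the inside key is stable
```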
Tip
When an idempotency-enabled API returns a duplicate-request error on retry, it usually means the first attempt already succeeded. Handle that error as success, not failure.
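A rough sketch of that handling follows. `FakePaymentService` and `DuplicateRequestError` are invented for the example, and the assumption that the error carries the original receipt does not hold for every API; some services return the original response directly, and others require a separate lookup.

```python
class DuplicateRequestError(Exception):
    """Hypothetical error raised when an idempotency key was already used."""

    def __init__(self, original_receipt: dict):
        super().__init__("duplicate request")
        self.original_receipt = original_receipt


class FakePaymentService:
    """In-memory stand-in for a payment API that deduplicates on a key."""

    def __init__(self) -> None:
        self.receipts: dict = {}

    def charge(self, amount: int, idempotency_key: str) -> dict:
        if idempotency_key in self.receipts:
            raise DuplicateRequestError(self.receipts[idempotency_key])
        receipt = {"id": f"rcpt-{len(self.receipts) + 1}", "amount": amount}
        self.receipts[idempotency_key] = receipt
        return receipt


def charge_once(service: FakePaymentService, amount: int, key: str) -> dict:
    """Treat a duplicate-request error as the success of an earlier attempt."""
    try:
        return service.charge(amount, key)
    except DuplicateRequestError as err:
        return err.original_receipt


service = FakePaymentService()
first_receipt = charge_once(service, 500, "key-123")
retry_receipt = charge_once(service, 500, "key-123")  # simulated retry

print(first_receipt == retry_receipt)  # True
print(len(service.receipts))           # 1
```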
Database retry patterns¶
When you own the database, you can make writes idempotent without tokens.
- Conditional writes. DynamoDB PutItem with attribute_not_exists, SQL INSERT ... ON CONFLICT DO NOTHING, or upserts keyed by a stable ID.
- Check then write in a single transaction. Use a transaction that reads the current state and writes the new state atomically.
- Append-only logs. Write events with a deterministic event ID. Readers deduplicate on ID.
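The conditional-write and deterministic-ID patterns above can be sketched with Python's built-in sqlite3 module. The table layout and the `event_id` derivation are assumptions made for the example, and the ON CONFLICT clause requires SQLite 3.24 or later.

```python
import hashlib
import sqlite3


def event_id(order_id: str, event_type: str) -> str:
    """Derive the ID from the event's content so every retry yields the same ID."""
    return hashlib.sha256(f"{order_id}:{event_type}".encode()).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY, order_id TEXT, type TEXT)")


def append_event(order_id: str, event_type: str) -> None:
    """Insert the event; a repeat with the same ID is silently skipped."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO events (id, order_id, type) VALUES (?, ?, ?) "
            "ON CONFLICT(id) DO NOTHING",
            (event_id(order_id, event_type), order_id, event_type),
        )


# Simulate the same write being retried three times.
for _ in range(3):
    append_event("order-1", "shipped")

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 1
```

Because the primary key is derived from the event itself, the database enforces deduplication and the calling code needs no extra bookkeeping.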
Avoid INSERT without a uniqueness constraint and counter increments without a conditional check wherever a retry could apply them twice.
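One way to guard a counter increment is to record an operation ID in the same transaction as the update. This is a sketch under assumptions: the `applied_ops` table, the `op_id` values, and the `increment_once` helper are hypothetical, and sqlite3 stands in for whatever database you own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")
conn.execute("CREATE TABLE applied_ops (op_id TEXT PRIMARY KEY)")
conn.execute("INSERT INTO counters (name, value) VALUES ('signups', 0)")
conn.commit()


def increment_once(op_id: str, counter: str) -> bool:
    """Increment the counter unless op_id was already applied; return True if applied."""
    try:
        with conn:  # one transaction: both statements commit, or neither does
            # Fails with IntegrityError if this op_id was recorded before.
            conn.execute("INSERT INTO applied_ops (op_id) VALUES (?)", (op_id,))
            conn.execute(
                "UPDATE counters SET value = value + 1 WHERE name = ?", (counter,)
            )
    except sqlite3.IntegrityError:
        return False
    return True


# Simulate three attempts of the same logical operation.
results = [increment_once("op-42", "signups") for _ in range(3)]
value = conn.execute(
    "SELECT value FROM counters WHERE name = 'signups'"
).fetchone()[0]

print(results)  # [True, False, False]
print(value)    # 1
```

The operation ID should itself be stable across retries, for example a value generated inside a step as described under idempotency tokens.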
See also¶
- Determinism and replay: why operations can run more than once to begin with.
- Step design: retry strategy and the boundary between steps.
- Errors and retries