Rate limits and quotas

Claude Platform on AWS assigns Tier 1 rate limits when you subscribe. Anthropic manages rate limits directly, not through AWS quota systems.

Default limits

Claude Platform on AWS uses Anthropic’s standard tier schedule, identical to the first-party Claude API. Tier 1 limits apply per workspace. Limits are pooled by model family (for example, one combined limit covers Claude Opus 4.7, 4.6, 4.5, and earlier Opus models; Sonnet models share a separate combined limit; Haiku models share another).

For the current Tier 1 values for requests per minute (RPM), input tokens per minute (ITPM), and output tokens per minute (OTPM), and for higher-tier thresholds, see Rate limits on the Anthropic documentation website. That page is the source of truth and is updated when limits change.

Rate limit headers

Every response includes headers that report your current rate limit status. Key headers:

anthropic-ratelimit-requests-limit — Maximum requests per minute
anthropic-ratelimit-requests-remaining — Requests remaining in the current window
anthropic-ratelimit-requests-reset — Time when the request limit resets (RFC 3339)
anthropic-ratelimit-tokens-limit — Maximum combined tokens (input + output) per minute
anthropic-ratelimit-tokens-remaining — Combined tokens remaining in the current window
anthropic-ratelimit-tokens-reset — Time when the combined token limit resets (RFC 3339)
anthropic-ratelimit-input-tokens-limit / -remaining / -reset — Input-token-specific headers
anthropic-ratelimit-output-tokens-limit / -remaining / -reset — Output-token-specific headers
retry-after — On a 429 response, the number of seconds to wait before retrying

See Response headers on the Anthropic documentation website for the complete set.
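
As an illustration of reading the headers listed above, the following sketch groups them by category into a nested dictionary. It assumes the response headers are already available as a plain mapping of names to string values, which most HTTP clients can provide. The function name parse_rate_limit_headers and the sample values are hypothetical and not part of any SDK.

```python
from datetime import datetime

PREFIX = "anthropic-ratelimit-"


def parse_rate_limit_headers(headers):
    """Group rate limit headers as {category: {field: value}}.

    Hypothetical helper: `headers` is assumed to be a mapping of header
    names to string values taken from an HTTP response.
    """
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    status = {}
    for name, value in lowered.items():
        if not name.startswith(PREFIX):
            continue
        # "input-tokens-remaining" -> category "input-tokens", field "remaining"
        category, _, field = name[len(PREFIX):].rpartition("-")
        if field in ("limit", "remaining"):
            parsed = int(value)
        elif field == "reset":
            # RFC 3339 timestamp; map a trailing "Z" to an explicit UTC offset
            # so older Python versions can parse it.
            parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        else:
            continue  # ignore fields this sketch does not know about
        status.setdefault(category, {})[field] = parsed
    # retry-after is only expected on 429 responses.
    if "retry-after" in lowered:
        status["retry_after_seconds"] = float(lowered["retry-after"])
    return status


# Made-up sample values, for illustration only.
sample = {
    "anthropic-ratelimit-requests-limit": "50",
    "anthropic-ratelimit-requests-remaining": "49",
    "anthropic-ratelimit-requests-reset": "2025-01-01T00:00:30Z",
    "anthropic-ratelimit-input-tokens-remaining": "29000",
    "retry-after": "12",
}
status = parse_rate_limit_headers(sample)
```

A caller could, for example, check status["requests"]["remaining"] before sending a burst of requests and pause until status["requests"]["reset"] when it reaches zero.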

Requesting higher limits

Unlike the first-party Claude API, automatic tier advancement does not apply on Claude Platform on AWS. To request higher limits, contact your Anthropic account representative with your workspace ID and desired throughput. For tier thresholds and other details, see Rate limits on the Anthropic documentation website.

Rate limit errors

When you exceed a rate limit, the API returns HTTP 429 with a rate_limit_error type. Implement exponential backoff with jitter in your retry logic. The retry-after header indicates how many seconds to wait before retrying.
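
The retry guidance above can be sketched as follows. This is a minimal example under stated assumptions, not a definitive implementation: RateLimitError is a hypothetical stand-in for however your HTTP client surfaces a 429 response, send is any callable that performs one request, and the base delay, cap, and attempt count are illustrative defaults. The sketch uses the "full jitter" variant of exponential backoff and never waits less than the retry-after value.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 rate_limit_error response."""

    def __init__(self, retry_after=None):
        super().__init__("rate_limit_error")
        self.retry_after = retry_after  # seconds from the retry-after header


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return a full-jitter exponential backoff delay in seconds.

    The delay is drawn uniformly from [0, min(cap, base * 2**attempt)) and is
    raised to retry_after when the server supplied a longer wait.
    """
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng() * ceiling
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send(), retrying on RateLimitError up to max_attempts in total."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller handle the error
            sleep(backoff_delay(attempt, err.retry_after))
```

Jitter spreads retries from many clients over time so they do not all hit the API again at the same instant. The sleep and rng parameters exist only so the delay logic can be exercised without real waiting.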
