Service tiers for optimizing performance and cost - Amazon Bedrock

Service tiers for optimizing performance and cost

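Amazon Bedrock offers four service tiers for model inference: Reserved, Priority, Standard, and Flex. With service tiers, you can optimize for availability, cost, and performance.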

Reserved tier

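The Reserved tier lets you reserve prioritized compute capacity for mission-critical applications that cannot tolerate any downtime. You can allocate different input and output tokens-per-minute capacities to match the exact requirements of your workload and to control cost. When your application needs more tokens-per-minute capacity than you reserved, the service automatically overflows to the Standard tier to ensure uninterrupted operation. The Reserved tier targets 99.5% uptime for model responses. Customers can reserve capacity for 1 month or 3 months. Customers pay a fixed price per 1K tokens-per-minute and are billed monthly.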
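The overflow behavior described above can be pictured with a little arithmetic. The following is a minimal sketch, not the service's actual accounting: `ReservedCapacity` and `split_demand` are hypothetical names, the numbers are made up, and it only assumes that demand beyond the reserved tokens-per-minute figure is what overflows to the Standard tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReservedCapacity:
    """Hypothetical record of reserved tokens-per-minute (TPM) capacity."""
    input_tpm: int
    output_tpm: int


def split_demand(reserved_tpm: int, demand_tpm: int) -> tuple[int, int]:
    """Return (tokens covered by the reservation, tokens overflowing to Standard)."""
    if reserved_tpm < 0 or demand_tpm < 0:
        raise ValueError("token counts must be non-negative")
    served = min(reserved_tpm, demand_tpm)
    return served, demand_tpm - served


# Input and output capacity are reserved independently.
capacity = ReservedCapacity(input_tpm=100_000, output_tpm=20_000)

# Illustrative demand for a single minute (assumed values).
input_split = split_demand(capacity.input_tpm, 130_000)
output_split = split_demand(capacity.output_tpm, 15_000)

print(input_split)   # (100000, 30000)
print(output_split)  # (15000, 0)
```

In this illustration, input demand exceeds the reservation by 30,000 tokens in that minute, while output demand stays within the reserved capacity.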

To access the Reserved tier, contact your AWS account team.

Note

Billing continues until you delete the Reserved tier reservation with the help of your AWS account manager.

Priority tier

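The Priority tier delivers faster response times at a price premium over standard on-demand pricing. It is best suited for mission-critical applications with customer-facing business workflows that do not need a 24x7 capacity reservation. The Priority tier requires no prior reservation. You can simply set the optional "service_tier" parameter to "priority" to take advantage of request-level prioritization. Priority tier requests are prioritized over Standard and Flex tier requests.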

Standard tier

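The Standard tier provides consistent performance for everyday AI tasks such as content generation, text analysis, and routine document processing. By default, all inference requests are routed to the Standard tier when the "service_tier" parameter is missing. You can also set the optional "service_tier" parameter to "default" to have inference requests served by the Standard tier.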

Flex tier

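For workloads that can tolerate longer processing times, the Flex tier offers cost-effective processing at a pricing discount. This helps you optimize cost for workloads such as model evaluations, content summarization, and agentic workflows. You can set the optional "service_tier" parameter to "flex" to have inference requests served by the Flex tier at the discounted price.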
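Because the Priority, Standard, and Flex tiers are selected per request, an application can pick a tier based on the kind of work it is submitting. The sketch below is one possible policy under stated assumptions: `choose_service_tier` and `apply_tier` are hypothetical helpers, the payload shape is invented for illustration, and only the "priority" and "flex" values and the default-to-Standard behavior come from the descriptions above.

```python
from typing import Optional


def choose_service_tier(customer_facing: bool, tolerates_delay: bool) -> Optional[str]:
    """Pick a "service_tier" value; None means omit the parameter (Standard tier)."""
    if customer_facing:
        return "priority"  # faster responses at a price premium
    if tolerates_delay:
        return "flex"      # discounted pricing, longer processing times
    return None            # a missing parameter routes to the Standard tier


def apply_tier(payload: dict, tier: Optional[str]) -> dict:
    """Return a copy of payload, adding "service_tier" only when a tier was chosen."""
    return dict(payload) if tier is None else {**payload, "service_tier": tier}


# Hypothetical payloads; the "task" key is only a label for this illustration.
chat = apply_tier({"task": "support-chat"}, choose_service_tier(True, False))
batch = apply_tier({"task": "model-evaluation"}, choose_service_tier(False, True))
routine = apply_tier({"task": "document-processing"}, choose_service_tier(False, False))

print(chat, batch, routine, sep="\n")
```

Leaving the parameter out for routine work relies on the documented default; setting it to "default" explicitly would be equivalent.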

Using the service tier feature

To use the service tier feature, set the optional "service_tier" parameter to "reserved", "priority", "default", or "flex" when calling the Amazon Bedrock runtime API.

"service_tier" : "reserved | priority | default | flex"

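On-demand quota for a model is shared across the Priority, Default (Standard), and Flex service tiers. Your Reserved tier capacity reservation is separate from your on-demand quota. The service tier configuration for a served request is visible in the API response and in AWS CloudTrail events. You can also view service tier metrics in Amazon CloudWatch metrics under ModelId, ServiceTier, and ResolvedServiceTier, where ResolvedServiceTier shows the actual tier that served your request.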
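As a rough illustration of the parameter and of the difference between the requested and the resolved tier, the sketch below validates a "service_tier" value before attaching it to a request payload, then counts how often a hypothetical set of records resolved to a different tier than requested. The helper names, the payload shape, and the record format are assumptions made for this example. Only the four tier values and the ModelId, ServiceTier, and ResolvedServiceTier names come from this page, and the exact strings the service reports for each tier are not confirmed here.

```python
from collections import Counter

# The four values this page lists for the "service_tier" parameter.
VALID_SERVICE_TIERS = ("reserved", "priority", "default", "flex")


def with_service_tier(payload: dict, tier: str = "default") -> dict:
    """Return a copy of payload with "service_tier" set, rejecting unknown values."""
    if tier not in VALID_SERVICE_TIERS:
        raise ValueError(f"unknown service tier: {tier!r}")
    return {**payload, "service_tier": tier}


def count_tier_pairs(records: list) -> Counter:
    """Count records by (requested tier, resolved tier)."""
    return Counter((r["ServiceTier"], r["ResolvedServiceTier"]) for r in records)


# Hypothetical request payload.
request = with_service_tier({"prompt": "Hello"}, "reserved")

# Hypothetical records, e.g. assembled from logs or metrics by your own tooling.
records = [
    {"ModelId": "example-model", "ServiceTier": "reserved", "ResolvedServiceTier": "reserved"},
    {"ModelId": "example-model", "ServiceTier": "reserved", "ResolvedServiceTier": "default"},
    {"ModelId": "example-model", "ServiceTier": "flex", "ResolvedServiceTier": "flex"},
]

pairs = count_tier_pairs(records)
changed = sum(n for (requested, resolved), n in pairs.items() if requested != resolved)

print(request)  # {'prompt': 'Hello', 'service_tier': 'reserved'}
print(changed)  # 1
```

A request that asked for the Reserved tier but resolved to another tier would be consistent with the overflow to the Standard tier described in the Reserved tier section.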

For more information about pricing, visit the pricing page.

Models and Regions supported by the Reserved service tier:

Provider Model Model ID Regions
Anthropic Claude Sonnet 4.6

global.anthropic.claude-sonnet-4-6

us.anthropic.claude-sonnet-4-6

eu.anthropic.claude-sonnet-4-6

ap-northeast-1
ap-northeast-2
ap-northeast-3
ap-southeast-1
ap-southeast-2
ap-south-1
ap-southeast-3
ap-south-2
ap-southeast-4
ca-central-1
eu-west-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-1
us-west-2
me-south-1
ap-southeast-7
af-south-1
me-central-1
ap-southeast-5
mx-central-1
il-central-1
ap-east-2
ca-west-1
Anthropic Claude Opus 4.6

global.anthropic.claude-opus-4-6-v1

us.anthropic.claude-opus-4-6-v1

eu.anthropic.claude-opus-4-6-v1

af-south-1
ap-east-2
ap-northeast-1
ap-northeast-2
ap-northeast-3
ap-south-1
ap-south-2
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-southeast-7
ca-central-1
ca-west-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
me-south-1
mx-central-1
sa-east-1
us-east-1
us-east-2
us-west-1
us-west-2
Anthropic Claude Sonnet 4.5

global.anthropic.claude-sonnet-4-5-20250929-v1:0

us.anthropic.claude-sonnet-4-5-20250929-v1:0

eu.anthropic.claude-sonnet-4-5-20250929-v1:0

us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0

ap-northeast-1
ap-northeast-2
ap-northeast-3
ap-southeast-1
ap-southeast-2
ap-south-1
ap-southeast-3
ap-south-2
ap-southeast-4
ca-central-1
eu-west-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-1
us-west-2
us-gov-west-1
Anthropic Claude Opus 4.5

global.anthropic.claude-opus-4-5-20251101-v1:0

us.anthropic.claude-opus-4-5-20251101-v1:0

eu.anthropic.claude-opus-4-5-20251101-v1:0

ap-northeast-1
ap-northeast-2
ap-northeast-3
ap-southeast-1
ap-southeast-2
ap-south-1
ap-southeast-3
ap-south-2
ap-southeast-4
ca-central-1
eu-west-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-1
us-west-2
Anthropic Claude Haiku 4.5

global.anthropic.claude-haiku-4-5-20251001-v1:0

us.anthropic.claude-haiku-4-5-20251001-v1:0

eu.anthropic.claude-haiku-4-5-20251001-v1:0

ap-northeast-1
ap-northeast-2
ap-northeast-3
ap-southeast-1
ap-southeast-2
ap-south-1
ap-southeast-3
ap-south-2
ap-southeast-4
ca-central-1
eu-west-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-1
us-west-2
Note

The Reserved tier does not support the 1M context length for Sonnet 4.5.

Models and Regions supported by the Priority and Flex service tiers:

Provider Model Model ID Regions
OpenAI gpt-oss-120b openai.gpt-oss-120b-1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
OpenAI gpt-oss-20b openai.gpt-oss-20b-1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
OpenAI GPT OSS Safeguard 20B openai.gpt-oss-safeguard-20b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
OpenAI GPT OSS Safeguard 120B openai.gpt-oss-safeguard-120b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Qwen Qwen3 235B A22B 2507 qwen.qwen3-235b-a22b-2507-v1:0 us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-2
Qwen Qwen3 Coder 480B A35B Instruct qwen.qwen3-coder-480b-a35b-v1:0 us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-north-1
eu-west-2
Qwen Qwen3-Coder-30B-A3B-Instruct qwen.qwen3-coder-30b-a3b-v1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
Qwen Qwen3 32B (dense) qwen.qwen3-32b-v1:0 us-east-1
us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-central-1
eu-north-1
eu-south-1
eu-west-1
eu-west-2
sa-east-1
Qwen Qwen3 Next 80B A3B qwen.qwen3-next-80b-a3b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Qwen Qwen3 VL 235B A22B qwen.qwen3-vl-235b-a22b ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
DeepSeek DeepSeek-V3.1 deepseek.v3-v1:0 us-east-2
us-west-2
ap-northeast-1
ap-south-1
ap-southeast-3
eu-north-1
eu-west-2
Amazon Nova Premier amazon.nova-premier-v1:0 us-east-1*
us-east-2*
us-west-2*
Amazon Nova Pro amazon.nova-pro-v1:0 us-east-1
us-east-2*
us-west-1*
us-west-2*
ap-east-2*
ap-northeast-1*
ap-northeast-2*
ap-south-1*
ap-southeast-1*
ap-southeast-2
ap-southeast-3
ap-southeast-4*
ap-southeast-5*
ap-southeast-7*
eu-central-1*
eu-north-1*
eu-south-1*
eu-south-2*
eu-west-1*
eu-west-2
eu-west-3*
il-central-1*
me-central-1
Amazon Nova 2 Lite amazon.nova-2-lite-v1:0 ap-east-2
ap-northeast-1
ap-northeast-2
ap-south-1
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-southeast-7
ca-central-1
ca-west-1
eu-central-1
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
us-east-1
us-east-2
us-west-1
us-west-2
Amazon Nova 2 Pro Preview amazon.nova-2-pro-preview-20251202-v1:0 ap-east-2
ap-northeast-1
ap-northeast-2
ap-south-1
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-southeast-7
ca-central-1
ca-west-1
eu-central-1
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
us-east-1
us-east-2
us-west-1
us-west-2
Amazon Nova Lite 2 Omni amazon.nova-2-lite-omni-v1 ap-east-2
ap-northeast-1
ap-northeast-2
ap-south-1
ap-southeast-1
ap-southeast-2
ap-southeast-3
ap-southeast-4
ap-southeast-5
ap-southeast-7
ca-central-1
ca-west-1
eu-central-1
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
il-central-1
me-central-1
us-east-1
us-east-2
us-west-1
us-west-2
Google Gemma 3 4B google.gemma-3-4b-it ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Google Gemma 3 12B google.gemma-3-12b-it ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Google Gemma 3 27B google.gemma-3-27b-it ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Minimax AI Minimax M2 minimax.minimax-m2 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Magistral Small 1.2 mistral.magistral-small-2509 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Voxtral Mini 1.0 mistral.voxtral-mini-3b-2507 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Voxtral Small 1.0 mistral.voxtral-small-24b-2507 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Ministral 3B 3.0 mistral.ministral-3-3b-instruct ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Ministral 8B 3.0 mistral.ministral-3-8b-instruct ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Ministral 14B 3.0 mistral.ministral-3-14b-instruct ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Mistral Mistral Large 3 mistral.mistral-large-3-675b-instruct ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Kimi AI Kimi K2 Thinking moonshot.kimi-k2-thinking ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Nvidia NVIDIA Nemotron Nano 2 nvidia.nemotron-nano-9b-v2 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2
Nvidia NVIDIA Nemotron Nano 2 VL nvidia.nemotron-nano-12b-v2 ap-northeast-1
ap-south-1
ap-southeast-2
ap-southeast-3
ca-central-1
eu-central-1
eu-central-2
eu-north-1
eu-south-1
eu-south-2
eu-west-1
eu-west-2
eu-west-3
sa-east-1
us-east-1
us-east-2
us-west-2

*Model inference may be served using multiple Regions.

To control access to service tiers, see Control access to service tiers.