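# Capacity, limits, and cost optimization
<a name="capacity-limits-cost-optimization"></a>

Amazon Bedrock offers flexible capacity options to match your workload requirements and budget. Understanding the differences between the on-demand tiers (Flex, Priority, Standard), the Reserved tier, batch processing, and cross-Region inference helps you optimize performance and cost.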

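# Service tiers for optimizing performance and cost
<a name="service-tiers-inference"></a>

Amazon Bedrock offers four service tiers for model inference: Reserved, Priority, Standard, and Flex. Service tiers let you optimize for availability, cost, and performance.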

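## Reserved tier
<a name="w2aac26b5b5"></a>

The Reserved tier lets you reserve prioritized compute capacity for mission-critical applications that cannot tolerate any downtime. You can allocate different input and output tokens-per-minute capacities to match the exact requirements of your workload and control cost. When your application needs more tokens-per-minute capacity than you reserved, the service automatically overflows to the Standard tier, so operation continues uninterrupted. The Reserved tier targets 99.5% uptime for model responses. You can reserve capacity for 1 month or 3 months. You pay a fixed price per 1K tokens-per-minute, billed monthly.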
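
The overflow behavior can be illustrated with a small calculation. The following is a minimal sketch only: the function name `split_tpm_demand` and the sample numbers are hypothetical, and modeling overflow as a per-minute subtraction is an assumption, not a description of how the service meters traffic.

```python
def split_tpm_demand(demand_tpm: int, reserved_tpm: int) -> tuple:
    """Split one minute of token demand into (reserved, overflow) portions.

    Assumption: everything above the reserved capacity overflows to the
    Standard tier, as the paragraph above describes.
    """
    if demand_tpm < 0 or reserved_tpm < 0:
        raise ValueError("token counts must be non-negative")
    served_by_reserved = min(demand_tpm, reserved_tpm)
    overflow_to_standard = demand_tpm - served_by_reserved
    return served_by_reserved, overflow_to_standard


if __name__ == "__main__":
    # Hypothetical numbers: 100K TPM reserved, 130K TPM requested in a peak minute.
    reserved, overflow = split_tpm_demand(130_000, 100_000)
    print(f"reserved: {reserved}, overflow to Standard: {overflow}")
```

A calculation like this can help estimate how much traffic would still be billed at on-demand rates when sizing a reservation.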

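To get access to the Reserved tier, contact your AWS account team.

**Note**  
Billing continues until you delete the Reserved tier reservation with the help of your AWS account manager.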

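## Priority tier
<a name="w2aac26b5b7"></a>

The Priority tier delivers the fastest response times at a price premium over standard on-demand pricing. It is best suited to mission-critical applications with customer-facing business workflows that do not warrant a 24/7 capacity reservation. The Priority tier requires no prior reservation. Set the optional `service_tier` parameter to `priority` to use it on a per-request basis. Priority tier requests take precedence over Standard and Flex tier requests.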

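## Standard tier
<a name="w2aac26b5b9"></a>

The Standard tier provides consistent performance for everyday AI tasks such as content generation, text analysis, and routine document processing. By default, all inference requests are routed to the Standard tier when the `service_tier` parameter is absent. You can also set the optional `service_tier` parameter to `default` to have your inference requests served by the Standard tier.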

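## Flex tier
<a name="w2aac26b5c11"></a>

For workloads that can tolerate longer processing times, the Flex tier offers cost-effective processing at a pricing discount. This helps you optimize cost for workloads such as model evaluations, content summarization, and agentic workflows. Set the optional `service_tier` parameter to `flex` to have your inference requests served by the Flex tier at the discounted price.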

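## Using the service tier feature
<a name="w2aac26b5c13"></a>

To use the service tier feature, set the optional `service_tier` parameter to `reserved`, `priority`, `default`, or `flex` when you call the Amazon Bedrock runtime API.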

```
"service_tier" : "reserved | priority | default | flex"
```
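
As a rough illustration of setting this parameter, the sketch below attaches `service_tier` to a request body and rejects values outside the four listed above. The helper name `with_service_tier` and the message layout are hypothetical, and placing the field at the top level of a JSON body is an assumption; where the parameter goes depends on the API operation you call.

```python
import json

# Tier values taken from the text above; "default" selects the Standard tier.
ALLOWED_TIERS = {"reserved", "priority", "default", "flex"}


def with_service_tier(body: dict, tier: str = "default") -> dict:
    """Return a copy of a request body with the optional service_tier field set."""
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"unsupported service tier: {tier!r}")
    tagged = dict(body)  # shallow copy so the caller's dict is not modified
    tagged["service_tier"] = tier
    return tagged


if __name__ == "__main__":
    # The message layout is a placeholder, not a documented request schema.
    request = with_service_tier(
        {"messages": [{"role": "user", "content": "Hello"}]}, "flex"
    )
    print(json.dumps(request, indent=2))
```

Validating the value on the client side surfaces a misspelled tier before the request is sent.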

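Your on-demand model quota is shared across the `priority`, `default`, and `flex` service tiers. Your `reserved` tier capacity reservation is separate from your on-demand quota. The service tier configuration of a served request is visible in the API response and in AWS CloudTrail events. You can also view service tier metrics in Amazon CloudWatch Metrics under ModelId, ServiceTier, and ResolvedServiceTier, where ResolvedServiceTier shows the actual tier that served your request.

For more information about pricing, see the [pricing page](https://aws.amazon.com/bedrock/pricing/).

Models and Regions supported for the Reserved service tier: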


| Provider | Model | Model IDs | Regions | 
| --- | --- | --- | --- |
| Anthropic | Claude Sonnet 4.6 | global.anthropic.claude-sonnet-4-6 us.anthropic.claude-sonnet-4-6 eu.anthropic.claude-sonnet-4-6 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 me-south-1 ap-southeast-7 af-south-1 me-central-1 ap-southeast-5 mx-central-1 il-central-1 ap-east-2 ca-west-1  | 
| Anthropic | Claude Opus 4.6 | global.anthropic.claude-opus-4-6-v1 us.anthropic.claude-opus-4-6-v1 eu.anthropic.claude-opus-4-6-v1 |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | 
| Anthropic | Claude Sonnet 4.5 | global.anthropic.claude-sonnet-4-5-20250929-v1:0 us.anthropic.claude-sonnet-4-5-20250929-v1:0 eu.anthropic.claude-sonnet-4-5-20250929-v1:0 us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2 us-gov-west-1  | 
| Anthropic | Claude Opus 4.5 | global.anthropic.claude-opus-4-5-20251101-v1:0 us.anthropic.claude-opus-4-5-20251101-v1:0 eu.anthropic.claude-opus-4-5-20251101-v1:0 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | 
| Anthropic | Claude Haiku 4.5 | global.anthropic.claude-haiku-4-5-20251001-v1:0 us.anthropic.claude-haiku-4-5-20251001-v1:0 eu.anthropic.claude-haiku-4-5-20251001-v1:0 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1 ap-southeast-3 ap-south-2 ap-southeast-4 ca-central-1 eu-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | 

**Note**  
The Reserved tier does not support the 1M context length for Sonnet 4.5.

Models and Regions supported for the Priority and Flex service tiers:


| Provider | Model | Model ID | Regions | 
| --- | --- | --- | --- |
| OpenAI | gpt-oss-120b | openai.gpt-oss-120b-1:0 |  us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1  | 
| OpenAI | gpt-oss-20b | openai.gpt-oss-20b-1:0 |  us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1  | 
| OpenAI | GPT OSS Safeguard | openai.gpt-oss-safeguard-20b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| OpenAI | GPT OSS Safeguard | openai.gpt-oss-safeguard-120b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Qwen | Qwen3 235B A22B 2507 | qwen.qwen3-235b-a22b-2507-v1:0 |  us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-2  | 
| Qwen | Qwen3 Coder 480B A35B Instruct | qwen.qwen3-coder-480b-a35b-v1:0 |  us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-north-1 eu-west-2  | 
| Qwen | Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-v1:0 |  us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1  | 
| Qwen | Qwen3 32B (dense) | qwen.qwen3-32b-v1:0 |  us-east-1 us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1  | 
| Qwen | Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Qwen | Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| DeepSeek | DeepSeek-V3.1 | deepseek.v3-v1:0 |  us-east-2 us-west-2 ap-northeast-1 ap-south-1 ap-southeast-3 eu-north-1 eu-west-2  | 
| Amazon | Nova Premier | amazon.nova-premier-v1:0 |  us-east-1\* us-east-2\* us-west-2\*  | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0 |  us-east-1 us-east-2\* us-west-1\* us-west-2\* ap-east-2\* ap-northeast-1\* ap-northeast-2\* ap-south-1\* ap-southeast-1\* ap-southeast-2 ap-southeast-3 ap-southeast-4\* ap-southeast-5\* ap-southeast-7\* eu-central-1\* eu-north-1\* eu-south-1\* eu-south-2\* eu-west-1\* eu-west-2 eu-west-3\* il-central-1\* me-central-1  | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | 
| Amazon | Nova 2 Pro Preview | amazon.nova-2-pro-preview-20251202-v1:0 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | 
| Amazon | Nova Lite 2 Omni | amazon.nova-2-1 lite-omni-v |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | 
| Google | Gemma 3 4B | google.gemma-3-4b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Google | Gemma 3 12B | google.gemma-3-12b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Google | Gemma 3 27B | google.gemma-3-27b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Minimax AI | Minimax M2 | minimax.minimax-m2 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Magistral Small 1.2 | mistral.magistral-small-2509 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Voxtral Mini 1.0 | mistral.voxtral-mini-3b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Voxtral Small 1.0 | mistral.voxtral-small-24b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Ministral 3B 3.0 | mistral.ministral-3-3b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Ministral 8B 3.0 | mistral.ministral-3-8b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Ministral 14B 3.0 | mistral.ministral-3-14b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Mistral | Mistral Large 3 | mistral.mistral-large-3-675b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Kimi AI | Kimi K2 Thinking | moonshot.kimi-k2-thinking |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Nvidia | NVIDIA Nemotron Nano 2 | nvidia.nemotron-nano-9b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 
| Nvidia | NVIDIA Nemotron Nano 2 VL | nvidia.nemotron-nano-12b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-2  | 

\* Model inference may be served from multiple Regions.

To control access to service tiers, see [Controlling access to service tiers](security_iam_id-based-policy-examples-agent.md#security_iam_id-based-policy-examples-service-tiers).

## Capacity options
<a name="capacity-options"></a>


| Capacity type | Use case | Key characteristics | 
| --- | --- | --- | 
| On-demand: Flex | Sporadic, low-volume workloads |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| On-demand: Standard | Regular production workloads |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| On-demand: Priority | High-priority, latency-sensitive applications |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| Reserved tier | Consistent, high-volume workloads |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| Batch | Large-scale, non-time-sensitive processing |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 
| Cross-Region inference | High availability, traffic spikes |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/capacity-limits-cost-optimization.html)  | 

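## Limits and quotas
<a name="limits-quotas"></a>

### On-demand limits (by tier)
<a name="on-demand-limits"></a>


| Tier | RPM range | TPM range | Throttling risk | 
| --- | --- | --- | --- | 
| Flex | 10-100 | 5K-50K | High | 
| Standard | 100-500 | 50K-150K | Medium | 
| Priority | 500-1000+ | 150K-300K+ | Low | 

+ Burst capacity: available for short peaks on all tiers
+ Soft limits: increases can be requested through Service Quotas
+ Model-specific: actual limits vary by foundation model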
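
To show how the ranges in this table might be applied, the sketch below picks the lowest on-demand tier whose upper bounds cover a workload's peak requests per minute (RPM) and tokens per minute (TPM). The function name `lowest_fitting_tier` is hypothetical, and treating the table's upper bounds as hard ceilings is an assumption: the limits are soft and vary by model, so this is only a rough first-pass guide.

```python
# Upper bounds copied from the table above as (tier, max RPM, max TPM).
# The table marks Priority as open-ended ("+"), so it has no ceiling here.
TIER_CEILINGS = [
    ("flex", 100, 50_000),
    ("standard", 500, 150_000),
]


def lowest_fitting_tier(peak_rpm: int, peak_tpm: int) -> str:
    """Return the first tier whose RPM and TPM ceilings both cover the peak."""
    for tier, max_rpm, max_tpm in TIER_CEILINGS:
        if peak_rpm <= max_rpm and peak_tpm <= max_tpm:
            return tier
    return "priority"


if __name__ == "__main__":
    # Hypothetical workload: 80 requests and 60K tokens in the busiest minute.
    print(lowest_fitting_tier(80, 60_000))
```

Both dimensions are checked because a workload with few but very large requests can exceed a TPM ceiling while staying well within the RPM range.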

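### Reserved tier limits
<a name="reserved-tier-limits"></a>
+ Minimum commitment: 1 model unit
+ Maximum units: account- and Region-specific
+ Input/output token limits: based on the units purchased
+ No RPM limit within the purchased capacity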

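### Batch processing limits
<a name="batch-processing-limits"></a>
+ Job size: up to 10,000 records per batch
+ File size: input files up to 200 MB
+ Processing time: 24-hour completion window
+ Concurrent jobs: Region-specific quota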
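
The record and file-size limits above can be checked locally before uploading an input file. The following is a minimal sketch under stated assumptions: the helper name `check_batch_file` is hypothetical, each non-empty line is counted as one record, and "200 MB" is read as 200 MiB. Check the current quotas for your account before relying on these constants.

```python
import os
import tempfile

MAX_RECORDS = 10_000              # "up to 10,000 records per batch"
MAX_FILE_BYTES = 200 * 1024 ** 2  # "input files up to 200 MB", read as MiB here


def check_batch_file(path: str) -> list:
    """Return a list of limit violations for a JSONL batch input file."""
    problems = []
    size = os.path.getsize(path)
    if size > MAX_FILE_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_FILE_BYTES}")
    with open(path, "r", encoding="utf-8") as handle:
        # Count non-empty lines; each one is treated as a single record.
        records = sum(1 for line in handle if line.strip())
    if records > MAX_RECORDS:
        problems.append(f"file has {records} records, limit is {MAX_RECORDS}")
    return problems


if __name__ == "__main__":
    # Write a tiny placeholder file; the record contents are illustrative only.
    with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as tmp:
        tmp.write('{"recordId": "1"}\n{"recordId": "2"}\n')
    print(check_batch_file(tmp.name))
    os.remove(tmp.name)
```

An empty list means the file passes both checks; otherwise each entry describes one limit that was exceeded.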

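### Cross-Region inference
<a name="cross-region-inference-limits"></a>
+ Inherits the on-demand tier limits of each Region
+ No additional quota overhead
+ Automatic routing (no manual limit management)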

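## Cost optimization
<a name="cost-optimization"></a>

### Decision framework
<a name="decision-framework"></a>


| Scenario | Recommended option | Why | 
| --- | --- | --- | 
| Development/testing | Flex | Lowest cost, acceptable for non-production environments | 
| Standard production | Standard | Best balance of price and performance | 
| Critical user-facing applications | Priority | Reliability and performance outweigh cost | 
| Steady high-volume load | Reserved tier | Commitment can save 30-50% | 
| Bulk data processing | Batch | 50% discount, non-urgent workloads | 
| Mission-critical uptime | Cross-Region inference | Availability > cost | 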

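### Optimization strategies
<a name="optimization-strategies"></a>

**Choose the right on-demand tier**
+ Start with Standard for most workloads
+ Downgrade to Flex for dev/test environments
+ Upgrade to Priority only when throttling affects users
+ Monitor CloudWatch throttling metrics to inform the decision

**Transition to the Reserved tier**
+ When sustained load exceeds 40% of on-demand cost
+ Calculate the break-even point: (monthly on-demand cost) vs. (reserved commitment)
+ Start with a 1-month term
+ The Reserved tier can be used alongside any on-demand tier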
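
The break-even comparison in the list above is a one-line subtraction. The sketch below spells it out; the function name `reserved_break_even` and the dollar amounts are hypothetical, and the model ignores overflow traffic that would still be billed on demand.

```python
def reserved_break_even(on_demand_monthly: float, reserved_monthly: float) -> dict:
    """Compare a month of on-demand spend with a fixed reserved commitment."""
    if on_demand_monthly <= 0 or reserved_monthly < 0:
        raise ValueError("on-demand cost must be positive, commitment non-negative")
    savings = on_demand_monthly - reserved_monthly
    return {
        "savings": savings,
        "savings_pct": round(100 * savings / on_demand_monthly, 1),
        "reserve": savings > 0,  # True when the commitment costs less
    }


if __name__ == "__main__":
    # Hypothetical figures: $10,000 on-demand spend vs. a $6,500 commitment.
    print(reserved_break_even(10_000, 6_500))
```

Running the comparison on a few recent months of billing data, rather than a single month, gives a steadier picture of whether the load is sustained enough to justify a commitment.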

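**Use Batch for**
+ Training data generation
+ Content moderation backlogs
+ Report generation
+ Data enrichment pipelines

**Combined approach**
+ Reserved tier for baseline traffic
+ Standard on-demand for moderate bursts
+ Priority on-demand for critical peak periods
+ Batch for offline processing
+ Cross-Region inference for failover only

**Cost monitoring**
+ Compare tier costs: Flex < Standard < Priority
+ Track tokens per request (optimize prompts)
+ Use CloudWatch metrics to measure utilization and throttling
+ Set billing alarms for unexpected spikes
+ Review Reserved tier utilization monthly
+ Evaluate tier upgrades only when throttling occurs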

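# Process multiple prompts with batch inference
<a name="batch-inference"></a>

With batch inference, you can submit multiple prompts and generate responses asynchronously. You can format the input data using either the `InvokeModel` or the `Converse` API format. Batch inference helps you process a large number of requests efficiently by sending a single request and generating the responses in an Amazon S3 bucket. After you define the model inputs in files that you create, you upload the files to an S3 bucket. You then submit a batch inference request and specify the S3 bucket. After the job completes, you can retrieve the output files from S3. You can use batch inference to improve the performance of model inference on large datasets.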
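
The submit step of this workflow can be sketched with the AWS SDK for Python. The example below builds the arguments for the `CreateModelInvocationJob` operation and passes them to a client supplied by the caller. The helper names, job name, role ARN, and bucket URIs are placeholders chosen for illustration; the input files are assumed to already be in the input bucket.

```python
def build_batch_job_request(job_name, role_arn, model_id, input_s3_uri, output_s3_uri):
    """Assemble keyword arguments for the CreateModelInvocationJob operation."""
    return {
        "jobName": job_name,
        "roleArn": role_arn,
        "modelId": model_id,
        "inputDataConfig": {"s3InputDataConfig": {"s3Uri": input_s3_uri}},
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    }


def submit_batch_job(bedrock_client, request):
    """Submit the job through a boto3 "bedrock" client and return the job ARN."""
    response = bedrock_client.create_model_invocation_job(**request)
    return response["jobArn"]


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    request = build_batch_job_request(
        job_name="example-batch-job",
        role_arn="arn:aws:iam::111122223333:role/ExampleBatchRole",
        model_id="anthropic.claude-3-haiku-20240307-v1:0",
        input_s3_uri="s3://example-input-bucket/prompts/",
        output_s3_uri="s3://example-output-bucket/results/",
    )
    print(request)
    # With AWS credentials configured, the job could then be submitted with:
    #   import boto3
    #   job_arn = submit_batch_job(boto3.client("bedrock"), request)
```

Keeping the request construction separate from the client call makes it possible to inspect or log the request before anything is sent.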

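**Note**  
Batch inference is not supported for provisioned models.

For general information about batch inference, see the following resources:
+ To see pricing for batch inference, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).
+ To see quotas for batch inference, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html) in the AWS General Reference.
+ To receive notifications when a batch inference job completes or changes status instead of polling, see [Monitor Amazon Bedrock job state changes using Amazon EventBridge](monitoring-eventbridge.md).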

**Topics**
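+ [Supported Regions and models for batch inference](batch-inference-supported.md)
+ [Prerequisites for batch inference](batch-inference-prereq.md)
+ [Create a batch inference job](batch-inference-create.md)
+ [Monitor batch inference jobs](batch-inference-monitor.md)
+ [Stop a batch inference job](batch-inference-stop.md)
+ [View the results of a batch inference job](batch-inference-results.md)
+ [Code example for batch inference](batch-inference-example.md)
+ [Submit a batch of prompts with the OpenAI Batch API](inference-openai-batch.md)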

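# Supported Regions and models for batch inference
<a name="batch-inference-supported"></a>

The following list provides links to general information about Region and model support in Amazon Bedrock:
+ For a list of Region codes and endpoints supported in Amazon Bedrock, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bedrock_region).
+ For a list of Amazon Bedrock model IDs to use when calling Amazon Bedrock API operations, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ For a list of Amazon Bedrock inference profile IDs to use when calling Amazon Bedrock API operations, see [Supported cross-Region inference profiles](inference-profiles-support.md#inference-profiles-support-system).

Batch inference can be used with different types of models. The following list describes support for the different types of Amazon Bedrock models:
+ **Single-Region model support** - Lists Regions that support sending inference requests to a foundation model in one AWS Region. For a full list of models available in Amazon Bedrock, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ **Cross-Region inference profile support** - Lists Regions that support using a cross-Region inference profile, which supports sending inference requests to a foundation model in multiple AWS Regions within a geographical area. An inference profile has a prefix before the model ID that indicates its geographical area (for example, `us.`, `apac.`). For more information about inference profiles available in Amazon Bedrock, see [Supported Regions and models for inference profiles](inference-profiles-support.md).
+ **Custom model support** - Lists Regions that support sending inference requests to a custom model. For more information about model customization, see [Customize your model to improve its performance for your use case](custom-models.md).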
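
Because an inference profile ID is a model ID with a geographic prefix, the two parts can be separated with a simple string split. The sketch below assumes the prefixes that appear on this page (`global`, `us`, `eu`, `apac`, `us-gov`); that set is not exhaustive, and the function name `split_inference_profile_id` is hypothetical.

```python
from typing import Optional, Tuple

# Prefixes that appear in this page's examples and tables; not an exhaustive list.
KNOWN_GEO_PREFIXES = ("global", "us-gov", "us", "eu", "apac")


def split_inference_profile_id(identifier: str) -> Tuple[Optional[str], str]:
    """Return (geo_prefix, model_id); geo_prefix is None for a plain model ID."""
    head, _, rest = identifier.partition(".")
    if head in KNOWN_GEO_PREFIXES and rest:
        return head, rest
    return None, identifier


if __name__ == "__main__":
    for example in (
        "us.anthropic.claude-sonnet-4-6",
        "anthropic.claude-3-haiku-20240307-v1:0",
    ):
        print(example, "->", split_inference_profile_id(example))
```

A plain model ID such as `anthropic.claude-3-haiku-20240307-v1:0` starts with a provider name rather than a geographic prefix, so it is returned unchanged.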

The following table summarizes support for batch inference:


| Provider | Model | Model ID | Single-Region model support | Cross-Region inference profile support | Custom model support | 
| --- | --- | --- | --- | --- | --- | 
| Amazon | Amazon Nova Multimodal Embeddings | amazon.nova-2-multimodal-embeddings-v1:0 |  us-east-1  |  | N/A | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0 | N/A |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0 |  me-central-1 us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0 |  us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-2  | N/A | 
| Amazon | Nova Premier | amazon.nova-premier-v1:0 | N/A |  us-east-1 us-east-2 us-west-2  | N/A | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0 |  ap-southeast-3 me-central-1 us-east-1 us-gov-west-1  |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Amazon | Titan Multimodal Embeddings G1 | amazon.titan-embed-image-v1 |  ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  |  |  us-east-1 us-west-2  | 
| Amazon | Titan Text Embeddings V2 | amazon.titan-embed-text-v2:0 |  ap-northeast-1 ap-northeast-2 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-2 sa-east-1 us-east-1 us-west-2  |  | N/A | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 ca-central-1 eu-central-1 eu-central-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | N/A | N/A | 
| Anthropic | Claude 3 Opus | anthropic.claude-3-opus-20240229-v1:0 |  us-west-2  |  us-east-1  | N/A | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0 |  ap-northeast-2 ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Anthropic | Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 |  us-west-2  |  us-east-1  | N/A | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 |  ap-northeast-1 ap-northeast-2 ap-southeast-1 eu-central-1 us-east-1 us-east-2 us-west-2  |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2 eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 |  us-west-2  |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 us-east-1 us-east-2 us-west-2  | N/A | 
| Anthropic | Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 eu-central-1 eu-north-1 eu-west-1 eu-west-3 us-east-1 us-east-2 us-west-2  | N/A | 
| Anthropic | Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Opus 4.5 | anthropic.claude-opus-4-5-20251101-v1:0 | N/A |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Opus 4.6 | anthropic.claude-opus-4-6-v1 | N/A |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | N/A |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1 me-central-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4.5 | anthropic.claude-sonnet-4-5-20250929-v1:0 | N/A |  af-south-1 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-gov-east-1 us-gov-west-1 us-west-1 us-west-2  | N/A | 
| Anthropic | Claude Sonnet 4.6 | anthropic.claude-sonnet-4-6 |  eu-west-2  |  af-south-1 ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5 ap-southeast-7 ca-central-1 ca-west-1 eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3 il-central-1 me-central-1 me-south-1 mx-central-1 sa-east-1 us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| DeepSeek | DeepSeek V3.2 | deepseek.v3.2 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| DeepSeek | DeepSeek-V3.1 | deepseek.v3-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 us-east-2 us-west-2  |  | N/A | 
| Google | Gemma 3 12B IT | google.gemma-3-12b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Google | Gemma 3 27B PT | google.gemma-3-27b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Google | Gemma 3 4B IT | google.gemma-3-4b-it |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Meta | Llama 3.1 405B Instruct | meta.llama3-1-405b-instruct-v1:0 |  us-west-2  |  | N/A | 
| Meta | Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0 |  us-west-2  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0 |  us-west-2  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0 |  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 1B Instruct | meta.llama3-2-1b-instruct-v1:0 |  |  eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 3B Instruct | meta.llama3-2-3b-instruct-v1:0 |  |  eu-central-1 eu-west-1 eu-west-3 us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0 |  |  us-east-1 us-west-2  | N/A | 
| Meta | Llama 3.3 70B Instruct | meta.llama3-3-70b-instruct-v1:0 |  us-east-2  |  us-east-1 us-east-2 us-west-2  | N/A | 
| Meta | Llama 4 Maverick 17B Instruct | meta.llama4-maverick-17b-instruct-v1:0 |  |  us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| Meta | Llama 4 Scout 17B Instruct | meta.llama4-scout-17b-instruct-v1:0 |  |  us-east-1 us-east-2 us-west-1 us-west-2  | N/A | 
| MiniMax | MiniMax M2 | minimax.minimax-M2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| MiniMax | MiniMax M2.1 | minimax.minimax-M2.1 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Devstral 2 123B | mistral.devstral-2-123b |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Magistral Small 2509 | mistral.magistral-small-2509 |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 14B 3.0 | mistral.ministral-3-14b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 3 8B | mistral.ministral-3-8b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Ministral 3B | mistral.ministral-3-3b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Large (24.07) | mistral.mistral-large-2407-v1:0 |  us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Large 3 | mistral.mistral-large-3-675b-instruct |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Mistral Small (24.02) | mistral.mistral-small-2402-v1:0 |  us-east-1  | N/A | N/A | 
| Mistral AI | Voxtral Mini 3B 2507 | mistral.voxtral-mini-3b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Mistral AI | Voxtral Small 24B 2507 | mistral.voxtral-small-24b-2507 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Moonshot AI | Kimi K2 Thinking | moonshot.kimi-k2-thinking |  ap-northeast-1 ap-south-1 ap-southeast-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Moonshot AI | Kimi K2.5 | moonshotai.kimi-k2.5 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | NVIDIA Nemotron Nano 12B v2 VL BF16 | nvidia.nemotron-nano-12b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | NVIDIA Nemotron Nano 9B v2 | nvidia.nemotron-nano-9b-v2 |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| NVIDIA | Nemotron Nano 3 30B | nvidia.nemotron-nano-3-30b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | GPT OSS Safeguard | openai.gpt-oss-safeguard-120b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | GPT OSS Safeguard | openai.gpt-oss-safeguard-20b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| OpenAI | gpt-oss-120b | openai.gpt-oss-120b-1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-gov-west-1 us-west-2  | N/A | N/A | 
| OpenAI | gpt-oss-20b | openai.gpt-oss-20b-1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-gov-west-1 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 235B A22B 2507 | qwen.qwen3-235b-a22b-2507-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-2 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 32B (dense) | qwen.qwen3-32b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 Coder 480B A35B Instruct | qwen.qwen3-coder-480b-a35b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 Coder Next | qwen.qwen3-coder-next |  ap-southeast-2 eu-west-2 us-east-1  | N/A | N/A | 
| Qwen | Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b |  ap-northeast-1 ap-south-1 ap-southeast-2 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Qwen | Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-v1:0 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Z.AI | GLM 4.7 | zai.glm-4.7 |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-north-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 
| Z.AI | GLM 4.7 Flash | zai.glm-4.7-Flash |  ap-northeast-1 ap-south-1 ap-southeast-2 ap-southeast-3 eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-2 sa-east-1 us-east-1 us-east-2 us-west-2  | N/A | N/A | 

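# Prerequisites for batch inference
<a name="batch-inference-prereq"></a>

To run batch inference, you must fulfill the following prerequisites:

1. Prepare your dataset and upload it to an Amazon S3 bucket.

1. Create an S3 bucket for your output data.

1. Set up batch inference permissions for the relevant IAM identities.

1. (Optional) Set up a VPC to protect the data in S3 while batch inference runs. If you don't need a VPC, skip this step.

To learn how to fulfill these prerequisites, see the following topics:

**Topics**
+ [Format and upload your batch inference data](batch-inference-data.md)
+ [Required permissions for batch inference](batch-inference-permissions.md)
+ [Protect batch inference jobs using a VPC](batch-vpc.md)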

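# Format and upload your batch inference data
<a name="batch-inference-data"></a>

You must add your batch inference data to an S3 location that you select or specify when you submit a model invocation job. The S3 location must contain the following items:
+ At least one JSONL file that defines the model inputs. A JSONL file contains rows of JSON objects, one object per line. The file name must end with the .jsonl extension, and each line must use the following format: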

  ```
  { "recordId" : "alphanumeric string", "modelInput" : {JSON body} }
  ...
  ```

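  Each line contains a JSON object with a `recordId` field and a `modelInput` field. The format of the `modelInput` JSON object depends on the model invocation type that you choose when you [create the batch inference job](batch-inference-create.md). If you use the `InvokeModel` type (the default), the format must match the `body` field for the model that you use in an `InvokeModel` request (see [Inference request parameters and response fields for foundation models](model-parameters.md)). If you use the `Converse` type, the format must match the request body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.
**Note**  
If you omit the `recordId` field, Amazon Bedrock adds it in the output.
The order of records in the output JSONL file isn't guaranteed to match the order of records in the input JSONL file.
You specify the model to use when you [create the batch inference job](batch-inference-create.md).
+ (If your input content contains Amazon S3 locations) Some models let you define input content as an S3 location. See [Example video input for Amazon Nova](#batch-inference-data-ex-s3).
**Warning**  
When you use S3 URIs in your prompts, all resources must be in the same S3 bucket and folder. The `InputDataConfig` parameter must specify the folder path that contains all linked resources (such as videos or images), not just a single `.jsonl` file. S3 paths are case-sensitive, so make sure that your URIs match the folder structure exactly.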

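Make sure that your inputs conform to the batch inference quotas. You can search for the following quotas at [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock):
+ **Minimum number of records per batch inference job** – The minimum number of records (JSON objects) across all JSONL files in a job.
+ **Records per input file per batch inference job** – The maximum number of records (JSON objects) in a single JSONL file in a job.
+ **Records per batch inference job** – The maximum number of records (JSON objects) across all JSONL files in a job.
+ **Batch inference input file size** – The maximum size of a single file in a job.
+ **Batch inference job size** – The maximum cumulative size of all input files.

To better understand how to set up your batch inference inputs, see the following examples:

## Example text input for Anthropic Claude 3 Haiku
<a name="batch-inference-data-ex-text"></a>

If you plan to run batch inference using the [Messages API](model-parameters-anthropic-claude-messages.md) format for the Anthropic Claude 3 Haiku model, you might provide a JSONL file that contains the following JSON object as one of its lines: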

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
```
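
The record above is pretty-printed for readability, but a JSONL file needs each record on a single line. The following is a minimal sketch of how such a file could be generated with the Python standard library. The helper names `build_claude_record` and `to_jsonl` and the sample prompts are hypothetical and only illustrate the format described above.

```python
import json


def build_claude_record(record_id, prompt, max_tokens=1024):
    """Build one batch record in the Anthropic Claude Messages format shown above."""
    return {
        "recordId": record_id,
        "modelInput": {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "messages": [
                {"role": "user", "content": [{"type": "text", "text": prompt}]}
            ],
        },
    }


def to_jsonl(records):
    """Serialize records as JSONL: one compact JSON object per line."""
    return "\n".join(json.dumps(record) for record in records) + "\n"


# Hypothetical prompts; replace them with your own data.
prompts = [
    "Summarize the following call transcript: ...",
    "Classify the sentiment of the following review: ...",
]
records = [
    build_claude_record(f"CALL{i:07d}", prompt)
    for i, prompt in enumerate(prompts, start=1)
]
jsonl_text = to_jsonl(records)
```

You could then write `jsonl_text` to a file whose name ends in `.jsonl` and upload that file to your input S3 location. A real job also needs at least the minimum number of records required by the quotas listed earlier.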

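## Example video input for Amazon Nova
<a name="batch-inference-data-ex-s3"></a>

If you plan to run batch inference on video inputs using the Amazon Nova Lite or Amazon Nova Pro models, you can either define the video in bytes in the JSONL file or provide an S3 location. For example, you might have an S3 bucket whose path is `s3://batch-inference-input-bucket` and that contains the following files: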

```
s3://batch-inference-input-bucket/
├── videos/
│   ├── video1.mp4
│   ├── video2.mp4
│   ├── ...
│   └── video50.mp4
└── input.jsonl
```

A sample record from the `input.jsonl` file looks like the following:

```
{
    "recordId": "RECORD01",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "You are an expert in recipe videos. Describe this video in less than 200 words following these guidelines: ..."
                    },
                    {
                        "video": {
                            "format": "mp4",
                            "source": {
                                "s3Location": {
                                    "uri": "s3://batch-inference-input-bucket/videos/video1.mp4",
                                    "bucketOwner": "111122223333"
                                }
                            }
                        }
                    }
                ]
            }
        ]
    }
}
```

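When you create the batch inference job, you must specify the folder path `s3://batch-inference-input-bucket` in the `InputDataConfig` parameter. Batch inference processes the `input.jsonl` file at this location, along with any referenced resources (such as the video files in the `videos` subfolder).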
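
Because every linked resource has to sit under the folder given in `InputDataConfig`, it can help to check the S3 URIs in a JSONL file before you submit a job. The following sketch is one hedged way to do that. The function names are hypothetical, and the check assumes that URIs appear under an `s3Location` object with a `uri` key, as in the Amazon Nova record above. It compares the bucket and folder prefix only and does not verify that the objects exist.

```python
import json


def find_s3_uris(node):
    """Recursively yield every "uri" string found under an "s3Location" key."""
    if isinstance(node, dict):
        location = node.get("s3Location")
        if isinstance(location, dict) and isinstance(location.get("uri"), str):
            yield location["uri"]
        for value in node.values():
            yield from find_s3_uris(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_s3_uris(item)


def uris_outside_folder(jsonl_text, input_folder_uri):
    """Return (recordId, uri) pairs whose URI is not under the input folder."""
    prefix = input_folder_uri.rstrip("/") + "/"
    problems = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        for uri in find_s3_uris(record.get("modelInput", {})):
            if not uri.startswith(prefix):  # plain string comparison, so case-sensitive
                problems.append((record.get("recordId"), uri))
    return problems


def video_record(record_id, uri):
    """Hypothetical helper that builds a trimmed-down record for this demo."""
    source = {"s3Location": {"uri": uri}}
    content = [{"video": {"format": "mp4", "source": source}}]
    return {
        "recordId": record_id,
        "modelInput": {"messages": [{"role": "user", "content": content}]},
    }


sample_jsonl = "\n".join(
    json.dumps(record)
    for record in [
        video_record("RECORD01", "s3://batch-inference-input-bucket/videos/video1.mp4"),
        video_record("RECORD02", "s3://other-bucket/videos/video2.mp4"),
    ]
)
problems = uris_outside_folder(sample_jsonl, "s3://batch-inference-input-bucket")
```

In this demo, `problems` flags only `RECORD02`, whose video lives in a different bucket than the input folder.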

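The following resources provide more information about submitting video inputs for batch inference:
+ To learn how to validate Amazon S3 URIs in an input request, see the [Amazon S3 URI parsing blog post](https://aws.amazon.com/blogs/devops/s3-uri-parsing-is-now-available-in-aws-sdk-for-java-2-x/).
+ For more information about how to set up invocation records for video understanding with Nova, see the [Amazon Nova vision prompting guidelines](https://docs.aws.amazon.com/nova/latest/userguide/prompting-vision-prompting.html).

## Example Converse input
<a name="batch-inference-data-ex-converse"></a>

If you set the model invocation type to `Converse` when you create the batch inference job, the `modelInput` field must use the Converse API request format. The following example shows a JSONL record for a Converse batch inference job: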

```
{
    "recordId": "CALL0000001",
    "modelInput": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "text": "Summarize the following call transcript: ..."
                    }
                ]
            }
        ],
        "inferenceConfig": {
            "maxTokens": 1024
        }
    }
}
```

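For the full list of fields supported in the Converse request body, see [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) in the API reference.

The following topic describes how to set up S3 access and batch inference permissions for an identity so that it can carry out batch inference.

# Required permissions for batch inference
<a name="batch-inference-permissions"></a>

To carry out batch inference, you must set up permissions for the following IAM identities:
+ The IAM identity that creates and manages batch inference jobs.
+ The batch inference [service role](security-iam-sr.md) that Amazon Bedrock assumes to perform actions on your behalf.

To learn how to set up permissions for each identity, see the following topics:

**Topics**
+ [Required permissions for an IAM identity to submit and manage batch inference jobs](#batch-inference-permissions-user)
+ [Required permissions for a service role to carry out batch inference](#batch-inference-permissions-service)

## Required permissions for an IAM identity to submit and manage batch inference jobs
<a name="batch-inference-permissions-user"></a>

For an IAM identity to use this feature, you must configure it with the necessary permissions. To do so, do one of the following:
+ To allow an identity to carry out all Amazon Bedrock actions, attach the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) policy to the identity. If you do this, you can skip this topic. This option is less secure.
+ As a security best practice, grant an identity only the permissions for the actions that it needs. This topic describes the permissions required for this feature.

To restrict permissions to only the actions used for batch inference, attach the following identity-based policy to the IAM identity: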

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BatchInference",
            "Effect": "Allow",
            "Action": [  
                "bedrock:ListFoundationModels",
                "bedrock:GetFoundationModel",
                "bedrock:ListInferenceProfiles",
                "bedrock:GetInferenceProfile",
                "bedrock:ListCustomModels",
                "bedrock:GetCustomModel",
                "bedrock:TagResource", 
                "bedrock:UntagResource", 
                "bedrock:ListTagsForResource",
                "bedrock:CreateModelInvocationJob",
                "bedrock:GetModelInvocationJob",
                "bedrock:ListModelInvocationJobs",
                "bedrock:StopModelInvocationJob"
            ],
            "Resource": "*"
        }
    ]   
}
```

------

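To restrict permissions further, you can omit actions, or you can specify resources and condition keys by which to filter permissions. For more information about actions, resources, and condition keys, see the following topics in the *Service Authorization Reference*:
+ [Actions defined by Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) – Learn about actions, the resource types that you can scope them to in the `Resource` field, and the condition keys that you can filter permissions on in the `Condition` field.
+ [Resource types defined by Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) – Learn about the resource types in Amazon Bedrock.
+ [Condition keys for Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-policy-keys) – Learn about the condition keys in Amazon Bedrock.

The following example policy scopes down the permissions for batch inference. It allows a user in the account with ID `123456789012` to create batch inference jobs only in the `us-west-2` Region and only with the Anthropic Claude 3 Haiku model: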

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateBatchInferenceJob",
            "Effect": "Allow",
            "Action": [
                "bedrock:CreateModelInvocationJob"
            ],
            "Resource": [
                "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
                "arn:aws:bedrock:us-west-2:123456789012:model-invocation-job/*"
            ]
        }
    ]
}
```

------
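
If you maintain scoped-down policies for several Regions, accounts, or models, generating the document from parameters can reduce copy-and-paste mistakes in the ARNs. The following is a small sketch under that assumption. The function name `scoped_batch_policy` is hypothetical, and the ARN layout simply mirrors the example policy above.

```python
import json


def scoped_batch_policy(region, account_id, model_id):
    """Return a policy that allows creating batch jobs for one model in one Region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CreateBatchInferenceJob",
                "Effect": "Allow",
                "Action": ["bedrock:CreateModelInvocationJob"],
                "Resource": [
                    # Foundation model ARNs have an empty account field.
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}",
                    f"arn:aws:bedrock:{region}:{account_id}:model-invocation-job/*",
                ],
            }
        ],
    }


policy = scoped_batch_policy(
    "us-west-2", "123456789012", "anthropic.claude-3-haiku-20240307-v1:0"
)
policy_json = json.dumps(policy, indent=4)
```

The resulting `policy_json` string can be pasted into the IAM console or passed to whatever tooling you use to attach policies.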

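## Required permissions for a service role to carry out batch inference
<a name="batch-inference-permissions-service"></a>

Batch inference is carried out by a [service role](security-iam-sr.md) that assumes your identity to perform actions on your behalf. You can create a service role in the following ways:
+ Use the AWS Management Console and let Amazon Bedrock automatically create a service role with the necessary permissions for you. You can select this option when you create a batch inference job.
+ Use AWS Identity and Access Management to create a custom service role for Amazon Bedrock and attach the necessary permissions. You then specify this role when you submit the batch inference job. For more information about creating a custom service role for batch inference, see [Create a custom service role for batch inference](batch-iam-sr.md). For more general information about creating service roles, see [Creating a role to delegate permissions to an AWS service](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-service.html) in the IAM User Guide.

**Important**  
If the S3 bucket to which you [uploaded your data for batch inference](batch-inference-data.md) is in a different AWS account, you must configure an S3 bucket policy that allows the service role to access the data. You must configure this policy manually even if you use the console to create the service role automatically. To learn how to configure an S3 bucket policy for Amazon Bedrock resources, see [Attach a bucket policy to an Amazon S3 bucket to allow another account to access it](s3-bucket-access.md#s3-bucket-access-cross-account).
Foundation models in Amazon Bedrock are AWS-managed resources and can't be used with IAM policy conditions that require customer ownership. These models are owned and operated by AWS and can't be owned by individual customers. Any IAM policy condition that checks for customer-owned resources (such as conditions that use resource tags, organization IDs, or other ownership attributes) fails when applied to foundation models, which can block legitimate access to these services.  
For example, if your policy includes an `aws:ResourceOrgID` condition like this:  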

  ```
  {
    "Condition": {
      "StringEqualsIgnoreCase": {
        "aws:ResourceOrgID": ["o-xxxxxxxx"]
      }
    }
  }
  ```
Your batch inference job fails with an `AccessDeniedException`. Remove the `aws:ResourceOrgID` condition, or create a separate policy statement for foundation models.
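
To spot this problem before a job fails, you could scan a policy document for statements that combine an `aws:ResourceOrgID` condition with foundation model resources. The following sketch is a rough heuristic, not an IAM policy evaluator. The function name is hypothetical, and the check treats `"*"` and any ARN that contains `:foundation-model/` as touching foundation models. Other wildcard ARNs are not detected.

```python
def statements_blocking_foundation_models(policy):
    """Return the Sids of statements that apply aws:ResourceOrgID to foundation models."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]

    flagged = []
    for statement in statements:
        resources = statement.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        touches_models = any(
            resource == "*" or ":foundation-model/" in resource
            for resource in resources
        )
        # Condition keys are case-insensitive, so compare them in lowercase.
        condition_keys = {
            key.lower()
            for operator_block in statement.get("Condition", {}).values()
            for key in operator_block
        }
        if touches_models and "aws:resourceorgid" in condition_keys:
            flagged.append(statement.get("Sid", "<no Sid>"))
    return flagged


# Hypothetical policy used only to exercise the check.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ModelsWithOrgCondition",
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": "*",
            "Condition": {"StringEqualsIgnoreCase": {"aws:ResourceOrgID": ["o-xxxxxxxx"]}},
        },
        {
            "Sid": "JobsOnly",
            "Effect": "Allow",
            "Action": ["bedrock:CreateModelInvocationJob"],
            "Resource": ["arn:aws:bedrock:us-west-2:123456789012:model-invocation-job/*"],
            "Condition": {"StringEquals": {"aws:ResourceOrgID": ["o-xxxxxxxx"]}},
        },
    ],
}
flagged_sids = statements_blocking_foundation_models(example_policy)
```

Only the first statement is flagged here, because the second one is scoped to model invocation jobs, which are customer-owned resources.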

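# Protect batch inference jobs using a VPC
<a name="batch-vpc"></a>

When you run a batch inference job, the job accesses your Amazon S3 bucket to download the input data and to write the output data. To control access to your data, we recommend that you use a virtual private cloud (VPC) created with [Amazon VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html). You can further protect your data by configuring your VPC so that your data isn't available over the internet and by creating a VPC interface endpoint with [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/what-is-privatelink.html) to establish a private connection to your data. For more information about how Amazon VPC and AWS PrivateLink integrate with Amazon Bedrock, see [Protect your data using Amazon VPC and AWS PrivateLink](usingVPC.md).

Carry out the following steps to configure and use a VPC to protect the input prompts and output model responses for your batch inference jobs.

**Topics**
+ [Set up a VPC to protect your data during batch inference](#batch-vpc-setup)
+ [Attach VPC permissions to a batch inference role](#batch-vpc-role)
+ [Add the VPC configuration when submitting a batch inference job](#batch-vpc-config)

## Set up a VPC to protect your data during batch inference
<a name="batch-vpc-setup"></a>

To set up a VPC, follow the steps at [Set up a VPC](usingVPC.md#create-vpc). You can further secure your VPC by setting up an S3 VPC endpoint and using resource-based IAM policies to restrict access to the S3 bucket that contains your batch inference data. To do so, follow the steps at [(Example) Restrict data access to your Amazon S3 data using VPC](vpc-s3.md).

## Attach VPC permissions to a batch inference role
<a name="batch-vpc-role"></a>

After you finish setting up your VPC, attach the following permissions to your [batch inference service role](batch-iam-sr.md) to allow it to access the VPC. Modify this policy to allow access to only the VPC resources that your job needs. Replace the `${{subnet-id}}` and `${{security-group-id}}` placeholders with the values from your VPC.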

------
#### [ JSON ]


```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeVpcs",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "2",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
                "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}",
                "arn:aws:ec2:us-east-1:123456789012:security-group/${{security-group-id}}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/BedrockManaged": [
                        "true"
                    ]
                },
                "ArnEquals": {
                    "aws:RequestTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "3",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteNetworkInterfacePermission"
            ],
            "Resource": [
                "*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:Subnet": [
                        "arn:aws:ec2:us-east-1:123456789012:subnet/${{subnet-id}}"
                    ]
                },
                "ArnEquals": {
                    "ec2:ResourceTag/BedrockModelInvocationJobArn": [
                        "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                    ]
                }
            }
        },
        {
            "Sid": "4",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:us-east-1:123456789012:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": [
                        "CreateNetworkInterface"
                    ]
                },
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "BedrockManaged",
                        "BedrockModelInvocationJobArn"
                    ]
                }
            }
        }
    ]
}
```

------

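## Add the VPC configuration when submitting a batch inference job
<a name="batch-vpc-config"></a>

After you configure the VPC and the required roles and permissions as described in the previous sections, you can create a batch inference job that uses this VPC.

**Note**  
Currently, you can use a VPC only through the API when you create a batch inference job.

When you specify the VPC subnets and security groups for a job, Amazon Bedrock creates *elastic network interfaces* (ENIs) that are associated with your security groups in one of the subnets. ENIs allow the Amazon Bedrock job to connect to resources in your VPC. For information about ENIs, see [Elastic network interfaces](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_ElasticNetworkInterfaces.html) in the *Amazon VPC User Guide*. Amazon Bedrock tags the ENIs that it creates with the `BedrockManaged` and `BedrockModelInvocationJobArn` tags.

We recommend that you provide at least one subnet in each Availability Zone.

You can use security groups to establish rules that control Amazon Bedrock access to your VPC resources.

When you submit the request, you can include `VpcConfig` as a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request parameter to specify the VPC subnets and security groups to use, as in the following example.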

```
"vpcConfig": { 
    "securityGroupIds": [
        "sg-0123456789abcdef0"
    ],
    "subnets": [
        "subnet-0123456789abcdef0",
        "subnet-0123456789abcdef1",
        "subnet-0123456789abcdef2"
    ]
}
```

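# Create a batch inference job
<a name="batch-inference-create"></a>

After you set up an Amazon S3 bucket with the files for running model inference, you can create a batch inference job. Before you begin, check that you set up the files in accordance with the instructions in [Format and upload your batch inference data](batch-inference-data.md).

**Note**  
To submit a batch inference job that uses a VPC, you must use the API. Select the API tab to learn how to include the VPC configuration.

To learn how to create a batch inference job, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

**To create a batch inference job**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Batch inference**.

1. In the **Batch inference jobs** section, choose **Create job**.

1. In the **Job details** section, give the batch inference job a **Job name**, and then choose **Select model** to select the model to use for the job.

1. In the **Model invocation type** section, select the API format for your input data. Choose **InvokeModel** if your input data uses a model-specific request format, or choose **Converse** if your input data uses the Converse API format. The default is **InvokeModel**.

1. In the **Input data** section, choose **Browse S3** and select an S3 location for your batch inference job. Batch inference processes all JSONL files and accompanying content files at that S3 location, whether the location is an S3 folder or a single JSONL file.
**Note**  
If the input data is in an S3 bucket that belongs to a different account from the one you submit the job from, you must use the API to submit the batch inference job. To learn how, select the API tab.

1. In the **Output data** section, choose **Browse S3** and select an S3 location to store the output files from your batch inference job. By default, the output data is encrypted with an AWS managed key. To choose a custom KMS key, select **Customize encryption settings (advanced)** and choose a key. For more information about encryption of Amazon Bedrock resources and setting up a custom KMS key, see [Data encryption](data-encryption.md).
**Note**  
If you plan to write the output data to an S3 bucket that belongs to a different account from the one you submit the job from, you must use the API to submit the batch inference job. To learn how, select the API tab.

1. In the **Service access** section, select one of the following options:
   + **Use an existing service role** – Select a service role from the drop-down list. For more information about setting up a custom role with the appropriate permissions, see [Required permissions for batch inference](batch-inference-permissions.md).
   + **Create and use a new service role** – Enter a name for the service role.

1. (Optional) To associate tags with the batch inference job, expand the **Tags** section and add a key and an optional value for each tag. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

1. Choose **Create batch inference job**.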

------
#### [ API ]

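To create a batch inference job, send a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).

The following fields are required:

| Field | Use case | 
| --- | --- | 
| jobName | To specify a name for the job. | 
| roleArn | To specify the Amazon Resource Name (ARN) of the service role with permissions to create and manage the job. For more information, see [Create a custom service role for batch inference](batch-iam-sr.md). | 
| modelId | To specify the ID or ARN of the model to use in inference. | 
| inputDataConfig | To specify the S3 location that contains the input data. Batch inference processes all JSONL files and accompanying content files at that S3 location, whether the location is an S3 folder or a single JSONL file. For more information, see [Format and upload your batch inference data](batch-inference-data.md). | 
| outputDataConfig | To specify the S3 location to write the model responses to. | 

The following fields are optional:

| Field | Use case | 
| --- | --- | 
| modelInvocationType | To specify the API format of the input data. Set it to Converse to use the Converse API format, or to InvokeModel (the default) to use a model-specific request format. For more information about the Converse request format, see [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html). | 
| timeoutDurationInHours | To specify the duration in hours after which the job times out. | 
| tags | To specify any tags to associate with the job. For more information, see [Tagging Amazon Bedrock resources](tagging.md). | 
| vpcConfig | To specify the VPC configuration to use to protect your data during the job. For more information, see [Protect batch inference jobs using a VPC](batch-vpc.md). | 
| clientRequestToken | To ensure that the API request completes only once. For more information, see [Ensuring idempotency](https://docs.aws.amazon.com/ec2/latest/devguide/ec2-api-idempotency.html). | 

The response returns a `jobArn` that you can use to refer to the job when carrying out other batch inference-related API calls.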

------
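
The required and optional fields in the tables above can be assembled into a single request. The following sketch is one way to do that in Python, assuming that boto3 accepts these fields as keyword arguments of `create_model_invocation_job` (older boto3 versions might not know `modelInvocationType`). The helper names `build_job_request` and `submit` are hypothetical. The `vpc_config` argument is passed through unchanged, so supply it in the shape that the API reference specifies.

```python
import uuid


def build_job_request(job_name, role_arn, model_id, input_s3_uri, output_s3_uri,
                      invocation_type=None, timeout_hours=None, tags=None,
                      vpc_config=None, client_request_token=None):
    """Assemble keyword arguments for a CreateModelInvocationJob request."""
    request = {
        "jobName": job_name,
        "roleArn": role_arn,
        "modelId": model_id,
        "inputDataConfig": {"s3InputDataConfig": {"s3Uri": input_s3_uri}},
        "outputDataConfig": {"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
        # Reuse the same token when retrying so the job is created only once.
        "clientRequestToken": client_request_token or str(uuid.uuid4()),
    }
    if invocation_type is not None:
        request["modelInvocationType"] = invocation_type
    if timeout_hours is not None:
        request["timeoutDurationInHours"] = timeout_hours
    if tags:
        request["tags"] = [{"key": key, "value": value} for key, value in tags.items()]
    if vpc_config is not None:
        request["vpcConfig"] = vpc_config
    return request


def submit(request):
    """Send the request with boto3 and return the job ARN (needs AWS credentials)."""
    import boto3  # imported lazily so that building a request has no dependencies

    bedrock = boto3.client(service_name="bedrock")
    return bedrock.create_model_invocation_job(**request)["jobArn"]


request = build_job_request(
    job_name="my-batch-job",
    role_arn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    input_s3_uri="s3://amzn-s3-demo-bucket-input/abc.jsonl",
    output_s3_uri="s3://amzn-s3-demo-bucket-output/",
    invocation_type="Converse",
    timeout_hours=24,
    tags={"project": "demo"},
)
```

Calling `submit(request)` would create the job. Keeping the request in a variable lets you retry with the same `clientRequestToken` if the first attempt times out.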

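# Monitor batch inference jobs
<a name="batch-inference-monitor"></a>

Apart from the configurations that you set for a batch inference job, you can also monitor its progress by viewing its status. For more information about the possible statuses for a job, see the `status` field in [ModelInvocationJobSummary](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ModelInvocationJobSummary.html).

You can also track the progress of a job by comparing the total number of records with the number of records that have already been processed. You can find these numbers in the `manifest.json.out` file in the Amazon S3 bucket that contains the output files. For more information, see [View the results of a batch inference job](batch-inference-results.md). To learn how to download an S3 object, see [Downloading objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html).

**Tip**  
Instead of polling for the job status, you can use Amazon EventBridge to receive notifications automatically when a batch inference job completes or changes state. For more information, see [Monitor Amazon Bedrock job state changes using Amazon EventBridge](monitoring-eventbridge.md).

To learn how to view details about batch inference jobs, choose the tab for your preferred method, and then follow the steps: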

------
#### [ Console ]

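**To view information about batch inference jobs**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Batch inference**.

1. In the **Batch inference jobs** section, choose a job.

1. On the job details page, you can view information about the job's configuration and monitor its progress by viewing its **Status**.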

------
#### [ API ]

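To get information about a batch inference job, send a [GetModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetModelInvocationJob.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and provide the ID or ARN of the job in the `jobIdentifier` field.

To list information about multiple batch inference jobs, send a [ListModelInvocationJobs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListModelInvocationJobs.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). You can specify the following optional parameters:

| Field | Short description | 
| --- | --- | 
| maxResults | The maximum number of results to return in a response. | 
| nextToken | If there are more results than the number that you specified in the maxResults field, the response returns a nextToken value. To see the next batch of results, send the nextToken value in another request. | 

The responses of `GetModelInvocationJob` and `ListModelInvocationJobs` include a `modelInvocationType` field that indicates whether the job uses the `InvokeModel` or the `Converse` API format.

To list all the tags for a job, send a [ListTagsForResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListTagsForResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and include the Amazon Resource Name (ARN) of the job.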

------
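
If you choose to poll rather than use EventBridge, the calls above can be wrapped in two small helpers. The following sketch takes the client as an argument, so it works with `boto3.client("bedrock")` or with a test double. The helper names are hypothetical. The set of terminal statuses and the `invocationJobSummaries` response key are assumptions to confirm against the API reference.

```python
import time

# Assumed terminal statuses; confirm them in the ModelInvocationJobSummary reference.
TERMINAL_STATUSES = {"Completed", "PartiallyCompleted", "Failed", "Stopped", "Expired"}


def wait_for_job(bedrock, job_arn, poll_seconds=60, max_polls=1440, sleep=time.sleep):
    """Poll GetModelInvocationJob until the job reaches a terminal status."""
    for _ in range(max_polls):
        status = bedrock.get_model_invocation_job(jobIdentifier=job_arn)["status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_arn} did not finish after {max_polls} polls")


def list_all_jobs(bedrock, **filters):
    """Follow nextToken until ListModelInvocationJobs returns no more pages."""
    jobs, token = [], None
    while True:
        kwargs = dict(filters)
        if token:
            kwargs["nextToken"] = token
        page = bedrock.list_model_invocation_jobs(**kwargs)
        jobs.extend(page.get("invocationJobSummaries", []))
        token = page.get("nextToken")
        if not token:
            return jobs
```

For example, `wait_for_job(boto3.client("bedrock"), jobArn)` would block until the job finishes, and `list_all_jobs(client, statusEquals="Failed")` would collect the failed jobs across all pages.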

# Stop a batch inference job
<a name="batch-inference-stop"></a>

To learn how to stop a batch inference job, choose the tab for your preferred method, and then follow the steps:

------
#### [ Console ]

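**To stop a batch inference job**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. From the left navigation pane, select **Batch inference**.

1. Select a job to go to the job details page, or select the option button next to a job.

1. Choose **Stop job**.

1. Review the message, and then choose **Stop job** to confirm.
**Note**  
You're charged for tokens that have already been processed.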

------
#### [ API ]

To stop a batch inference job, send a [StopModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_StopModelInvocationJob.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and provide the ID or ARN of the job in the `jobIdentifier` field.

If the job was successfully stopped, you receive an HTTP 200 response.

------

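# View the results of a batch inference job
<a name="batch-inference-results"></a>

After the status of a batch inference job changes to `Completed`, you can extract the results of the job from the files in the Amazon S3 bucket that you specified when you created the job. To learn how to download an S3 object, see [Downloading objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html). The S3 bucket contains the following files:

1. Amazon Bedrock generates one output JSONL file for each input JSONL file. The output files contain the model's output for each input in the following format. An `error` object replaces the `modelOutput` field in any line where inference produced an error. The format of the `modelOutput` JSON object depends on the model invocation type. For `InvokeModel` jobs, the format matches the `body` field in the `InvokeModel` response (see [Inference request parameters and response fields for foundation models](model-parameters.md)). For `Converse` jobs, the format matches the response body of the [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API.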

   ```
   { "recordId" : "string", "modelInput": {JSON body}, "modelOutput": {JSON body} }
   ```

   The following example shows a possible output file.

   ```
   { "recordId" : "3223593EFGH", "modelInput" : {"inputText": "Roses are red, violets are"}, "modelOutput" : {"inputTextTokenCount": 8, "results": [{"tokenCount": 3, "outputText": "blue\n", "completionReason": "FINISH"}]}}
   { "recordId" : "1223213ABCD", "modelInput" : {"inputText": "Hello world"}, "error" : {"errorCode" : 400, "errorMessage" : "bad request" }}
   ```

1. A `manifest.json.out` file that contains a summary of the batch inference job.

   ```
   {
       "totalRecordCount" : number, 
       "processedRecordCount" : number,
       "successRecordCount": number,
       "errorRecordCount": number,
       "inputTokenCount": number,
       "outputTokenCount" : number
   }
   ```

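   The fields are described below:
   + totalRecordCount – The total number of records submitted to the batch inference job.
   + processedRecordCount – The number of records processed in the batch inference job.
   + successRecordCount – The number of records that the batch inference job processed successfully.
   + errorRecordCount – The number of records in the batch inference job that caused errors.
   + inputTokenCount – The total number of input tokens submitted to the batch inference job.
   + outputTokenCount – The total number of output tokens generated by the batch inference job.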
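
After you download the output files, you can separate successful records from failed ones and read the job's progress from the manifest. The following sketch uses only the Python standard library. The function names are hypothetical, the sample lines are the ones from the example output file above, and the manifest values are made up for illustration.

```python
import json


def split_results(output_jsonl_text):
    """Split an output JSONL file into successes and failures, keyed by recordId."""
    successes, failures = {}, {}
    for line in output_jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if "error" in record:  # an error object replaces modelOutput on failed lines
            failures[record["recordId"]] = record["error"]
        else:
            successes[record["recordId"]] = record["modelOutput"]
    return successes, failures


def processed_fraction(manifest):
    """Return the share of records processed so far, based on manifest.json.out."""
    total = manifest["totalRecordCount"]
    return manifest["processedRecordCount"] / total if total else 0.0


sample_output = "\n".join([
    '{ "recordId" : "3223593EFGH", "modelInput" : {"inputText": "Roses are red, violets are"}, '
    '"modelOutput" : {"inputTextTokenCount": 8, "results": [{"tokenCount": 3, '
    '"outputText": "blue\\n", "completionReason": "FINISH"}]}}',
    '{ "recordId" : "1223213ABCD", "modelInput" : {"inputText": "Hello world"}, '
    '"error" : {"errorCode" : 400, "errorMessage" : "bad request" }}',
])
successes, failures = split_results(sample_output)

# Hypothetical manifest values for illustration only.
sample_manifest = {"totalRecordCount": 200, "processedRecordCount": 150}
fraction = processed_fraction(sample_manifest)
```

Keying the results by `recordId` matters because the order of the output records isn't guaranteed to match the order of the input records.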

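# Code example for batch inference
<a name="batch-inference-example"></a>

The code examples in this chapter show how to create a batch inference job, view information about it, and stop it. The examples use the `InvokeModel` API format. For information about using the `Converse` API format, see [Format and upload your batch inference data](batch-inference-data.md).

Select a language to see a code example for it:

------
#### [ Python ]

Create a JSONL file named *abc.jsonl* and add a JSON object for each record. The file must contain at least the minimum number of records (see **Minimum number of records per batch inference job** for your model in [Quotas for Amazon Bedrock](quotas.md)). In this example, you use the Anthropic Claude 3 Haiku model. The following example shows the first input JSON in the file: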

```
{
    "recordId": "CALL0000001", 
    "modelInput": {
        "anthropic_version": "bedrock-2023-05-31", 
        "max_tokens": 1024,
        "messages": [ 
            { 
                "role": "user", 
                "content": [
                    {
                        "type": "text", 
                        "text": "Summarize the following call transcript: ..." 
                    } 
                ]
            }
        ]
    }
}
... 
# Add records until you hit the minimum
```

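Create an S3 bucket named *amzn-s3-demo-bucket-input* and upload the file to it. Then create an S3 bucket named *amzn-s3-demo-bucket-output* to write the output files to. Run the following code snippet to submit the job and get the *jobArn* from the response: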

```
import boto3

bedrock = boto3.client(service_name="bedrock")

inputDataConfig=({
    "s3InputDataConfig": {
        "s3Uri": "s3://amzn-s3-demo-bucket-input/abc.jsonl"
    }
})

outputDataConfig=({
    "s3OutputDataConfig": {
        "s3Uri": "s3://amzn-s3-demo-bucket-output/"
    }
})

response=bedrock.create_model_invocation_job(
    roleArn="arn:aws:iam::123456789012:role/MyBatchInferenceRole",
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    jobName="my-batch-job",
    inputDataConfig=inputDataConfig,
    outputDataConfig=outputDataConfig
)

jobArn = response.get('jobArn')
```

Return the `status` of the job.

```
bedrock.get_model_invocation_job(jobIdentifier=jobArn)['status']
```

List the batch inference jobs that have a status of *Failed*.

```
bedrock.list_model_invocation_jobs(
    maxResults=10,
    statusEquals="Failed",
    sortOrder="Descending"
)
```

Stop the job that you started.

```
bedrock.stop_model_invocation_job(jobIdentifier=jobArn)
```

------

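# Submit a batch of prompts with the OpenAI Batch API
<a name="inference-openai-batch"></a>

You can run a batch inference job by using the [OpenAI Create batch API](https://platform.openai.com/docs/api-reference/batch) with Amazon Bedrock OpenAI models.

You can call the OpenAI Create batch API in the following ways:
+ Make an HTTP request with an Amazon Bedrock Runtime endpoint.
+ Use an OpenAI SDK request with an Amazon Bedrock Runtime endpoint.

Select a topic to learn more:

**Topics**
+ [Supported models and Regions for the OpenAI Batch API](#inference-openai-batch-supported)
+ [Prerequisites to use the OpenAI Batch API](#inference-openai-batch-prereq)
+ [Create an OpenAI batch job](#inference-openai-batch-create)
+ [Retrieve an OpenAI batch job](#inference-openai-batch-retrieve)
+ [List OpenAI batch jobs](#inference-openai-batch-list)
+ [Cancel an OpenAI batch job](#inference-openai-batch-cancel)

## Supported models and Regions for the OpenAI Batch API
<a name="inference-openai-batch-supported"></a>

You can use the OpenAI Create batch API with all OpenAI models supported in Amazon Bedrock, in the AWS Regions that support those models. For more information about supported models and Regions, see [Supported foundation models in Amazon Bedrock](models-supported.md).

## Prerequisites to use the OpenAI Batch API
<a name="inference-openai-batch-prereq"></a>

To see the prerequisites for using the OpenAI Batch API operations, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK ]
+ **Authentication** – The OpenAI SDK only supports authentication with an Amazon Bedrock API key. Generate an Amazon Bedrock API key to authenticate your request. To learn about Amazon Bedrock API keys and how to generate them, see the API keys section in the Build chapter.
+ **Endpoint** – Find the endpoint that corresponds to the AWS Region to use in [Amazon Bedrock Runtime endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the Region code, not the whole endpoint, when you set up the client.
+ **Model access** – Request access to an Amazon Bedrock model that supports this feature. For more information, see [Manage model access using SDK and CLI](model-access.md#model-access-modify).
+ **Install an OpenAI SDK** – For more information, see [Libraries](https://platform.openai.com/docs/libraries) in the OpenAI documentation.
+ **Batch JSONL file uploaded to S3** – Follow the steps at [Prepare your batch file](https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file) in the OpenAI documentation to prepare your batch file in the correct format. Then upload it to an Amazon S3 bucket.
+ **IAM permissions** – Make sure that you have the following IAM identities with the proper permissions:
  + An authenticated IAM identity that can carry out batch inference-related API operations. For more information, see [Required permissions for an IAM identity to submit and manage batch inference jobs](batch-inference-permissions.md).
  + A batch inference service role that can assume your identity, invoke the OpenAI model that you use, and access your batch JSONL file in S3. For more information, see [Service roles](security-iam-sr.md).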

------
#### [ HTTP request ]
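+ **Authentication** – You can authenticate with either your AWS credentials or an Amazon Bedrock API key.

  Set up your AWS credentials or generate an Amazon Bedrock API key to authenticate your request.
  + To learn about setting up your AWS credentials, see [Programmatic access with AWS security credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds-programmatic-access.html).
  + To learn about Amazon Bedrock API keys and how to generate them, see the API keys section in the Build chapter.
+ **Endpoint** – Find the endpoint that corresponds to the AWS Region to use in [Amazon Bedrock Runtime endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). If you use an AWS SDK, you might only need to specify the Region code, not the whole endpoint, when you set up the client.
+ **Model access** – Request access to an Amazon Bedrock model that supports this feature. For more information, see [Manage model access using SDK and CLI](model-access.md#model-access-modify).
+ **Batch JSONL file uploaded to S3** – Follow the steps at [Prepare your batch file](https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file) in the OpenAI documentation to prepare your batch file in the correct format. Then upload it to an Amazon S3 bucket.
+ **IAM permissions** – Make sure that you have the following IAM identities with the proper permissions:
  + An authenticated IAM identity that can carry out batch inference-related API operations. For more information, see [Required permissions for an IAM identity to submit and manage batch inference jobs](batch-inference-permissions.md).
  + A batch inference service role that can assume your identity, invoke the OpenAI model that you use, and access your batch JSONL file in S3. For more information, see [Service roles](security-iam-sr.md).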

------
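
As a rough illustration of the batch file prerequisite, the following sketch builds input lines in the request format that the OpenAI Batch guide describes (`custom_id`, `method`, `url`, and `body`). The helper name and prompts are hypothetical. Whether Amazon Bedrock expects the `model` field in each request body in addition to the `X-Amzn-Bedrock-ModelId` header is an assumption here, so check the OpenAI and Amazon Bedrock documentation before relying on it.

```python
import json


def openai_batch_line(custom_id, model_id, user_text):
    """Build one request line in the OpenAI batch input format."""
    return json.dumps({
        "custom_id": custom_id,  # lets you match outputs back to inputs
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model_id,  # assumption: mirrors the OpenAI guide's examples
            "messages": [{"role": "user", "content": user_text}],
        },
    })


# Hypothetical prompts; replace them with your own data.
user_prompts = ["Summarize this transcript: ...", "Translate this sentence: ..."]
batch_lines = [
    openai_batch_line(f"request-{i}", "openai.gpt-oss-20b-1:0", text)
    for i, text in enumerate(user_prompts, start=1)
]
batch_file_text = "\n".join(batch_lines) + "\n"
```

You would save `batch_file_text` as a `.jsonl` file and upload it to the S3 location that you later pass as `input_file_id`.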

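## Create an OpenAI batch job
<a name="inference-openai-batch-create"></a>

For details about the OpenAI Create batch API, see the following resources in the OpenAI documentation:
+ [Create batch](https://platform.openai.com/docs/api-reference/batch/create) – Details both the request and the response.
+ [The request output object](https://platform.openai.com/docs/api-reference/batch/request-output) – Details the fields of the output generated by the batch job. Refer to this documentation when you interpret the results in your S3 bucket.

**Form the request**  
When you form the batch inference request, note the following Amazon Bedrock-specific fields and values:

**Request headers**
+ X-Amzn-Bedrock-RoleArn (required) – The Amazon Resource Name (ARN) of the batch inference service role. For more information, see [Create a custom service role for batch inference](batch-iam-sr.md).
+ X-Amzn-Bedrock-ModelId (required) – The ID of the foundation model to use in inference. For more information, see [Supported foundation models in Amazon Bedrock](models-supported.md).
+ X-Amzn-Bedrock-OutputEncryptionKeyId (optional) – The ID of the KMS key to use to encrypt the output S3 files. For more information, see [Specifying server-side encryption with AWS KMS (SSE-KMS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-kms-encryption.html).
+ X-Amzn-Bedrock-Tags (optional) – A dictionary of keys and values that specifies the tags to attach to the output. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

**Request body parameters:**
+ endpoint – Must be `v1/chat/completions`.
+ input\_file\_id – Specify the S3 URI of your batch JSONL file.

**Find the generated results**  
The creation response includes a batch ID. The results and error logs of the batch inference job are written to the S3 folder that contains the input file. The results are in a folder with the same name as the batch ID, as in the following folder structure: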

```
---- {batch_input_folder}
        |---- {batch_input}.jsonl
        |---- {batch_id}
	           |---- {batch_input}.jsonl.out
	           |---- {batch_input}.jsonl.err
```
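
Given that folder layout, the S3 locations of the result and error files can be derived from the input file URI and the batch ID. The following sketch shows one way to compute them. The function name `batch_result_uris` is hypothetical, and the batch ID in the example is a placeholder.

```python
def batch_result_uris(input_file_uri, batch_id):
    """Derive the .out and .err locations from the input URI and the batch ID."""
    # Split "s3://bucket/folder/file.jsonl" into the folder part and the file name.
    folder, _, file_name = input_file_uri.rpartition("/")
    base = f"{folder}/{batch_id}/{file_name}"
    return {"output": base + ".out", "errors": base + ".err"}


result_uris = batch_result_uris(
    "s3://amzn-s3-demo-bucket/openai-input.jsonl",
    "batch_abc123",  # placeholder; use the ID returned by the create call
)
```

Here `result_uris["output"]` points to the `.jsonl.out` file inside the `batch_abc123` folder next to the input file, which is where you would look once the job finishes.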

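To see examples of using the OpenAI Create batch API with different methods, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK (Python) ]

To create a batch job with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix `/openai/v1` with the Amazon Bedrock Runtime endpoint, as in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in `extra_headers` when you make a specific API call.

1. Use the [batches.create()](https://platform.openai.com/docs/api-reference/batch/create) method with the client.

Before you run the following example, replace the placeholders in the following fields:
+ api\_key – Replace *\$AWS\_BEARER\_TOKEN\_BEDROCK* with your actual API key.
+ X-Amzn-Bedrock-RoleArn – Replace *arn:aws:iam::123456789012:role/BatchServiceRole* with the actual batch inference service role that you set up.
+ input\_file\_id – Replace *s3://amzn-s3-demo-bucket/openai-input.jsonl* with the actual S3 URI to which you uploaded your batch JSONL file.

The following example calls the OpenAI Create batch job API in `us-west-2` and includes one piece of metadata.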

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK", # Replace with actual API key
    default_headers={
        "X-Amzn-Bedrock-RoleArn": "arn:aws:iam::123456789012:role/BatchServiceRole" # Replace with actual service role ARN
    }
)

job = client.batches.create(
    input_file_id="s3://amzn-s3-demo-bucket/openai-input.jsonl", # Replace with actual S3 URI
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "test input"
    },
    extra_headers={
        "X-Amzn-Bedrock-ModelId": "openai.gpt-oss-20b-1:0",
    }
)
print(job)
```

------
#### [ HTTP request ]

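To create a batch job with a direct HTTP request, do the following:

1. Use the POST method and specify the URL by prefixing `/openai/v1/batches` with the Amazon Bedrock Runtime endpoint, as in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches
   ```

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before you run the following example, replace the placeholders in the following fields:
+ Authorization – Replace *\$AWS\_BEARER\_TOKEN\_BEDROCK* with your actual API key.
+ X-Amzn-Bedrock-RoleArn – Replace *arn:aws:iam::123456789012:role/BatchServiceRole* with the actual batch inference service role that you set up.
+ input\_file\_id – Replace *s3://amzn-s3-demo-bucket/openai-input.jsonl* with the actual S3 URI to which you uploaded your batch JSONL file.

The following example calls the OpenAI Create batch API in `us-west-2` and includes one piece of metadata: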

```
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK' \
    -H 'Content-Type: application/json' \
    -H 'X-Amzn-Bedrock-ModelId: openai.gpt-oss-20b-1:0' \
    -H 'X-Amzn-Bedrock-RoleArn: arn:aws:iam::123456789012:role/BatchServiceRole' \
    -d '{
    "input_file_id": "s3://amzn-s3-demo-bucket/openai-input.jsonl",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
    "metadata": {"description": "test input"}
}'
```

------

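## Retrieve an OpenAI batch job
<a name="inference-openai-batch-retrieve"></a>

For details about the OpenAI Retrieve batch API request and response, see [Retrieve batch](https://platform.openai.com/docs/api-reference/batch/retrieve).

When you make the request, you specify the ID of the batch job for which to get information. The response returns information about the batch job, including the output and error file names that you can look up in your S3 buckets.

To see examples of using the OpenAI Retrieve batch API with different methods, choose the tab for your preferred method, and then follow the steps:

------
#### [ OpenAI SDK (Python) ]

To retrieve a batch job with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix `/openai/v1` with the Amazon Bedrock Runtime endpoint, as in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can alternatively specify headers in `extra_headers` when you make a specific API call.

1. Use the [batches.retrieve()](https://platform.openai.com/docs/api-reference/batch/retrieve) method with the client and specify the ID of the batch job for which to retrieve information.

Before you run the following example, replace the placeholders in the following fields:
+ api\_key – Replace *\$AWS\_BEARER\_TOKEN\_BEDROCK* with your actual API key.
+ batch\_id – Replace *batch\_abc123* with the actual ID of your batch job.

The following example calls the OpenAI Retrieve batch job API in `us-west-2` on a batch job whose ID is *batch\_abc123*.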

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.retrieve(batch_id="batch_abc123") # Replace with actual ID

print(job)
```
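
A batch job usually takes some time to finish, so a single retrieve call rarely shows the final state. The following is a minimal polling sketch, not an official pattern. It assumes the job object exposes a `status` attribute as in the OpenAI Batch API, and that the terminal status names listed below also apply on Amazon Bedrock. The helper name `wait_for_batch` and its parameters are hypothetical.

```python
import time
from typing import Any, Callable

# Statuses after which an OpenAI-style batch job no longer changes.
# Assumption: these names come from the OpenAI Batch API.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(
    retrieve: Callable[[str], Any],
    batch_id: str,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call retrieve(batch_id) repeatedly until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = retrieve(batch_id)
        if job.status in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"batch {batch_id} did not finish after {max_polls} polls")
```

With the client from the example above, a call could look like `job = wait_for_batch(client.batches.retrieve, "batch_abc123")`. Passing the retrieve function in as an argument keeps the helper independent of any particular client object.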

------
#### [ HTTP request ]

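To retrieve a batch job with a direct HTTP request, do the following:

1. Use the GET method and specify the URL by prefixing `/openai/v1/batches/${batch_id}` with the Amazon Bedrock runtime endpoint, in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123
   ```

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_abc123 – In the path, replace this value with the actual ID of your batch job.

The following example calls the OpenAI Retrieve batch API in `us-west-2` for the batch job whose ID is *batch_abc123*.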

```
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

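## List OpenAI batch jobs
<a name="inference-openai-batch-list"></a>

For details about the OpenAI List batches API request and response, see [List batches](https://platform.openai.com/docs/api-reference/batch/list). The response returns a set of information about your batch jobs.

When you make the request, you can include query parameters to filter the results. The response returns information about the batch jobs, including the output and error file names that you can look up in your S3 buckets.

To see examples of using the OpenAI List batches API with different methods, choose the tab for your preferred method, and then follow the steps: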

------
#### [ OpenAI SDK (Python) ]

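To list batch jobs with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix `/openai/v1` with the Amazon Bedrock runtime endpoint, in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can also specify headers in `extra_headers` when making a specific API call.

1. Use the [batches.list()](https://platform.openai.com/docs/api-reference/batch/list) method on the client. You can include any of the optional parameters.

Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.

The following example calls the OpenAI List batches API in `us-west-2` and limits the number of returned results to 2.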

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.list(limit=2)

print(job)
```
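
Once you have a page of jobs, it is often handy to group them by status. The following sketch is one way to do that. It assumes each job object has `id` and `status` attributes as in the OpenAI Batch API; the helper name `job_ids_by_status` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def job_ids_by_status(jobs: Iterable[Any]) -> Dict[str, List[str]]:
    """Group batch job IDs by their status, preserving the input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for job in jobs:
        grouped[job.status].append(job.id)
    return dict(grouped)
```

With the OpenAI SDK, the jobs on the current page are expected under the `data` attribute of the returned page object, so a call could look like `job_ids_by_status(client.batches.list(limit=2).data)`.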

------
#### [ HTTP request ]

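To list batch jobs with a direct HTTP request, do the following:

1. Use the GET method and specify the URL by prefixing `/openai/v1/batches` with the Amazon Bedrock runtime endpoint, in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches
   ```

   You can include any of the optional query parameters.

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.

The following example calls the OpenAI List batches API in `us-west-2` and limits the number of returned results to 2.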

```
curl -X GET 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches?limit=2' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

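## Cancel an OpenAI batch job
<a name="inference-openai-batch-cancel"></a>

For details about the OpenAI Cancel batch API request and response, see [Cancel batch](https://platform.openai.com/docs/api-reference/batch/cancel). The response returns information about the cancelled batch job.

When you make the request, you specify the ID of the batch job that you want to cancel.

To see examples of using the OpenAI Cancel batch API with different methods, choose the tab for your preferred method, and then follow the steps: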

------
#### [ OpenAI SDK (Python) ]

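To cancel a batch job with the OpenAI SDK, do the following:

1. Import the OpenAI SDK and set up the client with the following fields:
   + `base_url` – Prefix `/openai/v1` with the Amazon Bedrock runtime endpoint, in the following format:

     ```
     https://${bedrock-runtime-endpoint}/openai/v1
     ```
   + `api_key` – Specify an Amazon Bedrock API key.
   + `default_headers` – If you need to include any headers, you can include them as key-value pairs in this object. You can also specify headers in `extra_headers` when making a specific API call.

1. Use the [batches.cancel()](https://platform.openai.com/docs/api-reference/batch/cancel) method on the client and specify the ID of the batch job that you want to cancel.

Before running the following example, replace the placeholders in the following fields:
+ api_key – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_id – Replace *batch_abc123* with the actual ID of your batch job.

The following example calls the OpenAI Cancel batch API in `us-west-2` for the batch job whose ID is *batch_abc123*.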

```
from openai import OpenAI

client = OpenAI(
    base_url="https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1", 
    api_key="$AWS_BEARER_TOKEN_BEDROCK" # Replace with actual API key
)

job = client.batches.cancel(batch_id="batch_abc123") # Replace with actual ID

print(job)
```

------
#### [ HTTP request ]

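To cancel a batch job with a direct HTTP request, do the following:

1. Use the POST method and specify the URL by prefixing `/openai/v1/batches/${batch_id}/cancel` with the Amazon Bedrock runtime endpoint, in the following format:

   ```
   https://${bedrock-runtime-endpoint}/openai/v1/batches/batch_abc123/cancel
   ```

1. Specify your AWS credentials or an Amazon Bedrock API key in the `Authorization` header.

Before running the following example, replace the placeholders in the following fields:
+ Authorization – Replace *$AWS_BEARER_TOKEN_BEDROCK* with your actual API key.
+ batch_abc123 – In the path, replace this value with the actual ID of your batch job.

The following example calls the OpenAI Cancel batch API in `us-west-2` for the batch job whose ID is *batch_abc123*.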

```
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1/batches/batch_abc123/cancel' \
    -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK'
```

------

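# Increase throughput with cross-Region inference
<a name="cross-region-inference"></a>

With cross-Region inference, you can choose either a cross-Region inference profile tied to a specific geography (such as US or EU) or a global inference profile. When you choose an inference profile tied to a specific geography, Amazon Bedrock automatically selects the optimal commercial AWS Region within that geography to process your inference request. With a global inference profile, Amazon Bedrock automatically selects the optimal commercial AWS Region to process the request, which optimizes available resources and increases model throughput.

Both types of cross-Region inference work through inference profiles. An [inference profile](inference-profiles.md) defines a foundation model (FM) and the AWS Regions to which requests can be routed. When you run model inference in on-demand mode, your requests might be restricted by service quotas or throttled during peak usage times. Cross-Region inference lets you manage unplanned traffic bursts by using compute across different AWS Regions.

You can also increase throughput for a model by purchasing [Provisioned Throughput](prov-throughput.md). Inference profiles currently don't support Provisioned Throughput.

To see the Regions and models with which you can use inference profiles to run cross-Region inference, see [Supported Regions and models for inference profiles](inference-profiles-support.md).

**Topics**
+ [Choosing between geographic and global cross-Region inference](#cross-region-inference-comparison)
+ [General considerations](#cross-region-inference-general-considerations)
+ [Geographic cross-Region inference](geographic-cross-region-inference.md)
+ [Global cross-Region inference](global-cross-region-inference.md)

## Choosing between geographic and global cross-Region inference
<a name="cross-region-inference-comparison"></a>

Amazon Bedrock provides two types of cross-Region inference profiles, each designed for different use cases and compliance requirements: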


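| Feature | Geographic cross-Region inference | Global cross-Region inference | Recommendation | 
| --- | --- | --- | --- | 
| Data residency | Within geographic boundaries (US, EU, APAC, and so on) | Any supported AWS commercial Region worldwide | Choose geographic to meet compliance requirements | 
| Throughput | Higher than a single Region | Highest available | Choose global for maximum performance | 
| Cost | Standard pricing | Approximately 10% savings | Choose global for cost optimization | 
| SCP requirements | Allow all destination Regions in the profile | Allow "aws:RequestedRegion": "unspecified" | Configure according to your organization's policies | 
| Best for | Organizations with data residency regulations | Organizations that prioritize cost and performance | Evaluate your compliance and performance needs | 

If you have data residency requirements and need to make sure that data processing stays within specific geographic boundaries, choose geographic cross-Region inference. If you want to maximize throughput and save costs without geographic restrictions, choose global cross-Region inference.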
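
The decision rule above can be written down as a small helper. This is only an illustrative sketch: the function name `choose_profile_id` and its parameters are hypothetical, and it assumes that both a geographic and a global profile exist for the model, which is not true for every model.

```python
def choose_profile_id(model_id: str, geography: str, needs_data_residency: bool) -> str:
    """Return a geographic profile ID when data residency matters, else a global one.

    `geography` is the geographic prefix of the profile, for example "us" or "eu".
    """
    prefix = geography if needs_data_residency else "global"
    return f"{prefix}.{model_id}"


# Example with the Claude Sonnet 4.5 model ID used elsewhere in this guide
print(choose_profile_id("anthropic.claude-sonnet-4-5-20250929-v1:0", "us", True))
print(choose_profile_id("anthropic.claude-sonnet-4-5-20250929-v1:0", "us", False))
```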

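## General considerations
<a name="cross-region-inference-general-considerations"></a>

Note the following about cross-Region inference:
+ There is no additional routing cost for using cross-Region inference. The price is calculated based on the Region from which you call the inference profile. For information about pricing, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).
+ Cross-Region inference can route requests to AWS Regions that you haven't manually enabled in your AWS account. Cross-Region inference works without manually enabling those Regions.
+ All data transmitted during cross-Region operations stays on the AWS network and does not traverse the public internet. Data is encrypted in transit between AWS Regions.
+ All cross-Region inference requests are logged in CloudTrail in your source Region. Look for the `additionalEventData.inferenceRegion` field to determine where a request was processed.
+ AWS services powered by Amazon Bedrock might also use cross-Region inference (CRIS). For details, see the service-specific documentation.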
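
To illustrate the CloudTrail point in the list above, the following sketch reads the `additionalEventData.inferenceRegion` field from a single CloudTrail event record supplied as JSON text. The sample record is hypothetical and heavily trimmed; only the `additionalEventData.inferenceRegion` path comes from this guide.

```python
import json
from typing import Optional


def inference_region(cloudtrail_event_json: str) -> Optional[str]:
    """Return the Region that processed the request, or None if the field is absent."""
    event = json.loads(cloudtrail_event_json)
    additional = event.get("additionalEventData") or {}
    return additional.get("inferenceRegion")


# Hypothetical, trimmed event record used only for illustration
sample_event = json.dumps({
    "eventSource": "bedrock.amazonaws.com",
    "eventName": "Converse",
    "awsRegion": "us-east-1",
    "additionalEventData": {"inferenceRegion": "us-west-2"},
})
print(inference_region(sample_event))
```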

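# Geographic cross-Region inference
<a name="geographic-cross-region-inference"></a>

Geographic cross-Region inference keeps data processing within specified geographic boundaries (US, EU, APAC, and so on) while providing higher throughput than single-Region inference. This option is well suited to organizations with data residency requirements and compliance regulations.

## Geographic cross-Region inference considerations
<a name="geographic-cris-considerations"></a>

Note the following about geographic cross-Region inference:
+ Cross-Region inference requests to an inference profile tied to a geography (such as US, EU, and APAC) are kept within the AWS Regions of the geography where the data originally resides. For example, a request made within the US is kept within the AWS Regions in the US. Although the data remains stored only in the source Region, your input prompts and output results might move outside of your source Region during cross-Region inference. All data is transmitted encrypted across Amazon's secure network.
+ To see the default quotas for cross-Region throughput when using inference profiles tied to a geography (such as US, EU, and APAC), refer to the **Cross-region model inference requests per minute for ${Model}** and **Cross-region model inference tokens per minute for ${Model}** values in [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock) in the *AWS General Reference*.

## IAM policy requirements for geographic cross-Region inference
<a name="geographic-cris-iam-setup"></a>

To allow an IAM user or role to invoke a geographic cross-Region inference profile, you need to allow access to the following resources:

1. The geography-specific cross-Region inference profile (these profiles have geographic prefixes such as `us`, `eu`, and `apac`)

1. The foundation model in the source Region

1. The foundation model in all destination Regions listed in the geographic profile

The following example policy grants the permissions required to use the Claude Sonnet 4.5 foundation model with a US geographic cross-Region inference profile, where the source Region is `us-east-1` and the destination Regions are `us-east-1`, `us-east-2`, and `us-west-2`: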

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GrantGeoCrisInferenceProfileAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:us-east-1:<ACCOUNT_ID>:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0"
            ]
        },
        {
            "Sid": "GrantGeoCrisModelAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
                "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0",
                "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0"
            ],
            "Condition": {
                "StringEquals": {
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:us-east-1:<ACCOUNT_ID>:inference-profile/us.anthropic.claude-sonnet-4-5-20250929-v1:0"
                }
            }
        }
    ]
}
```

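The first statement grants the `bedrock:InvokeModel` API access to the geographic cross-Region inference profile for requests that originate from the requesting Region. The second statement grants the `bedrock:InvokeModel` API access to the foundation model in the requesting Region and in all destination Regions listed in the inference profile.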
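
If you manage several profiles, generating this policy from a few inputs can reduce copy-and-paste errors. The following is a minimal sketch that reproduces the shape of the example policy above. The function name `geo_cris_policy` and its parameters are hypothetical, and the account ID `123456789012` is a placeholder.

```python
import json
from typing import Dict, List


def geo_cris_policy(
    account_id: str,
    source_region: str,
    profile_id: str,
    model_id: str,
    destination_regions: List[str],
) -> Dict:
    """Build a two-statement IAM policy for a geographic inference profile."""
    profile_arn = (
        f"arn:aws:bedrock:{source_region}:{account_id}:inference-profile/{profile_id}"
    )
    # One foundation-model ARN per destination Region (no account ID in these ARNs)
    model_arns = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for region in destination_regions
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GrantGeoCrisInferenceProfileAccess",
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": [profile_arn],
            },
            {
                "Sid": "GrantGeoCrisModelAccess",
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": model_arns,
                # Only allow model access when the call goes through the profile
                "Condition": {
                    "StringEquals": {"bedrock:InferenceProfileArn": profile_arn}
                },
            },
        ],
    }


policy = geo_cris_policy(
    account_id="123456789012",  # placeholder account ID
    source_region="us-east-1",
    profile_id="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    model_id="anthropic.claude-sonnet-4-5-20250929-v1:0",
    destination_regions=["us-east-1", "us-east-2", "us-west-2"],
)
print(json.dumps(policy, indent=4))
```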

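## Service control policy requirements for geographic cross-Region inference
<a name="geographic-cris-scp-setup"></a>

For security and compliance, many organizations implement Region access controls through service control policies (SCPs) in AWS Organizations. If your organization's security policy uses SCPs to block unused Regions, you must make sure that your Region-specific SCP conditions allow access to all destination Regions listed in the geographic cross-Region inference profile for your source Region.

For geographic cross-Region inference, you need to understand the relationship between the source Region (where you make the API call) and the destination Regions (where requests can be routed). Check the inference profile documentation to identify all destination Regions for your source Region, and then make sure that your SCPs allow access to all of them.

For example, if you call from us-east-1 (the source Region) using the US Anthropic Claude Sonnet 4.5 geographic profile, requests can be routed to us-east-1, us-east-2, and us-west-2 (the destination Regions). If an SCP restricts access to us-east-1 only, cross-Region inference fails when it tries to route to us-east-2 or us-west-2. You therefore need to allow all three destination Regions in the SCP, regardless of the Region from which you call.

When configuring SCPs for Region exclusion, remember that blocking any destination Region in the inference profile prevents cross-Region inference from working correctly, even if your source Region remains accessible. For the SCP requirements for global cross-Region inference, see [Service control policy requirements for global cross-Region inference](global-cross-region-inference.md#global-cris-scp-setup).

For additional security, consider using the `bedrock:InferenceProfileArn` condition to restrict access to specific inference profiles. This lets you grant access to the required Regions while limiting which inference profiles can be used.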
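
A quick pre-flight check can catch the SCP gap described above before requests start failing. The sketch below simply compares two Region lists that you supply yourself; it does not read SCPs or inference profiles from AWS, and the function name `blocked_destinations` is hypothetical.

```python
from typing import Iterable, List


def blocked_destinations(
    destination_regions: Iterable[str], scp_allowed_regions: Iterable[str]
) -> List[str]:
    """Return the destination Regions that the SCP allow list does not cover."""
    return sorted(set(destination_regions) - set(scp_allowed_regions))


# Destination Regions of the US Claude Sonnet 4.5 profile, as in the example above
destinations = ["us-east-1", "us-east-2", "us-west-2"]

# An SCP that only allows us-east-1 leaves two destination Regions blocked
print(blocked_destinations(destinations, ["us-east-1"]))
```

An empty result means every destination Region is allowed; any other result lists the Regions that you would still need to add to the SCP.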

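## Use geographic cross-Region inference
<a name="geographic-cris-usage"></a>

To use geographic cross-Region inference, include an [inference profile](inference-profiles.md) when running model inference in the following ways:
+ **On-demand model inference** – Specify the ID of the inference profile as the `modelId` when sending an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html), [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html), [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html), or [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) request. An inference profile defines one or more Regions to which it can route inference requests that originate from your source Region. With cross-Region inference, model invocation requests are routed dynamically across the Regions defined in the inference profile, which increases throughput. Routing takes user traffic, demand, and resource utilization into account. For more information, see [Submit prompts and generate responses with model inference](inference.md).
+ **Batch inference** – Submit requests asynchronously with batch inference by specifying the ID of the inference profile as the `modelId` when sending a [CreateModelInvocationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateModelInvocationJob.html) request. With an inference profile, you can use compute across multiple AWS Regions and shorten the processing time of your batch jobs. After the job completes, you can retrieve the output files from the Amazon S3 bucket in the source Region.
+ **Agents** – Specify the ID of the inference profile in the `foundationModel` field of a [CreateAgent](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateAgent.html) request. For more information, see [Create and configure an agent manually](agents-create.md).
+ **Knowledge base response generation** – You can use cross-Region inference when generating a response after querying a knowledge base. For more information, see [Test your knowledge base with queries and responses](knowledge-base-test.md).
+ **Model evaluation** – You can submit an inference profile as the model to evaluate when submitting a model evaluation job. For more information, see [Evaluate the performance of Amazon Bedrock resources](evaluation.md).
+ **Prompt management** – You can use cross-Region inference when generating a response for a prompt that you created in Prompt management. For more information, see [Construct and store reusable prompts with Prompt management in Amazon Bedrock](prompt-management.md).
+ **Prompt flows** – You can use cross-Region inference when generating a response for an inline prompt defined in a prompt node of a flow. For more information, see [Build an end-to-end generative AI workflow with Amazon Bedrock Flows](flows.md).

To learn how to use an inference profile to send model invocation requests across Regions, see [Use an inference profile in model invocation](inference-profiles-use.md).

To learn more about cross-Region inference, see [Getting started with cross-Region inference in Amazon Bedrock](https://aws.amazon.com/blogs/machine-learning/getting-started-with-cross-region-inference-in-amazon-bedrock/).

For details about global cross-Region inference, including IAM setup and service quota management, see [Global cross-Region inference](global-cross-region-inference.md).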

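# Global cross-Region inference
<a name="global-cross-region-inference"></a>

Global cross-Region inference extends cross-Region inference beyond geographic boundaries. It can route inference requests to supported commercial AWS Regions worldwide, which optimizes available resources and enables higher model throughput.

## Benefits of global cross-Region inference
<a name="global-cris-benefits"></a>

Global cross-Region inference for Anthropic's Claude Sonnet 4.5 offers several advantages over traditional geographic cross-Region inference profiles:
+ **Higher throughput during peak demand** – Global cross-Region inference improves resilience during periods of peak demand by automatically routing requests to AWS Regions with available capacity. This dynamic routing happens seamlessly, without additional configuration or intervention from developers. Unlike traditional approaches that might require complex client-side load balancing across AWS Regions, global cross-Region inference handles traffic spikes automatically. This matters most for business-critical applications where downtime or degraded performance can have significant financial or reputational impact.
+ **Cost efficiency** – Global cross-Region inference for Anthropic's Claude Sonnet 4.5 offers approximately 10% savings on input and output token pricing compared with geographic cross-Region inference. The price is calculated based on the AWS Region from which the request is made (the source AWS Region). Organizations can therefore benefit from higher resilience at a lower cost. This pricing model makes global cross-Region inference an economical option for organizations that want to optimize their generative AI deployments. By improving resource utilization and enabling higher throughput without additional cost, it helps organizations get more value from their investment in Amazon Bedrock.
+ **Simplified monitoring** – When you use global cross-Region inference, CloudWatch and CloudTrail continue to record log entries in the source AWS Region, which simplifies observability and management. Even though your requests are processed in different AWS Regions worldwide, you keep a centralized view of your application's performance and usage patterns through familiar AWS monitoring tools.
+ **On-demand quota flexibility** – With global cross-Region inference, your workloads are no longer limited by the capacity of a single Region. Instead of being restricted to the capacity available in a specific Region, your requests can be routed dynamically across AWS Regions in the AWS global infrastructure. This provides access to a larger pool of resources and reduces the complexity of handling high-volume workloads and sudden traffic spikes.

## Global cross-Region inference considerations
<a name="global-cris-considerations"></a>

Note the following about global cross-Region inference:
+ Global cross-Region inference profiles provide higher throughput than inference profiles tied to a specific geography. Inference profiles tied to a specific geography provide higher throughput than single-Region inference.
+ To see the default quotas for cross-Region throughput when using global inference profiles, refer to the **Global cross-region model inference requests per minute for ${Model}** and **Global cross-region model inference tokens per minute for ${Model}** values in [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock) in the *AWS General Reference*.

  You can request, view, and manage quotas for global cross-Region inference profiles from the [Service Quotas console](https://console.aws.amazon.com/servicequotas/home/services/bedrock/quotas) or by using AWS CLI commands in your **source** Region.

## IAM policy requirements for global cross-Region inference
<a name="global-cris-iam-setup"></a>

To enable global cross-Region inference for your users, you must apply a three-part IAM policy to the role. The following is an example IAM policy that provides granular control. Replace `<REQUESTING REGION>` in the example policy with the AWS Region in which you are operating.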

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GrantGlobalCrisInferenceProfileRegionAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:<REQUESTING REGION>:<ACCOUNT>:inference-profile/global.<MODEL NAME>"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestedRegion": "<REQUESTING REGION>"
                }
            }
        },
        {
            "Sid": "GrantGlobalCrisInferenceProfileInRegionModelAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:<REQUESTING REGION>::foundation-model/<MODEL NAME>"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestedRegion": "<REQUESTING REGION>",
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:<REQUESTING REGION>:<ACCOUNT>:inference-profile/global.<MODEL NAME>"
                }
            }
        },
        {
            "Sid": "GrantGlobalCrisInferenceProfileGlobalModelAccess",
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:::foundation-model/<MODEL NAME>"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestedRegion": "unspecified",
                    "bedrock:InferenceProfileArn": "arn:aws:bedrock:<REQUESTING REGION>:<ACCOUNT>:inference-profile/global.<MODEL NAME>"
                }
            }
        }
    ]
}
```

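The first part of the policy allows access to the Regional inference profile in your requesting AWS Region. The second part provides access to the Regional FM resource. The third part grants access to the global FM resource, which enables the cross-Region routing capability.

When implementing these policies, make sure that all three resource Amazon Resource Names (ARNs) are included in your IAM statements:
+ The Regional inference profile ARN follows the pattern `arn:aws:bedrock:REGION:ACCOUNT:inference-profile/global.MODEL-NAME`. It provides access to the global inference profile in the source AWS Region.
+ The Regional FM uses `arn:aws:bedrock:REGION::foundation-model/MODEL-NAME`. It provides access to the FM in the source AWS Region.
+ The global FM requires `arn:aws:bedrock:::foundation-model/MODEL-NAME`. It provides access to the FM across different AWS Regions worldwide.

The global FM ARN does not specify an AWS Region or an account. This is intentional and is required for the cross-Region functionality.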
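
Because the three ARNs differ only in which fields are left empty, they are easy to get wrong by hand. The following sketch builds all three from the patterns listed above. The function name `global_cris_arns` and the dictionary keys are hypothetical, and `123456789012` is a placeholder account ID.

```python
from typing import Dict


def global_cris_arns(region: str, account_id: str, model_name: str) -> Dict[str, str]:
    """Return the three resource ARNs needed for global cross-Region inference."""
    return {
        # Global inference profile in the source Region (Region and account set)
        "inference_profile": (
            f"arn:aws:bedrock:{region}:{account_id}:inference-profile/global.{model_name}"
        ),
        # Foundation model in the source Region (Region set, account empty)
        "regional_model": f"arn:aws:bedrock:{region}::foundation-model/{model_name}",
        # Global foundation model (Region and account both empty, on purpose)
        "global_model": f"arn:aws:bedrock:::foundation-model/{model_name}",
    }


arns = global_cris_arns(
    "us-east-1", "123456789012", "anthropic.claude-sonnet-4-5-20250929-v1:0"
)
for name, arn in arns.items():
    print(f"{name}: {arn}")
```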

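### Disable global cross-Region inference
<a name="global-cris-iam-disable"></a>

You can choose between two main approaches to deny global CRIS for specific IAM roles. Each has different use cases and implications:
+ **Remove an IAM policy** – The first approach is to remove one or more of the three required IAM policy statements from the user's permissions. Because global CRIS needs all three statements to work, removing one results in access being denied.
+ **Implement a deny policy** – The second approach is to implement an explicit deny policy that specifically targets global CRIS inference profiles. This approach documents your security intent clearly and makes sure that the explicit deny takes precedence even if someone later adds the required allow policies by mistake. The deny policy should use a `StringEquals` condition that matches the pattern `"aws:RequestedRegion": "unspecified"`. This pattern specifically targets inference profiles with the `global` prefix.

When implementing a deny policy, it is important to understand that global CRIS changes how the `aws:RequestedRegion` field behaves. Traditional Region-based deny policies that use a `StringEquals` condition with a specific AWS Region name (such as `"aws:RequestedRegion": "us-west-2"`) do not work as expected with global CRIS, because the service sets this field to `global` rather than to the actual destination AWS Region. However, as mentioned earlier, `"aws:RequestedRegion": "unspecified"` produces the deny effect.

## Service control policy requirements for global cross-Region inference
<a name="global-cris-scp-setup"></a>

For global cross-Region inference, if your organization's security policy uses SCPs to block unused Regions, you must update your Region-specific SCP conditions to allow access with `"aws:RequestedRegion": "unspecified"`. This condition is specific to Amazon Bedrock global cross-Region inference and makes sure that requests can be routed to all supported AWS commercial Regions.

The following example SCP blocks all AWS API calls outside the approved Regions, while allowing Amazon Bedrock global cross-Region inference calls that use `"unspecified"` as the Region for global routing: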

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAllOutsideApprovedRegions",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {
                    "aws:RequestedRegion": [
                        "us-east-1",
                        "us-east-2",
                        "us-west-2",
                        "unspecified"
                    ]
                }
            }
        }
    ]
}
```
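
To see why adding `"unspecified"` to the list matters, it can help to model the condition in a few lines. The sketch below is a deliberately simplified model of how a `StringNotEquals` condition with several values behaves: the Deny applies only when the requested Region equals none of the listed values. It ignores everything else that real policy evaluation involves, and the function name is hypothetical.

```python
from typing import Iterable


def denied_by_region_scp(requested_region: str, approved_values: Iterable[str]) -> bool:
    """Return True when the Deny statement would apply to this requested Region."""
    return requested_region not in set(approved_values)


approved = ["us-east-1", "us-east-2", "us-west-2", "unspecified"]

print(denied_by_region_scp("us-east-1", approved))    # approved Region
print(denied_by_region_scp("unspecified", approved))  # global cross-Region inference
print(denied_by_region_scp("eu-west-1", approved))    # Region outside the approved list
```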

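### Disable global cross-Region inference
<a name="global-cris-disable"></a>

Organizations with data residency or compliance requirements should evaluate whether global cross-Region inference fits their compliance framework, because requests might be processed in other supported AWS commercial Regions. To explicitly disable global cross-Region inference, implement the following SCP statement: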

```
{
    "Effect": "Deny",
    "Action": "bedrock:*",
    "Resource": "*",
    "Condition": {
        "StringEquals": {
            "aws:RequestedRegion": "unspecified"
        },
        "ArnLike": {
            "bedrock:InferenceProfileArn": "arn:aws:bedrock:*:*:inference-profile/global.*"
        }
    }
}
```

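This SCP explicitly denies global cross-Region inference because `"aws:RequestedRegion"` is `"unspecified"` and the `"ArnLike"` condition targets inference profiles with the `global` prefix in the ARN.

### AWS Control Tower implementation
<a name="control-tower-scp"></a>

Manually editing SCPs that are managed by AWS Control Tower is strongly discouraged because it can cause drift. Instead, use the mechanisms that Control Tower provides to manage these exceptions. The core principles involve either extending the existing Region deny control, or enabling the Regions and then applying a custom conditional blocking policy.

For a detailed step-by-step guide to implementing cross-Region inference with Control Tower, see the blog post [Enable Amazon Bedrock cross-Region inference in multi-account environments](https://aws.amazon.com/blogs/machine-learning/enable-amazon-bedrock-cross-region-inference-in-multi-account-environments/). It covers extending existing Region deny SCPs, enabling denied Regions with custom SCPs, and using Customizations for AWS Control Tower (CfCT) to deploy custom SCPs as infrastructure as code.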

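## Request limit increases for global cross-Region inference
<a name="global-cris-quotas"></a>

When you use global CRIS inference profiles, you can use global CRIS from more than 20 supported source AWS Regions. Because this is a global limit, requests to view, manage, or increase quotas for global cross-Region inference profiles must be made through the Service Quotas console or the AWS Command Line Interface (AWS CLI) in the source AWS Region.

To request a limit increase, complete the following steps:

1. Sign in to the Service Quotas console with your AWS account.

1. In the navigation pane, choose **AWS services**.

1. From the list of services, find and choose **Amazon Bedrock**.

1. In the list of quotas for Amazon Bedrock, use the search filter to find the specific global CRIS quota. For example:
   + Global cross-Region model inference tokens per minute for Anthropic Claude Sonnet 4.5 V1

1. Select the quota that you want to increase.

1. Choose **Request increase at account level**.

1. Enter the new quota value that you want.

1. Choose **Request** to submit your request.

When calculating the quota increase that you need, remember to account for the burndown rate, which is the rate at which the throttling system converts input and output tokens into token quota usage. The following models have a **5x burndown rate for output tokens (1 output token consumes 5 tokens from your quota)**:
+ Anthropic Claude Opus 4
+ Anthropic Claude Sonnet 4.5
+ Anthropic Claude Sonnet 4
+ Anthropic Claude 3.7

For all other models, the burndown rate is **1:1** (1 output token consumes 1 token from your quota). For input tokens, the token-to-quota ratio is 1:1. The total number of tokens per request is calculated as follows: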

`Input token count + Cache write input tokens + (Output token count x Burndown rate)`
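
As a worked example of the formula above, the following sketch computes the quota consumption of a single request. The function name `quota_tokens` and its parameter names are hypothetical; the arithmetic follows the formula directly.

```python
def quota_tokens(
    input_tokens: int,
    cache_write_input_tokens: int,
    output_tokens: int,
    burndown_rate: int = 1,
) -> int:
    """Tokens counted against the quota for one request."""
    return input_tokens + cache_write_input_tokens + output_tokens * burndown_rate


# 1,000 input tokens, 200 cache-write input tokens, and 300 output tokens
# with a 5x output burndown rate: 1000 + 200 + 300 * 5 = 2700
print(quota_tokens(1000, 200, 300, burndown_rate=5))  # 2700

# The same request on a model with a 1:1 burndown rate: 1000 + 200 + 300 = 1500
print(quota_tokens(1000, 200, 300))  # 1500
```

The comparison shows why output-heavy workloads on a 5x model can need a noticeably larger tokens-per-minute quota than the raw token counts suggest.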

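## Use global cross-Region inference
<a name="global-cris-usage"></a>

To use global cross-Region inference with Anthropic's Claude Sonnet 4.5, developers must complete the following key steps:
+ **Use the global inference profile ID** – When making API calls to Amazon Bedrock, specify the global Anthropic Claude Sonnet 4.5 inference profile ID (`global.anthropic.claude-sonnet-4-5-20250929-v1:0`) instead of an AWS Region-specific model ID.
+ **Configure IAM permissions** – Grant the appropriate IAM permissions to access the inference profile and the FMs in potential destination AWS Regions.

Global cross-Region inference is supported for the following:
+ On-demand model inference
+ Batch inference
+ Agents
+ Model evaluation
+ Prompt management
+ Prompt flows

**Note**  
Global inference profiles are supported for on-demand model inference, batch inference, agents, model evaluation, prompt management, and prompt flows.

## Implement global cross-Region inference
<a name="global-cris-implementation"></a>

Implementing global cross-Region inference with Anthropic's Claude Sonnet 4.5 requires only a few changes to your existing application code. The following example shows how to update your code in Python: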

```
import boto3

# Create a Bedrock runtime client in the source Region
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

# Use the global inference profile ID instead of a Region-specific model ID
model_id = "global.anthropic.claude-sonnet-4-5-20250929-v1:0"

response = bedrock.converse(
    messages=[{"role": "user", "content": [{"text": "Explain cloud computing in 2 sentences."}]}],
    modelId=model_id,
)

# Print the generated text and the token usage reported by the Converse API
print("Response:", response['output']['message']['content'][0]['text'])
print("Token usage:", response['usage'])
print("Total tokens:", response['usage']['totalTokens'])
```

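# Set up a model invocation resource using inference profiles
<a name="inference-profiles"></a>

An *inference profile* is a resource in Amazon Bedrock that defines a model and one or more Regions to which the inference profile can route model invocation requests. You can use inference profiles for the following tasks:
+ **Track usage metrics** – Set up CloudWatch logs and submit model invocation requests with an application inference profile to collect usage metrics for model invocation. You can examine these metrics when you view information about the inference profile and use them to inform your decisions. For more information about how to set up CloudWatch logs, see [Monitor model invocation using CloudWatch Logs and Amazon S3](model-invocation-logging.md).
+ **Use tags to monitor costs** – Attach tags to an application inference profile to track costs when you submit on-demand model invocation requests. For more information about how to use tags for cost allocation, see [Organizing and tracking costs using AWS cost allocation tags](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html) in the AWS Billing User Guide.
+ **Cross-Region inference** – Increase your throughput by using an inference profile that includes multiple AWS Regions. The inference profile distributes model invocation requests across these Regions to increase throughput and performance. For more information about cross-Region inference, see [Increase throughput with cross-Region inference](cross-region-inference.md).

Amazon Bedrock offers the following types of inference profiles:
+ **Cross-Region (system-defined) inference profiles** – Inference profiles that are predefined in Amazon Bedrock and include multiple Regions to which requests for a model can be routed.
+ **Application inference profiles** – Inference profiles that a user creates to track costs and model usage. You can create an inference profile that routes model invocation requests to one Region or to multiple Regions:
  + To create an inference profile that tracks costs and usage for a model in one Region, specify the foundation model in the Region to which you want the inference profile to route requests.
  + To create an inference profile that tracks costs and usage for a model across multiple Regions, specify the cross-Region (system-defined) inference profile that defines the model and the Regions to which you want the inference profile to route requests.

You can use inference profiles with the following features to route requests to multiple Regions and to track the usage and cost of invocation requests made with these features:
+ Model inference – Use an inference profile when running model invocation by choosing an inference profile in a playground in the Amazon Bedrock console, or by specifying the ARN of the inference profile when calling the [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html), [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html), [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html), and [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) operations. For more information, see [Submit prompts and generate responses with model inference](inference.md).
+ Knowledge base vector embedding and response generation – Use an inference profile when generating a response after querying a knowledge base, or when parsing non-textual information in a data source. For more information, see [Test your knowledge base with queries and responses](knowledge-base-test.md) and [Parsing options for your data source](kb-advanced-parsing.md).
+ Model evaluation – You can submit an inference profile as the model to evaluate when submitting a model evaluation job. For more information, see [Evaluate the performance of Amazon Bedrock resources](evaluation.md).
+ Prompt management – You can use an inference profile when generating a response for a prompt that you created in Prompt management. For more information, see [Construct and store reusable prompts with Prompt management in Amazon Bedrock](prompt-management.md).
+ Flows – You can use an inference profile when generating a response for an inline prompt defined in a prompt node of a flow. For more information, see [Build an end-to-end generative AI workflow with Amazon Bedrock Flows](flows.md).

The price for using an inference profile is calculated based on the price of the model in the Region from which you call the inference profile. For information about pricing, see [Amazon Bedrock pricing](https://aws.amazon.com/bedrock/pricing/).

For more details about the throughput that a cross-Region inference profile can offer, see [Increase throughput with cross-Region inference](cross-region-inference.md).

**Topics**
+ [Supported Regions and models for inference profiles](inference-profiles-support.md)
+ [Prerequisites for inference profiles](inference-profiles-prereq.md)
+ [Create an application inference profile](inference-profiles-create.md)
+ [Modify the tags for an application inference profile](inference-profiles-modify.md)
+ [View information about an inference profile](inference-profiles-view.md)
+ [Use an inference profile in model invocation](inference-profiles-use.md)
+ [Delete an application inference profile](inference-profiles-delete.md)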
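# Supported Regions and models for inference profiles
<a name="inference-profiles-support"></a>

For a list of Region codes and endpoints supported in Amazon Bedrock, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#bedrock_region). This topic describes the predefined inference profiles that you can use, and the Regions and models that support application inference profiles.

**Topics**
+ [Supported cross-Region inference profiles](#inference-profiles-support-system)
+ [Supported Regions and models for application inference profiles](#inference-profiles-support-user)

## Supported cross-Region inference profiles
<a name="inference-profiles-support-system"></a>

You can carry out [cross-Region inference](cross-region-inference.md) with cross-Region (system-defined) inference profiles. Cross-Region inference lets you manage unplanned traffic bursts by using compute across different AWS Regions. With cross-Region inference, you can distribute traffic across multiple AWS Regions.

Cross-Region (system-defined) inference profiles are named after the model that they support and are defined by the Regions that they support. To understand how a cross-Region inference profile handles your requests, review the following definitions:
+ **Source Region** – The Region from which you make the API request that specifies the inference profile.
+ **Destination Region** – A Region to which the Amazon Bedrock service can route the request from your source Region.

When you invoke a cross-Region inference profile in Amazon Bedrock, your request originates from a source Region and is automatically routed to one of the destination Regions defined in that profile, optimizing for performance. The destination Regions for global cross-Region inference profiles include all commercial Regions.

**Note**  
The destination Regions in a cross-Region inference profile can include *opt-in Regions*, which are Regions that you must explicitly enable at the AWS account or organization level. To learn more, see [Enable or disable AWS Regions in your account](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-regions.html). When you use a cross-Region inference profile, your inference request can be routed to any of the destination Regions in the profile, even if you did not opt in to such Regions in your account.

Service control policies (SCPs) and AWS Identity and Access Management (IAM) policies work together to control where cross-Region inference is allowed. With SCPs, you control which Regions Amazon Bedrock can use for inference. With IAM policies, you define which users or roles have permission to run inference. If any destination Region in your cross-Region inference profile is blocked by your SCPs, the request fails even if other Regions remain allowed. To make sure that cross-Region inference operates efficiently, update your SCPs and IAM policies to allow all required Amazon Bedrock inference actions (for example, `bedrock:InvokeModel*` or `bedrock:CreateModelInvocationJob`) in all destination Regions included in your chosen inference profile. To learn more, see [Enable Amazon Bedrock cross-Region inference in multi-account environments](https://aws.amazon.com/blogs/machine-learning/enable-amazon-bedrock-cross-region-inference-in-multi-account-environments/).

**Note**  
Some inference profiles route to different destination Regions depending on the source Region from which you call them. For example, if you call `us.anthropic.claude-3-haiku-20240307-v1:0` from US East (Ohio), it can route requests to `us-east-1`, `us-east-2`, or `us-west-2`. If you call it from US West (Oregon), it routes requests only to `us-east-1` and `us-west-2`.

To check the source and destination Regions for an inference profile, you can do one of the following:
+ Expand the corresponding section in the [list of supported cross-Region inference profiles](#inference-profiles-support).
+ Send a [GetInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetInferenceProfile.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) from a source Region, and specify the Amazon Resource Name (ARN) or ID of the inference profile in the `inferenceProfileIdentifier` field. The `models` field in the response maps to a list of model ARNs, in which you can identify each destination Region.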
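
For the second option in the list above, the destination Regions can be read out of the model ARNs, because the Region is the fourth colon-separated field of an ARN. The following sketch does only that parsing step, on ARNs that you pass in. The function name `destination_regions` is hypothetical.

```python
from typing import Iterable, List


def destination_regions(model_arns: Iterable[str]) -> List[str]:
    """Extract the distinct Regions from a list of model ARNs, in first-seen order."""
    regions: List[str] = []
    for arn in model_arns:
        # ARN layout: arn:partition:service:region:account-id:resource
        parts = arn.split(":")
        if len(parts) >= 6 and parts[3] and parts[3] not in regions:
            regions.append(parts[3])
    return regions


sample_arns = [
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
    "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
]
print(destination_regions(sample_arns))
```

With boto3, the ARNs would typically come from something like `boto3.client("bedrock").get_inference_profile(inferenceProfileIdentifier=...)["models"]`, where each entry is expected to carry a `modelArn` key. Treat that response shape as an assumption and check it against the GetInferenceProfile API reference.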

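**Note**  
A global cross-Region inference profile for a specific model can change over time as AWS adds more commercial Regions that can process your requests. However, if an inference profile is tied to a geography (such as US, EU, or APAC), its list of destination Regions never changes. AWS might create new inference profiles that include new Regions. You can update your systems to use these inference profiles by changing the IDs in your setup to the new ones.  
Global cross-Region inference profiles are currently supported for the Anthropic Claude Sonnet 4 model only in the following source Regions: US West (Oregon), US East (N. Virginia), US East (Ohio), Europe (Ireland), and Asia Pacific (Tokyo). The destination Regions for global inference profiles include all commercial AWS Regions.

Expand each of the following sections to see information about a cross-Region inference profile, the source Regions from which it can be called, and the destination Regions to which it can route requests.

### Global Amazon Nova 2 Lite
<a name="cross-region-ip-global.amazon.nova-2-lite-v1:0"></a>

To call the Global Amazon Nova 2 Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
global.amazon.nova-2-lite-v1:0
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed: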


| Source Region | Destination Regions | 
| --- | --- | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

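### Global Anthropic Claude Opus 4.5
<a name="cross-region-ip-global.anthropic.claude-opus-4-5-20251101-v1:0"></a>

To call the Global Anthropic Claude Opus 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-opus-4-5-20251101-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed: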


| Source Region | Destination Regions | 
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

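### Global TwelveLabs Pegasus v1.2
<a name="cross-region-ip-global.twelvelabs.pegasus-1-2-v1:0"></a>

To call the Global TwelveLabs Pegasus v1.2 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.twelvelabs.pegasus-1-2-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-pegasus.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed: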


| Source Region | Destination Regions | 
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Anthropic Claude Haiku 4.5
<a name="cross-region-ip-global.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

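To call the Global Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed: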


| 来源区域 | 目标区域 | 
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 
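
As a minimal sketch of how a profile ID from this page might be used, the Python below passes the Global Anthropic Claude Haiku 4.5 profile ID as the `modelId` of a Bedrock Runtime Converse request sent from a source Region. It assumes `boto3` is installed and AWS credentials are configured; the helper names and the prompt are illustrative and not part of the original page.

```python
from typing import Any

# Profile ID copied from this section; helper names below are illustrative.
GLOBAL_HAIKU_4_5_PROFILE_ID = "global.anthropic.claude-haiku-4-5-20251001-v1:0"


def build_converse_request(profile_id: str, prompt: str) -> dict[str, Any]:
    """Build keyword arguments for the Bedrock Runtime Converse operation."""
    return {
        # The inference profile ID goes where a model ID would normally go.
        "modelId": profile_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }


def invoke_from_region(source_region: str, prompt: str) -> str:
    """Send the request from a source Region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("bedrock-runtime", region_name=source_region)
    request = build_converse_request(GLOBAL_HAIKU_4_5_PROFILE_ID, prompt)
    response = client.converse(**request)
    return response["output"]["message"]["content"][0]["text"]
```

Calling `invoke_from_region("us-east-1", "Hello")` would send the request from one of the source Regions listed above; which destination Region serves it is decided by the service, not by the caller.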

### Global Anthropic Claude Opus 4.6
<a name="cross-region-ip-global.anthropic.claude-opus-4-6-v1"></a>

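To call the Global Anthropic Claude Opus 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-opus-4-6-v1
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 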
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-global.anthropic.claude-sonnet-4-6"></a>

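To call the Global Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 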
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Claude Sonnet 4
<a name="cross-region-ip-global.anthropic.claude-sonnet-4-20250514-v1:0"></a>

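To call the Global Claude Sonnet 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-sonnet-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 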
| --- | --- | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Claude Sonnet 4.5
<a name="cross-region-ip-global.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

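To call the Global Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 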
| --- | --- | 
| af-south-1 |  Commercial AWS Regions af-south-1  | 
| ap-east-2 |  Commercial AWS Regions ap-east-2  | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ap-southeast-5 |  Commercial AWS Regions ap-southeast-5  | 
| ap-southeast-7 |  Commercial AWS Regions ap-southeast-7  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| ca-west-1 |  Commercial AWS Regions ca-west-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| il-central-1 |  Commercial AWS Regions il-central-1  | 
| me-central-1 |  Commercial AWS Regions me-central-1  | 
| me-south-1 |  Commercial AWS Regions me-south-1  | 
| mx-central-1 |  Commercial AWS Regions mx-central-1  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### Global Cohere Embed v4
<a name="cross-region-ip-global.cohere.embed-v4:0"></a>

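To call the Global Cohere Embed v4 inference profile, specify the following inference profile ID in one of the source Regions:

```
global.cohere.embed-v4:0
```

For more information about inference parameters for this model, see [Link](model-parameters-embed.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 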
| --- | --- | 
| ap-northeast-1 |  Commercial AWS Regions ap-northeast-1  | 
| ap-northeast-2 |  Commercial AWS Regions ap-northeast-2  | 
| ap-northeast-3 |  Commercial AWS Regions ap-northeast-3  | 
| ap-south-1 |  Commercial AWS Regions ap-south-1  | 
| ap-south-2 |  Commercial AWS Regions ap-south-2  | 
| ap-southeast-1 |  Commercial AWS Regions ap-southeast-1  | 
| ap-southeast-2 |  Commercial AWS Regions ap-southeast-2  | 
| ap-southeast-3 |  Commercial AWS Regions ap-southeast-3  | 
| ap-southeast-4 |  Commercial AWS Regions ap-southeast-4  | 
| ca-central-1 |  Commercial AWS Regions ca-central-1  | 
| eu-central-1 |  Commercial AWS Regions eu-central-1  | 
| eu-central-2 |  Commercial AWS Regions eu-central-2  | 
| eu-north-1 |  Commercial AWS Regions eu-north-1  | 
| eu-south-1 |  Commercial AWS Regions eu-south-1  | 
| eu-south-2 |  Commercial AWS Regions eu-south-2  | 
| eu-west-1 |  Commercial AWS Regions eu-west-1  | 
| eu-west-2 |  Commercial AWS Regions eu-west-2  | 
| eu-west-3 |  Commercial AWS Regions eu-west-3  | 
| sa-east-1 |  Commercial AWS Regions sa-east-1  | 
| us-east-1 |  Commercial AWS Regions us-east-1  | 
| us-east-2 |  Commercial AWS Regions us-east-2  | 
| us-west-1 |  Commercial AWS Regions us-west-1  | 
| us-west-2 |  Commercial AWS Regions us-west-2  | 

### US Amazon Nova 2 Lite
<a name="cross-region-ip-us.amazon.nova-2-lite-v1:0"></a>

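To call the US Amazon Nova 2 Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-2-lite-v1:0
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 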
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| ca-west-1 |  ca-west-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 
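
The profile IDs on this page appear to follow a `<scope>.<provider>.<model>` pattern, where the scope is `global` or `us`. The sketch below splits an ID on that assumption; the pattern is inferred from the IDs listed here rather than from a documented grammar, and the names `ProfileId` and `split_profile_id` are hypothetical.

```python
from typing import NamedTuple

# Scopes observed in the profile IDs of this section (an assumption, not a spec).
KNOWN_SCOPES = {"global", "us"}


class ProfileId(NamedTuple):
    scope: str     # geographic scope prefix, e.g. "us"
    provider: str  # model provider, e.g. "amazon"
    model: str     # remainder of the ID, including any version suffix


def split_profile_id(profile_id: str) -> ProfileId:
    """Split an inference profile ID into scope, provider, and model parts."""
    parts = profile_id.split(".", 2)
    if len(parts) != 3 or parts[0] not in KNOWN_SCOPES:
        raise ValueError(f"unrecognized inference profile ID: {profile_id!r}")
    return ProfileId(*parts)
```

A caller could use the `scope` field to decide which routing table on this page applies to a given ID.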

### US Anthropic Claude 3 Haiku
<a name="cross-region-ip-us.anthropic.claude-3-haiku-20240307-v1:0"></a>

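To call the US Anthropic Claude 3 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-haiku-20240307-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 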
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 
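
A routing table like the one above can be checked in code before a workload is deployed, for example to confirm which source Regions could send traffic to a particular destination Region. The following sketch transcribes the US Anthropic Claude 3 Haiku table into a dictionary; the function names are illustrative.

```python
# Source Region -> destination Regions, transcribed from the table above.
US_CLAUDE_3_HAIKU_ROUTES = {
    "us-east-1": {"us-east-1", "us-west-2"},
    "us-east-2": {"us-east-1", "us-east-2", "us-west-2"},
    "us-west-2": {"us-east-1", "us-west-2"},
}


def destinations_for(source_region: str) -> list[str]:
    """Return the sorted destination Regions for a source Region."""
    try:
        return sorted(US_CLAUDE_3_HAIKU_ROUTES[source_region])
    except KeyError:
        raise ValueError(f"{source_region} is not a source Region for this profile")


def sources_that_may_reach(destination_region: str) -> list[str]:
    """Return the source Regions whose requests may be routed to a destination."""
    return sorted(
        source
        for source, destinations in US_CLAUDE_3_HAIKU_ROUTES.items()
        if destination_region in destinations
    )
```

For this profile, only requests that originate in `us-east-2` can be routed to `us-east-2`, whereas `us-east-1` and `us-west-2` are reachable from every source Region.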

### US Anthropic Claude 3 Opus
<a name="cross-region-ip-us.anthropic.claude-3-opus-20240229-v1:0"></a>

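To call the US Anthropic Claude 3 Opus inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-opus-20240229-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 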
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Anthropic Claude 3 Sonnet
<a name="cross-region-ip-us.anthropic.claude-3-sonnet-20240229-v1:0"></a>

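To call the US Anthropic Claude 3 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-sonnet-20240229-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 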
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Anthropic Claude 3.5 Haiku
<a name="cross-region-ip-us.anthropic.claude-3-5-haiku-20241022-v1:0"></a>

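To call the US Anthropic Claude 3.5 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-5-haiku-20241022-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 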
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude 3.5 Sonnet
<a name="cross-region-ip-us.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

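To call the US Anthropic Claude 3.5 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-5-sonnet-20240620-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 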
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Anthropic Claude 3.5 Sonnet v2
<a name="cross-region-ip-us.anthropic.claude-3-5-sonnet-20241022-v2:0"></a>

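To call the US Anthropic Claude 3.5 Sonnet v2 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-5-sonnet-20241022-v2:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 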
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude 3.7 Sonnet
<a name="cross-region-ip-us.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

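To call the US Anthropic Claude 3.7 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-3-7-sonnet-20250219-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 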
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Haiku 4.5
<a name="cross-region-ip-us.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

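To call the US Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 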
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Opus 4.5
<a name="cross-region-ip-us.anthropic.claude-opus-4-5-20251101-v1:0"></a>

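To call the US Anthropic Claude Opus 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-5-20251101-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 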
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Opus 4.6
<a name="cross-region-ip-us.anthropic.claude-opus-4-6-v1"></a>

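To call the US Anthropic Claude Opus 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-6-v1
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 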
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| ca-west-1 |  ca-west-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 
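
Because the geography-specific tables on this page list destination Regions as space-separated codes in a single cell, they can be converted to a lookup structure mechanically. The sketch below parses rows in that layout; it assumes the two-column format shown above and a simple Region-code pattern, and the function name is illustrative. It is not intended for the global tables, whose destination cells contain descriptive text.

```python
import re

# Matches Region codes such as "us-east-1" or "ca-central-1" (assumed pattern).
REGION_CODE = re.compile(r"[a-z]{2}-[a-z]+-\d+")


def parse_routing_table(markdown: str) -> dict[str, list[str]]:
    """Map each source Region to its destination Regions from table rows."""
    routes: dict[str, list[str]] = {}
    for line in markdown.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the header row, the separator row, and anything malformed.
        if len(cells) != 2 or not REGION_CODE.fullmatch(cells[0]):
            continue
        routes[cells[0]] = cells[1].split()
    return routes


# Two rows copied from the US Anthropic Claude Opus 4.6 table above.
SAMPLE = """
| Source Region | Destination Regions |
| --- | --- |
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  |
| us-east-1 |  us-east-1 us-east-2 us-west-2  |
"""
```

Parsing `SAMPLE` yields a dictionary keyed by source Region, which could then feed a residency or latency check.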

### US Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-us.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

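To call the US Anthropic Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 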
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-us.anthropic.claude-sonnet-4-6"></a>

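To call the US Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 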
| --- | --- | 
| ca-central-1 |  ca-central-1 us-east-1 us-east-2 us-west-2  | 
| ca-west-1 |  ca-west-1 us-east-1 us-east-2 us-west-2  | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Claude Opus 4
<a name="cross-region-ip-us.anthropic.claude-opus-4-20250514-v1:0"></a>

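To call the US Claude Opus 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 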
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Claude Opus 4.1
<a name="cross-region-ip-us.anthropic.claude-opus-4-1-20250805-v1:0"></a>

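To call the US Claude Opus 4.1 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-opus-4-1-20250805-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 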
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Claude Sonnet 4
<a name="cross-region-ip-us.anthropic.claude-sonnet-4-20250514-v1:0"></a>

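To call the US Claude Sonnet 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.anthropic.claude-sonnet-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 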
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Cohere Embed v4
<a name="cross-region-ip-us.cohere.embed-v4:0"></a>

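To call the US Cohere Embed v4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.cohere.embed-v4:0
```

For more information about inference parameters for this model, see [Link](model-parameters-embed.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 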
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US DeepSeek-R1
<a name="cross-region-ip-us.deepseek.r1-v1:0"></a>

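To call the US DeepSeek-R1 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.deepseek.r1-v1:0
```

For more information about inference parameters for this model, see [Link](https://www.deepseek.com/).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 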
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Llama 4 Maverick 17B Instruct
<a name="cross-region-ip-us.meta.llama4-maverick-17b-instruct-v1:0"></a>

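To call the US Llama 4 Maverick 17B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama4-maverick-17b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 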
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Llama 4 Scout 17B Instruct
<a name="cross-region-ip-us.meta.llama4-scout-17b-instruct-v1:0"></a>

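To call the US Llama 4 Scout 17B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama4-scout-17b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 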
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.1 70B Instruct
<a name="cross-region-ip-us.meta.llama3-1-70b-instruct-v1:0"></a>

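To call the US Meta Llama 3.1 70B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-1-70b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 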
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.1 8B Instruct
<a name="cross-region-ip-us.meta.llama3-1-8b-instruct-v1:0"></a>

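To call the US Meta Llama 3.1 8B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-1-8b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 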
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.1 Instruct 405B
<a name="cross-region-ip-us.meta.llama3-1-405b-instruct-v1:0"></a>

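To call the US Meta Llama 3.1 Instruct 405B inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-1-405b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 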
| --- | --- | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 

### US Meta Llama 3.2 11B Instruct
<a name="cross-region-ip-us.meta.llama3-2-11b-instruct-v1:0"></a>

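To call the US Meta Llama 3.2 11B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-11b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 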
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.2 1B Instruct
<a name="cross-region-ip-us.meta.llama3-2-1b-instruct-v1:0"></a>

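To call the US Meta Llama 3.2 1B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-1b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 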
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.2 3B Instruct
<a name="cross-region-ip-us.meta.llama3-2-3b-instruct-v1:0"></a>

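To call the US Meta Llama 3.2 3B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-3b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 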
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.2 90B Instruct
<a name="cross-region-ip-us.meta.llama3-2-90b-instruct-v1:0"></a>

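To call the US Meta Llama 3.2 90B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-2-90b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 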
| --- | --- | 
| us-east-1 |  us-east-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-west-2  | 

### US Meta Llama 3.3 70B Instruct
<a name="cross-region-ip-us.meta.llama3-3-70b-instruct-v1:0"></a>

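To call the US Meta Llama 3.3 70B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
us.meta.llama3-3-70b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 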
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Mistral Pixtral Large 25.02
<a name="cross-region-ip-us.mistral.pixtral-large-2502-v1:0"></a>

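To call the US Mistral Pixtral Large 25.02 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.mistral.pixtral-large-2502-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-mistral.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 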
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Lite
<a name="cross-region-ip-us.amazon.nova-lite-v1:0"></a>

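To call the US Nova Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-lite-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 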
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Micro
<a name="cross-region-ip-us.amazon.nova-micro-v1:0"></a>

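To call the US Nova Micro inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-micro-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 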
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Premier
<a name="cross-region-ip-us.amazon.nova-premier-v1:0"></a>

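To call the US Nova Premier inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-premier-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 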
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Nova Pro
<a name="cross-region-ip-us.amazon.nova-pro-v1:0"></a>

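To call the US Nova Pro inference profile, specify the following inference profile ID in one of the source Regions:

```
us.amazon.nova-pro-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 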
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Pegasus v1.2
<a name="cross-region-ip-us.twelvelabs.pegasus-1-2-v1:0"></a>

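To call the US Pegasus v1.2 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.twelvelabs.pegasus-1-2-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-pegasus.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 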
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Conservative Upscale
<a name="cross-region-ip-us.stability.stable-conservative-upscale-v1:0"></a>

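To call the US Stable Image Conservative Upscale inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-conservative-upscale-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 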
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Control Sketch
<a name="cross-region-ip-us.stability.stable-image-control-sketch-v1:0"></a>

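To call the US Stable Image Control Sketch inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-control-sketch-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 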
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Control Structure
<a name="cross-region-ip-us.stability.stable-image-control-structure-v1:0"></a>

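To call the US Stable Image Control Structure inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-control-structure-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 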
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Creative Upscale
<a name="cross-region-ip-us.stability.stable-creative-upscale-v1:0"></a>

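To call the US Stable Image Creative Upscale inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-creative-upscale-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 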
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Erase Object
<a name="cross-region-ip-us.stability.stable-image-erase-object-v1:0"></a>

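To call the US Stable Image Erase Object inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-erase-object-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 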
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Fast Upscale
<a name="cross-region-ip-us.stability.stable-fast-upscale-v1:0"></a>

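To call the US Stable Image Fast Upscale inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-fast-upscale-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 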
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Inpaint
<a name="cross-region-ip-us.stability.stable-image-inpaint-v1:0"></a>

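To call the US Stable Image Inpaint inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-inpaint-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 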
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Outpaint
<a name="cross-region-ip-us.stability.stable-outpaint-v1:0"></a>

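To call the US Stable Image Outpaint inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-outpaint-v1:0
```

For more information about inference parameters for this model, see [Link](stable-image-services.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 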
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Remove Background
<a name="cross-region-ip-us.stability.stable-image-remove-background-v1:0"></a>

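To call the US Stable Image Remove Background inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-remove-background-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which the requests can be routed:


| Source Region | Destination Regions | 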
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Search and Recolor
<a name="cross-region-ip-us.stability.stable-image-search-recolor-v1:0"></a>

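To invoke the US Stable Image Search and Recolor inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-search-recolor-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |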
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Search and Replace
<a name="cross-region-ip-us.stability.stable-image-search-replace-v1:0"></a>

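To invoke the US Stable Image Search and Replace inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-search-replace-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |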
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Style Guide
<a name="cross-region-ip-us.stability.stable-image-style-guide-v1:0"></a>

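To invoke the US Stable Image Style Guide inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-image-style-guide-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |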
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US Stable Image Style Transfer
<a name="cross-region-ip-us.stability.stable-style-transfer-v1:0"></a>

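To invoke the US Stable Image Style Transfer inference profile, specify the following inference profile ID in one of the source Regions:

```
us.stability.stable-style-transfer-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-stability-diffusion.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |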
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

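### US TwelveLabs Marengo Embed 3.0
<a name="cross-region-ip-us.twelvelabs.marengo-embed-3-0-v1:0"></a>

To invoke the US TwelveLabs Marengo Embed 3.0 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.twelvelabs.marengo-embed-3-0-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |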
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 

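### US TwelveLabs Marengo Embed v2.7
<a name="cross-region-ip-us.twelvelabs.marengo-embed-2-7-v1:0"></a>

To invoke the US TwelveLabs Marengo Embed v2.7 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.twelvelabs.marengo-embed-2-7-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |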
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 

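### US Writer Palmyra X4
<a name="cross-region-ip-us.writer.palmyra-x4-v1:0"></a>

To invoke the US Writer Palmyra X4 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.writer.palmyra-x4-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-writer-palmyra.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |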
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

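### US Writer Palmyra X5
<a name="cross-region-ip-us.writer.palmyra-x5-v1:0"></a>

To invoke the US Writer Palmyra X5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us.writer.palmyra-x5-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-writer-palmyra.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |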
| --- | --- | 
| us-east-1 |  us-east-1 us-east-2 us-west-2  | 
| us-east-2 |  us-east-1 us-east-2 us-west-2  | 
| us-west-1 |  us-east-1 us-east-2 us-west-1 us-west-2  | 
| us-west-2 |  us-east-1 us-east-2 us-west-2  | 

### US-GOV Claude 3 Haiku
<a name="cross-region-ip-us-gov.anthropic.claude-3-haiku-20240307-v1:0"></a>

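To invoke the US-GOV Claude 3 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
us-gov.anthropic.claude-3-haiku-20240307-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |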
| --- | --- | 
| us-gov-east-1 |  us-gov-east-1 us-gov-west-1  | 

### US-GOV Claude 3.5 Sonnet
<a name="cross-region-ip-us-gov.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

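To invoke the US-GOV Claude 3.5 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us-gov.anthropic.claude-3-5-sonnet-20240620-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |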
| --- | --- | 
| us-gov-east-1 |  us-gov-east-1 us-gov-west-1  | 

### US-GOV Claude 3.7 Sonnet
<a name="cross-region-ip-us-gov.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

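To invoke the US-GOV Claude 3.7 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
us-gov.anthropic.claude-3-7-sonnet-20250219-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |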
| --- | --- | 
| us-gov-east-1 |  us-gov-east-1 us-gov-west-1  | 

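### US-GOV Claude Sonnet 4.5
<a name="cross-region-ip-us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

To invoke the US-GOV Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |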
| --- | --- | 
| us-gov-east-1 |  us-gov-west-1  | 
| us-gov-west-1 |  us-gov-west-1  | 
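In the table above, `us-gov-east-1` is a source Region but does not appear among its own destination Regions, so requests sent from it are always routed elsewhere. The following is a minimal sketch for spotting such rows. The `ROUTES` dictionary is copied by hand from the table above, and the function names are hypothetical helpers, not part of any AWS SDK.

```python
# Routing data copied from the US-GOV Claude Sonnet 4.5 table above.
ROUTES = {
    "us-gov-east-1": ["us-gov-west-1"],
    "us-gov-west-1": ["us-gov-west-1"],
}


def can_serve_locally(source_region: str, routes: dict) -> bool:
    """Return True if the source Region is also one of its destinations."""
    return source_region in routes[source_region]


def remote_only_sources(routes: dict) -> list:
    """List source Regions whose requests are always routed to another Region."""
    return sorted(r for r in routes if not can_serve_locally(r, routes))


if __name__ == "__main__":
    print(remote_only_sources(ROUTES))
```

A check like this may help when you review latency or data-location expectations for a workload, since a remote-only source Region implies that every request leaves the Region it was sent from.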

### APAC Anthropic Claude 3 Haiku
<a name="cross-region-ip-apac.anthropic.claude-3-haiku-20240307-v1:0"></a>

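To invoke the APAC Anthropic Claude 3 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.anthropic.claude-3-haiku-20240307-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |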
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3 Sonnet
<a name="cross-region-ip-apac.anthropic.claude-3-sonnet-20240229-v1:0"></a>

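To invoke the APAC Anthropic Claude 3 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.anthropic.claude-3-sonnet-20240229-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |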
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3.5 Sonnet
<a name="cross-region-ip-apac.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

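To invoke the APAC Anthropic Claude 3.5 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.anthropic.claude-3-5-sonnet-20240620-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |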
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3.5 Sonnet v2
<a name="cross-region-ip-apac.anthropic.claude-3-5-sonnet-20241022-v2:0"></a>

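To invoke the APAC Anthropic Claude 3.5 Sonnet v2 inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.anthropic.claude-3-5-sonnet-20241022-v2:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |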
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 

### APAC Anthropic Claude 3.7 Sonnet
<a name="cross-region-ip-apac.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

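To invoke the APAC Anthropic Claude 3.7 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.anthropic.claude-3-7-sonnet-20250219-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |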
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-south-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2  | 

### APAC Claude Sonnet 4
<a name="cross-region-ip-apac.anthropic.claude-sonnet-4-20250514-v1:0"></a>

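To invoke the APAC Claude Sonnet 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.anthropic.claude-sonnet-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |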
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-south-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-4 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 
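Because the destination lists differ by source Region, it can be useful to compute the full set of Regions that requests might reach from the source Regions you actually use. The following is a minimal sketch that parses rows in the table format used on this page and takes the union of their destinations. The two sample rows are copied from the APAC Claude Sonnet 4 table above. The helper names and the Region-code pattern are assumptions made for illustration.

```python
import re

# Two rows copied from the APAC Claude Sonnet 4 table above.
TABLE = """
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  |
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  |
"""

# Assumed shape of a Region code, for example "ap-northeast-1" or "us-gov-east-1".
REGION = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def parse_routing_rows(markdown: str) -> dict:
    """Map each source Region to its list of destination Regions."""
    routes = {}
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip blank lines, header rows, and separator rows.
        if len(cells) != 2 or not REGION.fullmatch(cells[0]):
            continue
        routes[cells[0]] = cells[1].split()
    return routes


def reachable_regions(routes: dict, sources: list) -> list:
    """Return the sorted union of destinations for the given source Regions."""
    reachable = set()
    for source in sources:
        reachable.update(routes[source])
    return sorted(reachable)


if __name__ == "__main__":
    routes = parse_routing_rows(TABLE)
    print(reachable_regions(routes, ["ap-northeast-1", "me-central-1"]))
```

The resulting list is only as current as the rows you paste in, so re-run it against the latest table before relying on it for access or compliance reviews.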

### APAC Nova Lite
<a name="cross-region-ip-apac.amazon.nova-lite-v1:0"></a>

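To invoke the APAC Nova Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.amazon.nova-lite-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |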
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-4 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Nova Micro
<a name="cross-region-ip-apac.amazon.nova-micro-v1:0"></a>

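To invoke the APAC Nova Micro inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.amazon.nova-micro-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |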
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Nova Pro
<a name="cross-region-ip-apac.amazon.nova-pro-v1:0"></a>

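To invoke the APAC Nova Pro inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.amazon.nova-pro-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |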
| --- | --- | 
| ap-east-2 |  ap-east-2 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-south-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-southeast-1 ap-southeast-2  | 
| ap-southeast-3 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 
| ap-southeast-4 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 
| ap-southeast-5 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-5  | 
| ap-southeast-7 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 ap-southeast-7  | 
| me-central-1 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4 me-central-1  | 

### APAC Pegasus v1.2
<a name="cross-region-ip-apac.twelvelabs.pegasus-1-2-v1:0"></a>

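To invoke the APAC Pegasus v1.2 inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.twelvelabs.pegasus-1-2-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-pegasus.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |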
| --- | --- | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-4  | 

### APAC TwelveLabs Marengo Embed v2.7
<a name="cross-region-ip-apac.twelvelabs.marengo-embed-2-7-v1:0"></a>

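To invoke the APAC TwelveLabs Marengo Embed v2.7 inference profile, specify the following inference profile ID in one of the source Regions:

```
apac.twelvelabs.marengo-embed-2-7-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |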
| --- | --- | 
| ap-northeast-2 |  ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-south-1 ap-south-2 ap-southeast-1 ap-southeast-2 ap-southeast-3 ap-southeast-4  | 

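### AU Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-au.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

To invoke the AU Anthropic Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
au.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |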
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 

### AU Anthropic Claude Haiku 4.5
<a name="cross-region-ip-au.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

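To invoke the AU Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
au.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |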
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 
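To use an inference profile, you pass its ID where a model ID would normally go and send the request from one of the source Regions. The following is a minimal sketch using the Converse operation of the AWS SDK for Python (Boto3). It assumes that Boto3 is installed, that credentials are configured, and that your account has access to the model. The helper names `build_converse_request` and `invoke_profile` are hypothetical.

```python
def build_converse_request(profile_id: str, prompt: str) -> dict:
    """Build keyword arguments for the Converse operation.

    The inference profile ID is passed as modelId.
    """
    return {
        "modelId": profile_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }


def invoke_profile(profile_id: str, prompt: str, source_region: str) -> str:
    """Send one prompt through an inference profile and return the reply text."""
    # Imported here so the request builder above works without Boto3 installed.
    import boto3

    # The client must be created in one of the profile's source Regions.
    client = boto3.client("bedrock-runtime", region_name=source_region)
    response = client.converse(**build_converse_request(profile_id, prompt))
    return response["output"]["message"]["content"][0]["text"]


if __name__ == "__main__":
    # Builds the request only; no AWS call is made here.
    request = build_converse_request(
        "au.anthropic.claude-haiku-4-5-20251001-v1:0", "Hello"
    )
    print(request["modelId"])
```

For example, calling `invoke_profile("au.anthropic.claude-haiku-4-5-20251001-v1:0", "Hello", "ap-southeast-2")` would send the request from `ap-southeast-2`, and the service may route it to either destination Region listed in the table above.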

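### AU Anthropic Claude Opus 4.6
<a name="cross-region-ip-au.anthropic.claude-opus-4-6-v1"></a>

To invoke the AU Anthropic Claude Opus 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
au.anthropic.claude-opus-4-6-v1
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |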
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 

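### AU Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-au.anthropic.claude-sonnet-4-6"></a>

To invoke the AU Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
au.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |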
| --- | --- | 
| ap-southeast-2 |  ap-southeast-2 ap-southeast-4  | 
| ap-southeast-4 |  ap-southeast-2 ap-southeast-4  | 

### CA Nova Lite
<a name="cross-region-ip-ca.amazon.nova-lite-v1:0"></a>

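To invoke the CA Nova Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
ca.amazon.nova-lite-v1:0
```

For more information about inference parameters for this model, see [Link](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |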
| --- | --- | 
| ca-central-1 |  ca-central-1 ca-west-1  | 
| ca-west-1 |  ca-central-1 ca-west-1  | 

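### EU Amazon Nova 2 Lite
<a name="cross-region-ip-eu.amazon.nova-2-lite-v1:0"></a>

To invoke the EU Amazon Nova 2 Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-2-lite-v1:0
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |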
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3 Haiku
<a name="cross-region-ip-eu.anthropic.claude-3-haiku-20240307-v1:0"></a>

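To invoke the EU Anthropic Claude 3 Haiku inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-3-haiku-20240307-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |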
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3 Sonnet
<a name="cross-region-ip-eu.anthropic.claude-3-sonnet-20240229-v1:0"></a>

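To invoke the EU Anthropic Claude 3 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-3-sonnet-20240229-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |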
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3.5 Sonnet
<a name="cross-region-ip-eu.anthropic.claude-3-5-sonnet-20240620-v1:0"></a>

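To invoke the EU Anthropic Claude 3.5 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-3-5-sonnet-20240620-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |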
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude 3.7 Sonnet
<a name="cross-region-ip-eu.anthropic.claude-3-7-sonnet-20250219-v1:0"></a>

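To invoke the EU Anthropic Claude 3.7 Sonnet inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-3-7-sonnet-20250219-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |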
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Haiku 4.5
<a name="cross-region-ip-eu.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

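To invoke the EU Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |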
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

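### EU Anthropic Claude Opus 4.5
<a name="cross-region-ip-eu.anthropic.claude-opus-4-5-20251101-v1:0"></a>

To invoke the EU Anthropic Claude Opus 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-opus-4-5-20251101-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |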
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

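### EU Anthropic Claude Opus 4.6
<a name="cross-region-ip-eu.anthropic.claude-opus-4-6-v1"></a>

To invoke the EU Anthropic Claude Opus 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-opus-4-6-v1
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |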
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-eu.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

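To invoke the EU Anthropic Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |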
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

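### EU Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-eu.anthropic.claude-sonnet-4-6"></a>

To invoke the EU Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |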
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Claude Sonnet 4
<a name="cross-region-ip-eu.anthropic.claude-sonnet-4-20250514-v1:0"></a>

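To invoke the EU Claude Sonnet 4 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.anthropic.claude-sonnet-4-20250514-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |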
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3 il-central-1  | 

### EU Cohere Embed v4
<a name="cross-region-ip-eu.cohere.embed-v4:0"></a>

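To invoke the EU Cohere Embed v4 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.cohere.embed-v4:0
```

For more information about inference parameters for this model, see [Link](model-parameters-embed.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |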
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

### EU Meta Llama 3.2 1B Instruct
<a name="cross-region-ip-eu.meta.llama3-2-1b-instruct-v1:0"></a>

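To invoke the EU Meta Llama 3.2 1B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.meta.llama3-2-1b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |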
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Meta Llama 3.2 3B Instruct
<a name="cross-region-ip-eu.meta.llama3-2-3b-instruct-v1:0"></a>

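To invoke the EU Meta Llama 3.2 3B Instruct inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.meta.llama3-2-3b-instruct-v1:0
```

For more information about inference parameters for this model, see [Link](model-parameters-meta.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:


| Source Region | Destination Regions |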
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-west-1 eu-west-3  | 

### EU Mistral Pixtral Large 25.02
<a name="cross-region-ip-eu.mistral.pixtral-large-2502-v1:0"></a>

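To call the EU Mistral Pixtral Large 25.02 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.mistral.pixtral-large-2502-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](model-parameters-mistral.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 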
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 

### EU Nova Lite
<a name="cross-region-ip-eu.amazon.nova-lite-v1:0"></a>

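To call the EU Nova Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-lite-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 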
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3 il-central-1  | 

### EU Nova Micro
<a name="cross-region-ip-eu.amazon.nova-micro-v1:0"></a>

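To call the EU Nova Micro inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-micro-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 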
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3 il-central-1  | 

### EU Nova Pro
<a name="cross-region-ip-eu.amazon.nova-pro-v1:0"></a>

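To call the EU Nova Pro inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.amazon.nova-pro-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](https://docs.aws.amazon.com/nova/latest/userguide/getting-started-schema.html).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 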
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-west-1 eu-west-3  | 
| il-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-west-1 eu-west-3 il-central-1  | 

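### EU TwelveLabs Marengo Embed 3.0
<a name="cross-region-ip-eu.twelvelabs.marengo-embed-3-0-v1:0"></a>

To call the EU TwelveLabs Marengo Embed 3.0 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.twelvelabs.marengo-embed-3-0-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 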
| --- | --- | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

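### EU TwelveLabs Marengo Embed v2.7
<a name="cross-region-ip-eu.twelvelabs.marengo-embed-2-7-v1:0"></a>

To call the EU TwelveLabs Marengo Embed v2.7 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.twelvelabs.marengo-embed-2-7-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](model-parameters-marengo.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 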
| --- | --- | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

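### EU TwelveLabs Pegasus v1.2
<a name="cross-region-ip-eu.twelvelabs.pegasus-1-2-v1:0"></a>

To call the EU TwelveLabs Pegasus v1.2 inference profile, specify the following inference profile ID in one of the source Regions:

```
eu.twelvelabs.pegasus-1-2-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](model-parameters-pegasus.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 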
| --- | --- | 
| eu-central-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-central-2 |  eu-central-1 eu-central-2 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-north-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-south-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-1 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 
| eu-west-2 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-2 eu-west-3  | 
| eu-west-3 |  eu-central-1 eu-north-1 eu-south-1 eu-south-2 eu-west-1 eu-west-3  | 

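### JP Amazon Nova 2 Lite
<a name="cross-region-ip-jp.amazon.nova-2-lite-v1:0"></a>

To call the JP Amazon Nova 2 Lite inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.amazon.nova-2-lite-v1:0
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 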
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 

### JP Anthropic Claude Haiku 4.5
<a name="cross-region-ip-jp.anthropic.claude-haiku-4-5-20251001-v1:0"></a>

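To call the JP Anthropic Claude Haiku 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.anthropic.claude-haiku-4-5-20251001-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 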
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-3  | 

### JP Anthropic Claude Sonnet 4.5
<a name="cross-region-ip-jp.anthropic.claude-sonnet-4-5-20250929-v1:0"></a>

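To call the JP Anthropic Claude Sonnet 4.5 inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.anthropic.claude-sonnet-4-5-20250929-v1:0
```

For more information about inference parameters for this model, see the [inference parameter reference](model-parameters-claude.md).

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 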
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-3  | 

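### JP Anthropic Claude Sonnet 4.6
<a name="cross-region-ip-jp.anthropic.claude-sonnet-4-6"></a>

To call the JP Anthropic Claude Sonnet 4.6 inference profile, specify the following inference profile ID in one of the source Regions:

```
jp.anthropic.claude-sonnet-4-6
```

The following table shows the source Regions from which you can call the inference profile and the destination Regions to which requests can be routed:

| Source Region | Destination Regions | 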
| --- | --- | 
| ap-northeast-1 |  ap-northeast-1 ap-northeast-3  | 
| ap-northeast-3 |  ap-northeast-1 ap-northeast-3  | 

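## Supported Regions and models for application inference profiles
<a name="inference-profiles-support-user"></a>

You can create application inference profiles for all models in the following AWS Regions: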
+ ap-northeast-1
+ ap-northeast-2
+ ap-south-1
+ ap-southeast-1
+ ap-southeast-2
+ ca-central-1
+ eu-central-1
+ eu-west-1
+ eu-west-2
+ eu-west-3
+ sa-east-1
+ us-east-1
+ us-east-2
+ us-gov-east-1
+ us-west-2

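You can create application inference profiles from all models and inference profiles that Amazon Bedrock supports. For more information about supported models, see [Supported foundation models in Amazon Bedrock](models-supported.md).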

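# Prerequisites for inference profiles
<a name="inference-profiles-prereq"></a>

Before you use an inference profile, check that you meet the following prerequisites:
+ Your role has permissions to call the inference profile API actions. If your role has the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) AWS managed policy attached, you can skip this step. Otherwise, do the following:

  1. Follow the steps in [Creating IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create.html) to create the following policy, which allows a role to use all foundation models and inference profiles, perform inference profile-related actions, and run model inference.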

------
#### [ JSON ]


     ```
     {
         "Version":"2012-10-17",		 	 	 
         "Statement": [
             {
                 "Effect": "Allow",
                 "Action": [
                     "bedrock:InvokeModel*",
                     "bedrock:CreateInferenceProfile"
                 ],
                 "Resource": [
                     "arn:aws:bedrock:*::foundation-model/*",
                     "arn:aws:bedrock:*:*:inference-profile/*",
                     "arn:aws:bedrock:*:*:application-inference-profile/*"
                 ]
             },
             {
                 "Effect": "Allow",
                 "Action": [
                     "bedrock:GetInferenceProfile",
                     "bedrock:ListInferenceProfiles",
                     "bedrock:DeleteInferenceProfile",
                     "bedrock:TagResource",
                     "bedrock:UntagResource",
                     "bedrock:ListTagsForResource"
                 ],
                 "Resource": [
                     "arn:aws:bedrock:*:*:inference-profile/*",
                     "arn:aws:bedrock:*:*:application-inference-profile/*"
                 ]
             }
         ]
     }
     ```

------

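     (Optional) You can restrict the role's access in the following ways:
     + To restrict the API actions that the role can perform, modify the list in the `Action` field so that it contains only the [API operations](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) that you want to allow.
     + To restrict the role's access to specific inference profiles, modify the `Resource` list so that it contains only the [inference profiles](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) and foundation models that you want to allow. System-defined inference profiles begin with `inference-profile`, and application inference profiles begin with `application-inference-profile`.
**Important**  
When you specify an inference profile in the `Resource` field of the first statement, you must also specify the foundation model in each Region associated with that profile.
     + To restrict user access so that users can invoke a foundation model only through an inference profile, add a `Condition` field and use the `bedrock:InferenceProfileArn` [condition key](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-policy-keys). Specify the inference profile that you want to filter access on. You can include this condition in a statement that is scoped to the `foundation-model` resources.
     + For example, you can attach the following policy to a role so that it can invoke the Anthropic Claude 3 Haiku model only through the US Anthropic Claude 3 Haiku inference profile in account *111122223333* in us-west-2: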

------
#### [ JSON ]


       ```
       {
           "Version":"2012-10-17",		 	 	 
           "Statement": [
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0"
                   ]
               },
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
                       "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
                   ],
                   "Condition": {
                       "StringLike": {
                           "bedrock:InferenceProfileArn": "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0"
                       }
                   }
               }
           ]
       }
       ```

------
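     + As another example, you can attach the following policy to a role so that it can invoke the Anthropic Claude Sonnet 4 model only through the global Claude Sonnet 4 inference profile in account 111122223333 in us-east-2 (US East (Ohio)):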

------
#### [ JSON ]


       ```
       {
           "Version":"2012-10-17",		 	 	 
           "Statement": [
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-east-2:111122223333:inference-profile/global.anthropic.claude-sonnet-4-20250514-v1:0"
                   ]
               },
               {
                   "Effect": "Allow",
                   "Action": [
                       "bedrock:InvokeModel*"
                   ],
                   "Resource": [
                       "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
                       "arn:aws:bedrock:::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0"
                   ],
                   "Condition": {
                       "StringLike": {
                           "bedrock:InferenceProfileArn": "arn:aws:bedrock:us-east-2:111122223333:inference-profile/global.anthropic.claude-sonnet-4-20250514-v1:0"
                       }
                   }
               }
           ]
       }
       ```

------
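     + You can also restrict use of the global Claude Sonnet 4 inference profile by adding an explicit Deny statement with a `StringEquals` condition that checks whether the request context key `aws:RequestedRegion` equals `unspecified`. When the `StringEquals` condition matches, the Deny overrides any Allow and blocks global routing of inference requests.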

       ```
       {
           "Effect": "Deny",
           "Action": [
               "bedrock:InvokeModel*"
           ],
           "Resource": "*",
           "Condition": {
               "StringEquals": {
                   "aws:RequestedRegion": "unspecified"
               }
           }
       },
       ```

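  1. Follow the steps in [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to attach the policy to a role, which grants the role permissions to view and use all inference profiles.
+ You have requested access to the model defined in the inference profile that you want to use, as well as to the Region from which you want to call the inference profile.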
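
The restricted policies above follow one pattern: allow the inference profile itself, then allow the foundation model in every Region that the profile can route to, conditioned on the profile ARN. The following Python sketch generates such a policy document. The function name `build_profile_only_policy` and its parameters are hypothetical helpers introduced for illustration; the sketch covers geography-scoped profiles only and does not produce the region-less ARN used in the global profile example.

```python
import json


def build_profile_only_policy(profile_arn, model_id, routed_regions):
    """Build an IAM policy that allows invoking a model only through one inference profile.

    profile_arn    -- ARN of the inference profile (hypothetical input)
    model_id       -- foundation model ID behind the profile
    routed_regions -- every Region the profile can route requests to
    """
    # The foundation model must be listed once per Region associated with the profile.
    model_arns = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for region in routed_regions
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel*"],
                "Resource": [profile_arn],
            },
            {
                "Effect": "Allow",
                "Action": ["bedrock:InvokeModel*"],
                "Resource": model_arns,
                # Only requests that arrive through the profile match this statement.
                "Condition": {
                    "StringLike": {"bedrock:InferenceProfileArn": profile_arn}
                },
            },
        ],
    }


policy = build_profile_only_policy(
    "arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0",
    "anthropic.claude-3-haiku-20240307-v1:0",
    ["us-east-1", "us-west-2"],
)
print(json.dumps(policy, indent=4))
```

With these inputs the output matches the Claude 3 Haiku example policy. Take the Region list from the destination Regions table for the profile that you use.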

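# Create an application inference profile
<a name="inference-profiles-create"></a>

You can create an application inference profile that covers one or more Regions to track usage and costs when you invoke a model.
+ To create an application inference profile for one Region, specify a foundation model. Usage and costs are tracked for requests made to that Region with that model.
+ To create an application inference profile for multiple Regions, specify a cross-Region (system-defined) inference profile. The application inference profile routes requests to the Regions defined in the cross-Region (system-defined) inference profile that you choose. Usage and costs are tracked for requests made to the Regions in the inference profile.

Currently, you can create an inference profile only by using the Amazon Bedrock API.

To create an inference profile, send a [CreateInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateInferenceProfile.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).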

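The following fields are required:

| Field | Use case | 
| --- | --- | 
| inferenceProfileName | Specifies a name for the inference profile. | 
| modelSource | Specifies the foundation model or cross-Region (system-defined) inference profile that defines the model and the Regions for which to track costs and usage. | 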

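The following fields are optional:

| Field | Use case | 
| --- | --- | 
| description | Provides a description for the inference profile. | 
| tags | Attaches tags to the inference profile. For more information, see [Tagging Amazon Bedrock resources](tagging.md) and [Organizing and tracking costs using AWS cost allocation tags](https://docs.aws.amazon.com//awsaccountbilling/latest/aboutv2/cost-alloc-tags.html). | 
| clientRequestToken | Ensures that the API request completes only once. For more information, see [Ensuring idempotency](https://docs.aws.amazon.com/ec2/latest/devguide/ec2-api-idempotency.html). | 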

The response returns an `inferenceProfileArn` that you can use in other inference profile-related actions, in model invocation, and with Amazon Bedrock resources.
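
As a minimal sketch of this request using the AWS SDK for Python (Boto3), the following code builds the request fields described above and sends them with `create_inference_profile`. The helper names, the profile name, the tag values, and the default Region are assumptions made for illustration. The `copyFrom` key inside `modelSource` and the `key`/`value` tag shape follow the API reference, not this page.

```python
import uuid


def build_create_profile_request(name, model_source_arn, description=None, tags=None):
    """Build keyword arguments for a CreateInferenceProfile request."""
    request = {
        "inferenceProfileName": name,
        # copyFrom takes the ARN of a foundation model (one Region) or of a
        # cross-Region (system-defined) inference profile (multiple Regions).
        "modelSource": {"copyFrom": model_source_arn},
        # Idempotency token so that a retried request is not applied twice.
        "clientRequestToken": str(uuid.uuid4()),
    }
    if description:
        request["description"] = description
    if tags:
        request["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    return request


def create_application_inference_profile(request, region_name="us-west-2"):
    """Send the request and return the ARN of the new application inference profile."""
    import boto3  # imported here so the builder above can be used without the SDK

    bedrock = boto3.client("bedrock", region_name=region_name)
    response = bedrock.create_inference_profile(**request)
    return response["inferenceProfileArn"]
```

For example, passing the ARN `arn:aws:bedrock:us-west-2:111122223333:inference-profile/us.anthropic.claude-3-haiku-20240307-v1:0` as `model_source_arn` would create a multi-Region application inference profile, and `tags={"team": "search"}` (a hypothetical tag) would let you group its costs by that tag.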

# Modify the tags for an application inference profile
<a name="inference-profiles-modify"></a>

After you create an application inference profile, you can still manage its tags through the Amazon Bedrock API. Send a [TagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_TagResource.html) or [UntagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UntagResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and specify the ARN of the application inference profile in the `resourceArn` field. To learn more about tagging, see [Tagging Amazon Bedrock resources](tagging.md).

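# View information about an inference profile
<a name="inference-profiles-view"></a>

You can view information about cross-Region inference profiles and about application inference profiles that you created. To learn how to view information about an inference profile, choose the tab for your preferred method, and then follow the steps: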

------
#### [ Console ]

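**To view information about a cross-Region (system-defined) inference profile**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. In the left navigation pane, choose **Cross-Region inference**. Then, in the **Cross-Region inference** section, choose an inference profile.

1. View the details of the inference profile in the **Inference profile details** section, and view the Regions that the inference profile covers in the **Models** section.

**Note**  
You can't view application inference profiles in the Amazon Bedrock console.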

------
#### [ API ]

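To get information about an inference profile, send a [GetInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetInferenceProfile.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and specify the Amazon Resource Name (ARN) or ID of the inference profile in the `inferenceProfileIdentifier` field.

To list information about the inference profiles that you can use, send a [ListInferenceProfiles](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListInferenceProfiles.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). You can specify the following optional parameters:

| Field | Short description | 
| --- | --- | 
| maxResults | The maximum number of results to return in a response. | 
| nextToken | If there are more results than the number that you specified in the maxResults field, the response returns a nextToken value. To see the next batch of results, send the nextToken value in another request. | 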

------
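
The `maxResults` and `nextToken` parameters described in the API tab combine into a simple pagination loop. The following Python sketch, which assumes the AWS SDK for Python (Boto3), keeps requesting pages until no `nextToken` is returned. The helper name `iter_inference_profiles` is hypothetical, and the `typeEquals` filter shown in the usage comment comes from the API reference, not from this page.

```python
def iter_inference_profiles(list_page, page_size=50, **filters):
    """Yield every inference profile summary, following nextToken until it is absent.

    list_page -- a callable such as bedrock.list_inference_profiles
    filters   -- extra request parameters passed through unchanged
    """
    kwargs = {"maxResults": page_size, **filters}
    while True:
        page = list_page(**kwargs)
        yield from page.get("inferenceProfileSummaries", [])
        token = page.get("nextToken")
        if not token:
            break
        # Send the returned token in the next request to get the next batch.
        kwargs["nextToken"] = token


# Usage (requires boto3 and AWS credentials):
#
#   import boto3
#   bedrock = boto3.client("bedrock", region_name="us-west-2")
#   for summary in iter_inference_profiles(
#       bedrock.list_inference_profiles, typeEquals="APPLICATION"
#   ):
#       print(summary["inferenceProfileId"], summary["status"])
#
#   details = bedrock.get_inference_profile(
#       inferenceProfileIdentifier="us.anthropic.claude-3-haiku-20240307-v1:0"
#   )
#   print(details["inferenceProfileArn"])
```

Passing the list operation in as a callable keeps the loop independent of the client, so you can exercise it with a stub before running it against an account.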

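# Use an inference profile in model invocation
<a name="inference-profiles-use"></a>

You can use a cross-Region inference profile in place of a foundation model to route requests to multiple Regions. To track costs and usage for a model in one or more Regions, you can use an application inference profile. To learn how to use an inference profile when running model inference, choose the tab for your preferred method, and then follow the steps: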

------
#### [ Console ]

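To use an inference profile with a feature that supports it, do the following:

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. Navigate to the page for the feature that you want to use an inference profile with. For example, choose **Chat / Text playground** from the left navigation pane.

1. Choose **Select model** and then choose the model. For example, choose **Amazon** and then **Nova Premier**.

1. Under **Inference**, choose **Inference profiles** from the dropdown menu.

1. Select the inference profile to use (for example, **US Nova Premier**) and then choose **Apply**.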

------
#### [ API ]

You can use an inference profile when running inference from any Region that the profile includes, with the following API operations:
+ [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html) – To use an inference profile in model invocation, follow the steps in [Submit a single prompt with InvokeModel](inference-invoke.md) and specify the Amazon Resource Name (ARN) of the inference profile in the `modelId` field. For an example, see [Use an inference profile in model invocation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html#API_runtime_InvokeModel_Example_5).
+ [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) or [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) – To use an inference profile in model invocation with the Converse API, follow the steps in [Carry out a conversation with the Converse API operations](conversation-inference.md) and specify the ARN of the inference profile in the `modelId` field. For an example, see [Use an inference profile in model invocation](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_Example_5).
+ [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) – To use an inference profile when generating responses from the results of querying a knowledge base, follow the steps in the API tab in [Test your knowledge base with queries and responses](knowledge-base-test.md) and specify the ARN of the inference profile in the `modelArn` field. For more information, see [Use an inference profile to generate a response](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html#API_agent-runtime_RetrieveAndGenerate_Example_3).
+ [CreateEvaluationJob](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateEvaluationJob.html) – To submit an inference profile for model evaluation, follow the steps in the API tab in [Start an automatic model evaluation job in Amazon Bedrock](model-evaluation-jobs-management-create.md) and specify the ARN of the inference profile in the `modelIdentifier` field.
+ [CreatePrompt](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreatePrompt.html) – To use an inference profile when generating a response for a prompt that you create in Prompt management, follow the steps in the API tab in [Create a prompt using Prompt management](prompt-management-create.md) and specify the ARN of the inference profile in the `modelId` field.
+ [CreateFlow](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateFlow.html) – To use an inference profile when generating a response for an inline prompt that you define within a prompt node in a flow, follow the steps in the API tab in [Create and design a flow in Amazon Bedrock](flows-create.md). When you define the [prompt node](flows-nodes.md#flows-nodes-prompt), specify the ARN of the inference profile in the `modelId` field.
+ [CreateDataSource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateDataSource.html) – To use an inference profile when parsing non-textual information in a data source, follow the steps in the API section in [Parsing options for your data source](kb-advanced-parsing.md) and specify the ARN of the inference profile in the `modelArn` field.

**Note**  
If you're using a cross-Region (system-defined) inference profile, you can use either the ARN or the ID of the inference profile.

------
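
As one illustration of the API tab, the following Python sketch passes an inference profile in the `modelId` field of a Converse request using the AWS SDK for Python (Boto3). The helper names, the prompt, the `maxTokens` value, and the default Region are assumptions for illustration only.

```python
def build_converse_request(profile_arn_or_id, prompt, max_tokens=256):
    """Build keyword arguments for a Converse call that targets an inference profile."""
    return {
        # The inference profile ARN (or, for system-defined profiles, its ID)
        # goes where a foundation model ID would otherwise go.
        "modelId": profile_arn_or_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }


def ask(profile_arn_or_id, prompt, region_name="us-west-2"):
    """Send one prompt through the inference profile and return the reply text."""
    import boto3  # imported here so the builder above can be used without the SDK

    runtime = boto3.client("bedrock-runtime", region_name=region_name)
    response = runtime.converse(**build_converse_request(profile_arn_or_id, prompt))
    return response["output"]["message"]["content"][0]["text"]
```

Call `ask` from one of the source Regions of the profile, for example `ask("us.anthropic.claude-3-haiku-20240307-v1:0", "Hello")` from us-west-2. For an application inference profile, pass the `inferenceProfileArn` returned when you created it.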

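# Delete an application inference profile
<a name="inference-profiles-delete"></a>

If you no longer need an application inference profile, you can delete it. You can delete inference profiles only through the Amazon Bedrock API.

To delete an inference profile, send a [DeleteInferenceProfile](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteInferenceProfiles.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and specify the Amazon Resource Name (ARN) or ID of the inference profile to delete in the `inferenceProfileIdentifier` field.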

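# Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock
<a name="prov-throughput"></a>

**Throughput** refers to the number and rate of inputs and outputs that a model processes and returns. You can purchase **Provisioned Throughput** to provision a higher level of throughput for a model at a fixed cost. If you customized a model, you must purchase Provisioned Throughput to be able to use it.

You're billed hourly for a Provisioned Throughput that you purchase. For details about pricing, see [Amazon Bedrock Pricing](https://aws.amazon.com/bedrock/pricing). The price per hour depends on the following factors:

1. The model that you choose (for a custom model, pricing is the same as for the base model that it was customized from).

1. The number of model units (MUs) that you specify for the Provisioned Throughput. An MU delivers a specific throughput level for the specified model. The throughput level of an MU specifies the following:
   + The number of input tokens that an MU can process across all requests within one minute.
   + The number of output tokens that an MU can generate across all requests within one minute.
**Note**  
For more information about what an MU specifies, pricing per MU, and requesting limit increases, contact your AWS account manager.

1. The duration for which you commit to keeping the Provisioned Throughput. The longer the commitment duration, the larger the discount on the hourly price. You can choose from the following levels of commitment:
   + No commitment – You can delete the Provisioned Throughput at any time.
   + 1 month – You can't delete the Provisioned Throughput until the one-month commitment term is over.
   + 6 months – You can't delete the Provisioned Throughput until the six-month commitment term is over.
**Note**  
Billing continues until you delete the Provisioned Throughput.

The following steps outline the process of setting up and using Provisioned Throughput.

1. Determine the number of MUs that you want to purchase for a Provisioned Throughput and the amount of time for which you want to commit to using it.

1. Purchase Provisioned Throughput for a base or custom model.

1. After the provisioned model is created, you can use it to [run model inference](inference.md).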

**Topics**
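+ [Supported Regions and models for Provisioned Throughput](prov-thru-supported.md)
+ [Prerequisites for Provisioned Throughput](prov-thru-prereq.md)
+ [Purchase a Provisioned Throughput for an Amazon Bedrock model](prov-thru-purchase.md)
+ [View information about a Provisioned Throughput](prov-thru-info.md)
+ [Modify a Provisioned Throughput](prov-thru-edit.md)
+ [Use a Provisioned Throughput with an Amazon Bedrock resource](prov-thru-use.md)
+ [Delete a Provisioned Throughput or cancel automatic renewal](prov-thru-delete.md)
+ [Code examples for Provisioned Throughput](prov-thru-code-examples.md)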

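# Supported Regions and models for Provisioned Throughput
<a name="prov-thru-supported"></a>

If you purchase Provisioned Throughput through the Amazon Bedrock API, you must specify a contextual variant of an Amazon Bedrock FM for the model ID.

**Note**  
In AWS GovCloud (US-West), Provisioned Throughput is supported only for custom models with no-commitment purchases. When you purchase Provisioned Throughput for a custom model, use the ID of that model.

The following table shows the models for which you can purchase Provisioned Throughput, the model ID to use when purchasing Provisioned Throughput, and the AWS Regions in which you can purchase Provisioned Throughput for the model.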


| Provider | Model | Model ID | Single-Region model support | 
| --- | --- | --- | --- | 
| Amazon | Nova 2 Lite | amazon.nova-2-lite-v1:0:256k |  us-east-1  | 
| Amazon | Nova Canvas | amazon.nova-canvas-v1:0 |  us-east-1  | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0:24k |  us-east-1  | 
| Amazon | Nova Lite | amazon.nova-lite-v1:0:300k |  us-east-1  | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0:128k |  us-east-1  | 
| Amazon | Nova Micro | amazon.nova-micro-v1:0:24k |  us-east-1  | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0:24k |  us-east-1  | 
| Amazon | Nova Pro | amazon.nova-pro-v1:0:300k |  us-east-1  | 
| Amazon | Titan Embeddings G1 - Text | amazon.titan-embed-text-v1:2:8k |  us-east-1 us-west-2  | 
| Amazon | Titan Image Generator G1 v2 | amazon.titan-image-generator-v2:0 |  us-east-1 us-west-2  | 
| Amazon | Titan Multimodal Embeddings G1 | amazon.titan-embed-image-v1:0 |  ap-south-1 ap-southeast-2 ca-central-1 eu-central-1 eu-west-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:0:100k |  us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:0:18k |  us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:1:18k |  eu-central-1 us-east-1 us-west-2  | 
| Anthropic | Claude | anthropic.claude-v2:1:200k |  eu-central-1 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0:200k |  ap-southeast-2 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0:48k |  ap-south-1 ap-southeast-2 eu-west-1 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0:200k |  ap-southeast-2 eu-west-1 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3 Sonnet | anthropic.claude-3-sonnet-20240229-v1:0:28k |  ap-south-1 ap-southeast-2 eu-west-1 eu-west-3 us-east-1 us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0:18k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0:200k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0:51k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0:18k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0:200k |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0:51k |  us-west-2  | 
| Anthropic | Claude Instant | anthropic.claude-instant-v1:2:100k |  us-east-1 us-west-2  | 
| Cohere | Embed English | cohere.embed-english-v3:0:512 |  ca-central-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | 
| Cohere | Embed Multilingual | cohere.embed-multilingual-v3:0:512 |  ca-central-1 eu-west-2 eu-west-3 sa-east-1 us-east-1 us-west-2  | 
| Meta | Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 1B Instruct | meta.llama3-2-1b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 3B Instruct | meta.llama3-2-3b-instruct-v1:0:128k |  us-west-2  | 
| Meta | Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0:128k |  us-west-2  | 

**Note**  
The following models don't support no-commitment purchases for the base model:  
+ Titan Image Generator G1 V1
+ Titan Image Generator G1 V2

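# Prerequisites for Provisioned Throughput
<a name="prov-thru-prereq"></a>

Before you can purchase and manage Provisioned Throughput, you need to fulfill the following prerequisites:

1. [Request access](model-access.md) to the model or models for which you want to purchase Provisioned Throughput. After access is granted, you can purchase Provisioned Throughput for the base model and for any models customized from it.

1. Ensure that your IAM role has access to the Provisioned Throughput API actions. If your role has the [AmazonBedrockFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-AmazonBedrockFullAccess) AWS managed policy attached, you can skip this step. Otherwise, do the following:

   1. Follow the steps in [Creating IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create.html) to create the following policy, which allows a role to create a Provisioned Throughput for all foundation and custom models.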

------
#### [ JSON ]


      ```
      {
          "Version":"2012-10-17",		 	 	 
          "Statement": [
              {
                  "Sid": "PermissionsForProvisionedThroughput",
                  "Effect": "Allow",
                  "Action": [
                      "bedrock:GetFoundationModel",
                      "bedrock:ListFoundationModels",
                      "bedrock:GetCustomModel",
                      "bedrock:ListCustomModels",
                      "bedrock:InvokeModel",
                      "bedrock:InvokeModelWithResponseStream",
                      "bedrock:ListTagsForResource",
                      "bedrock:UntagResource",
                      "bedrock:TagResource",
                      "bedrock:CreateProvisionedModelThroughput",
                      "bedrock:GetProvisionedModelThroughput",
                      "bedrock:ListProvisionedModelThroughputs",
                      "bedrock:UpdateProvisionedModelThroughput",
                      "bedrock:DeleteProvisionedModelThroughput"
                  ],
                  "Resource": "*"
              }
          ]
      }
      ```

------
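**Note**  
If you use Provisioned Throughput with cross-Region inference, you might need additional permissions. For more information, see [Increase throughput with cross-Region inference](cross-region-inference.md).

      (Optional) You can restrict the role's access in the following ways:
      + To restrict the API actions that the role can perform, modify the list in the `Action` field so that it contains only the [API operations](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-actions-as-permissions) that you want to allow.
      + After you create a provisioned model, you can restrict the role's ability to make API requests with it by modifying the `Resource` list so that it contains only the [provisioned models](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) that you want to allow. For an example, see [Allow users to invoke a provisioned model](security_iam_id-based-policy-examples.md#security_iam_id-based-policy-examples-perform-actions-pt).
      + To restrict the role's ability to create provisioned models from specific foundation or custom models, modify the `Resource` list so that it contains only the [foundation and custom models](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html#amazonbedrock-resources-for-iam-policies) that you want to allow.

   1. Follow the steps in [Adding and removing IAM identity permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to attach the policy to a role and grant the role the permissions.

1. If you purchase Provisioned Throughput for a custom model that is encrypted with a customer managed AWS KMS key, your IAM role must have permissions to decrypt the key. You can use the template in [Understand how to create a customer managed key and how to attach a key policy to it](encryption-custom-job.md#encryption-key-policy). For minimal permissions, you can use only the *Permissions for custom model users* policy statement.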

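# Purchase a Provisioned Throughput for an Amazon Bedrock model
<a name="prov-thru-purchase"></a>

Amazon Bedrock offers two types of Provisioned Throughput: by tokens and by model units. See the following instructions for the type of Provisioned Throughput that you want to purchase.

To learn more about the differences between the two types of Provisioned Throughput, see [Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock](prov-throughput.md).

## Provisioned Throughput by model units
<a name="prov-thru-purchase-MUs"></a>

When you purchase Provisioned Throughput by model units for a model, you specify its level of commitment and the number of model units (MUs) to allot. For MU quotas, see [Amazon Bedrock endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html) in the AWS General Reference. Before you can purchase Provisioned Throughput (with or without commitment), you must first visit the [AWS Support Center](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase) to request MUs for your account to distribute among Provisioned Throughputs. After your request is approved, you can purchase a Provisioned Throughput.

**Note**  
After you purchase the Provisioned Throughput, if it is associated with a custom model, you can change the model by specifying one of the following:  
+ The base model that the custom model was customized from
+ Another custom model that was customized from the same base model as the custom model

You can change the associated model only for a Provisioned Throughput that is associated with a custom model.

To learn how to purchase Provisioned Throughput for a model, choose the tab for your preferred method, and then follow the steps: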

------
#### [ Console ]

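1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. In the left navigation pane, choose **Provisioned Throughput**.

1. In the **Provisioned Throughput** section, choose **Purchase Provisioned Throughput**.

1. In the **Provisioned Throughput details** section, do the following:

   1. In the **Provisioned Throughput name** field, enter a name for the Provisioned Throughput.

   1. Under **Select model**, select a base model provider or a custom model category. Then select the model for which to provision throughput.
**Note**  
To see the base models for which you can purchase Provisioned Throughput without commitment, see the supported models documentation.  
In the AWS GovCloud (US) Region, you can purchase Provisioned Throughput only for custom models with no commitment.

   1. (Optional) To associate tags with your Provisioned Throughput, expand the **Tags** section and choose **Add new tag**. For more information, see [Tagging Amazon Bedrock resources](tagging.md).

1. For **Provisioning mode**, choose **By model units**.

1. In the **Commitment term & model units** section, do the following:

   1. In the **Select commitment term** section, select the amount of time for which you want to commit to using the Provisioned Throughput.

   1. In the **Model units** field, enter the desired number of model units (MUs). If you want to provision a model with commitment, you must first visit the [AWS Support Center](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase) to request an increase in the number of MUs that you can purchase.

1. Choose **Purchase Provisioned Throughput**.

1. Review the note that appears and acknowledge the commitment duration and price by selecting the checkbox. Then choose **Confirm purchase**.

1. The console displays the **Provisioned Throughput** overview page. The **Status** of the Provisioned Throughput in the Provisioned Throughput table becomes **Creating**. When the Provisioned Throughput is finished being created, the **Status** becomes **In service**. If creation fails, the **Status** becomes **Failed**.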

------
#### [ API ]

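To purchase a Provisioned Throughput, send a [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp).

To learn more about the request body and the parameters that you need to provide when creating Provisioned Throughput by model units, see [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) in the *Amazon Bedrock API Reference*.

**Note**  
To see the base models for which you can purchase Provisioned Throughput without commitment, see the supported models documentation.  
In the AWS GovCloud (US) Region, you can purchase Provisioned Throughput only for custom models with no commitment.

The response returns a `provisionedModelArn` that you can use as a `modelId` in [model inference](inference.md). To check when the Provisioned Throughput is ready for use, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request and check that the status is `InService`. If creation fails, its status is `Failed`, and the [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) response contains a `failureMessage`.

[See code examples](prov-thru-code-examples.md)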

------
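
The purchase-then-poll flow described in the API tab can be sketched in Python with the AWS SDK for Python (Boto3) as follows. The helper names, the polling interval, the retry limit, and the default Region are assumptions for illustration. The `commitmentDuration` values (`OneMonth`, `SixMonths`) come from the API reference; omitting the field corresponds to a no-commitment purchase.

```python
import time


def wait_until_in_service(get_status, provisioned_model_id,
                          poll_seconds=60, max_polls=60, sleep=time.sleep):
    """Poll GetProvisionedModelThroughput until the status is InService.

    get_status -- a callable such as bedrock.get_provisioned_model_throughput
    Raises RuntimeError with the failureMessage if the status becomes Failed.
    """
    for _ in range(max_polls):
        response = get_status(provisionedModelId=provisioned_model_id)
        status = response["status"]
        if status == "InService":
            return response
        if status == "Failed":
            raise RuntimeError(response.get("failureMessage", "Provisioned Throughput failed"))
        sleep(poll_seconds)  # still Creating or Updating
    raise TimeoutError("Provisioned Throughput did not reach InService in time")


def purchase_provisioned_throughput(model_id, name, model_units,
                                    commitment=None, region_name="us-east-1"):
    """Purchase a Provisioned Throughput and return its ARN once it is in service."""
    import boto3  # imported here so the polling helper can be used without the SDK

    bedrock = boto3.client("bedrock", region_name=region_name)
    kwargs = {
        "modelId": model_id,
        "provisionedModelName": name,
        "modelUnits": model_units,
    }
    if commitment:  # "OneMonth" or "SixMonths"
        kwargs["commitmentDuration"] = commitment
    arn = bedrock.create_provisioned_model_throughput(**kwargs)["provisionedModelArn"]
    wait_until_in_service(bedrock.get_provisioned_model_throughput, arn)
    return arn
```

Billing starts with the purchase and continues until you delete the Provisioned Throughput, so run a sketch like this only against an account where you intend to buy capacity.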

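# View information about a Provisioned Throughput
<a name="prov-thru-info"></a>

To learn how to view information about a Provisioned Throughput that you purchased, choose the tab for your preferred method, and then follow the steps: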

------
#### [ Console ]

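**To view information about a Provisioned Throughput**

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. In the left navigation pane, choose **Provisioned Throughput**.

1. In the **Provisioned Throughput** section, select a Provisioned Throughput.

1. View the details of the Provisioned Throughput in the **Provisioned Throughput overview** section and the tags associated with it in the **Tags** section.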

------
#### [ API ]

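To retrieve information about a specific Provisioned Throughput, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). Specify either the name of the Provisioned Throughput or its ARN as the `provisionedModelId`.

To list information about all the Provisioned Throughputs in an account, send a [ListProvisionedModelThroughputs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListProvisionedModelThroughputs.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp). To control the number of results that are returned, you can specify the following optional parameters:

| Field | Short description | 
| --- | --- | 
| maxResults | The maximum number of results to return in a response. | 
| nextToken | If there are more results than the number that you specified in the maxResults field, the response returns a nextToken value. To see the next batch of results, send the nextToken value in another request. | 

For other optional parameters that you can specify to sort and filter the results, see [ListProvisionedModelThroughputs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListProvisionedModelThroughputs.html).

To list all the tags for a Provisioned Throughput, send a [ListTagsForResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListTagsForResource.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and include the Amazon Resource Name (ARN) of the Provisioned Throughput.

[See code examples](prov-thru-code-examples.md)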

------

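# Modify a Provisioned Throughput
<a name="prov-thru-edit"></a>

The aspects of a Provisioned Throughput that you can edit after purchase depend on the provisioning mode. For Provisioned Throughput by model units, you can edit only the name and tags of the Provisioned Throughput and, if it is associated with a custom model, the model.

With Provisioned Throughput by tokens, you have more options, including modifying the number of input and output tokens per minute for the Provisioned Throughput.

See the following sections to learn more about editing the type of Provisioned Throughput that you want to modify.

## Modify a Provisioned Throughput by model units
<a name="prov-thru-edit-MUs"></a>

You can edit the name or tags of an existing Provisioned Throughput.

The following restrictions apply to changing the model that a Provisioned Throughput is associated with:
+ You can't change the model for a Provisioned Throughput that is associated with a base model.
+ If the Provisioned Throughput is associated with a custom model, you can change the association to the base model that it was customized from or to another custom model derived from the same base model.

While a Provisioned Throughput is updating, you can run inference with it without disrupting ongoing traffic from your end customers. If you changed the model that the Provisioned Throughput is associated with, you might receive output from the old model until the update is fully deployed.

To learn how to edit a Provisioned Throughput, choose the tab for your preferred method, and then follow the steps: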

------
#### [ Console ]

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. 从左侧导航窗格中，选择**预调配吞吐量**。

1. 从**预调配吞吐量**部分中，选择一个预调配吞吐量。

1. 选择**编辑**。您可以编辑以下字段：
   + **预调配吞吐量的名称** - 更改预调配吞吐量的名称。
   + **选择模型** - 如果预调配吞吐量与某个自定义模型关联，您可以更改关联的模型。

1. 您可以在**标签**部分编辑与预调配吞吐量关联的标签。有关更多信息，请参阅 [标记 Amazon Bedrock 资源](tagging.md)。

1. 要保存更改，请选择**保存编辑内容**。

1. 此时控制台会显示**预调配吞吐量**概览页面。“预调配吞吐量”表中的预调配吞吐量的**状态**将变为**正在更新**。预调配吞吐量更新完毕后，**状态**将变为**服务中**。如果更新失败，**状态**将变为**失败**。

------
#### [ API ]

要编辑预配置吞吐量，请使用 [Amazon Bedrock 控制平面](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)终端节点发送[UpdateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UpdateProvisionedModelThroughput.html)请求。

要详细了解请求正文和您需要提供的参数，请参阅[UpdateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UpdateProvisionedModelThroughput.html)《*Amazon Bedrock API 参考*》。

如果操作成功，响应会返回 HTTP 200 状态代码。要检查预配置吞吐量何时可供使用，请发送[GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html)请求并检查状态是否为`InService`。当预调配吞吐量的状态为 `Updating` 时，您无法更新或删除该吞吐量。如果更新失败，则其状态将为`Failed`，[GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html)响应中将包含`failureMessage`。
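
The status check described above can be wrapped in a polling loop. The following is a minimal sketch under stated assumptions: the helper name `wait_until_in_service`, the polling interval, and the retry limit are illustrative choices, and the client you pass in is assumed to be a Boto3 `bedrock` client.

```python
import time


def wait_until_in_service(bedrock_client, provisioned_model_id,
                          poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll GetProvisionedModelThroughput until the status is InService.

    Raises RuntimeError with the failureMessage if the status is Failed,
    and TimeoutError if the status never reaches InService.
    """
    for _ in range(max_polls):
        info = bedrock_client.get_provisioned_model_throughput(
            provisionedModelId=provisioned_model_id
        )
        status = info["status"]
        if status == "InService":
            return info
        if status == "Failed":
            raise RuntimeError(info.get("failureMessage", "Update failed"))
        # Any other status (for example, Updating) means: keep waiting.
        sleep(poll_seconds)
    raise TimeoutError(
        f"{provisioned_model_id} was not InService after {max_polls} checks"
    )
```

The `sleep` argument exists only so that the wait can be replaced in tests. In normal use you can leave it at its default.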

要向预配置吞吐量添加标签，请使用 [Amazon Bedrock 控制平面终端节点](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)发送[TagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_TagResource.html)请求，并包含预配置吞吐量的亚马逊资源名称 (ARN)。请求正文包含一个 `tags` 字段，该字段是一个对象，其中包含您为每个标签指定的键值对。

要从预配置吞吐量中删除标签，请使用 [Amazon Bedrock 控制平面终端节点](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)发送[UntagResource](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UntagResource.html)请求，并附上预配置吞吐量的亚马逊资源名称 (ARN)。`tagKeys` 请求参数是一个列表，其中包含要移除的标签的键。

[参阅代码示例](prov-thru-code-examples.md)

------

# 将预调配吞吐量与 Amazon Bedrock 资源结合使用
<a name="prov-thru-use"></a>

购买预配置吞吐量后，您可以将其与以下功能一起使用：
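+ **Model inference** - You can test the Provisioned Throughput in an Amazon Bedrock console playground. When you're ready to deploy the Provisioned Throughput, set up your application to invoke the provisioned model. Choose the tab for your preferred method, and then follow the steps: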

------
#### [ Console ]

**在 Amazon Bedrock 控制台操场中使用预调配吞吐量**

  1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

  1. 在左侧导航窗格中，根据应用场景选择**操场**下的**聊天**、**文本**或**图像**。

  1. 选择**选择模型**。

  1. 在 **1. 类别**列中，选择提供商或自定义模型类别。然后，在 **2. 模型**列中，选择与您的预调配吞吐量关联的模型。

  1. 在 **3. 吞吐量**列中，选择您的预调配吞吐量。

  1. 选择**应用**。

  要了解如何使用 Amazon Bedrock 操场，请参阅 [使用操场在控制台中生成响应](playgrounds.md)。

------
#### [ API ]

  To run inference using a Provisioned Throughput, send an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html), [InvokeModelWithResponseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModelWithResponseStream.html), [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html), or [ConverseStream](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseStream.html) request with an [Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). Specify the provisioned model ARN as the `modelId` parameter. To see the request body requirements for different models, see [Inference request parameters and response fields for foundation models](model-parameters.md).

  [参阅代码示例](prov-thru-code-examples.md)

------
+ **Associate a Provisioned Throughput with an agent alias** - You can associate a Provisioned Throughput when you [create](agents-deploy.md) or [update](agents-alias-edit.md) an agent alias. In the Amazon Bedrock console, you choose the Provisioned Throughput when you set up or edit the alias. In the Amazon Bedrock API, you specify `provisionedThroughput` in the `routingConfiguration` when you send a [CreateAgentAlias](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_CreateAgentAlias.html) or [UpdateAgentAlias](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_UpdateAgentAlias.html) request.

# 删除预配置吞吐量或取消自动续订
<a name="prov-thru-delete"></a>

您的预配置吞吐量将在每个承诺期限结束时自动续订，同时保持您当前的输入和输出令牌配置。

如果您不想保留您的预配置吞吐量，可以将其删除，或者对于按令牌计算的预配置吞吐量，取消自动续订，以防止在当前期限结束时续订。

## 删除预配置吞吐量
<a name="prov-thru-delete-del"></a>

删除预调配吞吐量后，您将无法再以购买该模型时的吞吐量级别调用该模型。如果您删除了与一个自定义模型关联的预调配吞吐量，该自定义模型不会被删除。要了解如何删除自定义模型，请参阅 [删除自定义模型](model-customization-delete.md)。

**注意**  
You can't delete a Provisioned Throughput by model units that has a commitment before its commitment term ends.

要了解如何删除预配置吞吐量，请选择首选方法的选项卡，然后按照以下步骤操作：

------
#### [ Console ]

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. 从左侧导航窗格中，选择**预调配吞吐量**。

1. 从**预调配吞吐量**部分中，选择一个预调配吞吐量。

1. 从 “**操作**” 下拉菜单中选择 “**删除**”。

1. 此时控制台会显示一个模态表单，警告您删除是永久性的。选择**确认**继续。

1. 预调配吞吐量会立即删除。

------
#### [ API ]

要删除预配置吞吐量，请使用 [Amazon Bedrock 控制平面](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp)终端节点发送[DeleteProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteProvisionedModelThroughput.html)请求。指定预调配吞吐量的名称或其 ARN 作为 `provisionedModelId`。如果删除成功，响应会返回 HTTP 200 状态代码。

[参阅代码示例](prov-thru-code-examples.md)

------

## Cancel auto-renewal for a Provisioned Throughput
<a name="prov-thru-delete-cancel-auto-renew"></a>

对于按令牌分配的预配置吞吐量，您可以在承诺期限结束之前随时取消自动续订，以防止预配置吞吐量自动续订。

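If you cancel auto-renewal, your Provisioned Throughput remains available for use until the end of your commitment term. You are still charged the full reservation fee for the current term, whether or not you run inference.

After you cancel auto-renewal for a Provisioned Throughput, you can't make any further modifications to it for the remainder of the commitment term.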

**注意**  
取消后无法重新启用自动续订。如果您在当前期限到期后需要预配置吞吐量，则需要购买新的预配置吞吐量。

要了解如何取消按令牌自动续订预配置吞吐量，请选择首选方法的选项卡，然后按照以下步骤操作：

------
#### [ Console ]

1. Sign in to the AWS Management Console with an IAM identity that has permissions to use the Amazon Bedrock console. Then, open the Amazon Bedrock console at [https://console.aws.amazon.com/bedrock](https://console.aws.amazon.com/bedrock).

1. 从左侧导航窗格中，选择**预调配吞吐量**。

1. 从**预调配吞吐量**部分中，选择一个预调配吞吐量。

1. 从 “**操作**” 下拉菜单中选择 “**取消自动续订**”。

1. 控制台会显示一个模态表单，警告您此操作无法撤消。选择**确认**继续。

1. 在当前承诺期限结束之前，预配置吞吐量将一直处于活动状态，之后它将自动删除。

------
#### [ API ]

To cancel auto-renewal for a Provisioned Throughput, send an [UpdateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_UpdateProvisionedModelThroughput.html) request with an [Amazon Bedrock control plane endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-cp) and set the `disableAutoRenew` parameter to `true`. The Provisioned Throughput remains active until the end of the current commitment term.

[参阅代码示例](prov-thru-code-examples.md)

------

# 预调配吞吐量的代码示例
<a name="prov-thru-code-examples"></a>

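The following code examples show how to create, manage, and invoke a Provisioned Throughput using the AWS CLI and the Python SDK. You can create a Provisioned Throughput from a foundation model or from a model that you've customized. Before you begin, fulfill the following prerequisites: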

**先决条件**

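The following examples use the Amazon Nova Lite model, whose model ID is `amazon.nova-lite-v1:0:24k`. If you haven't already, request access to Amazon Nova Lite by following the steps at [Manage model access using SDK and CLI](model-access.md#model-access-modify).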

如果您想为不同的基础模型或自定义模型购买预配置吞吐量，则必须执行以下操作：

1. 通过执行以下任一操作来查找模型的 ID（对于基础模型）、名称（对于自定义模型）或 ARN（任一模型）：
   + 如果您要为基础模型购买预配置吞吐量，请通过以下方式之一找到支持预配置的模型的 ID 或 Amazon 资源名称 (ARN)：
     + 在表中查找值。
     + 发送[ListFoundationModels](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListFoundationModels.html)请求并指定`byInferenceType`值`PROVISIONED`以查看支持配置的模型列表。在`modelId`或`modelArn`字段中查找值。
   + 如果您要为自定义模型购买预置吞吐量，请通过以下方式之一找到您自定义的模型的名称或 Amazon 资源名称 (ARN)：
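     + In the Amazon Bedrock console, choose **Custom models** from the left navigation pane. Find the name of your customized model in the **Models** list, or select it and find the **Model ARN** in the **Model details**.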
     + 发送[ListCustomModels](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_ListCustomModels.html)请求并在响应中找到您的自定义模型的`modelName`或`modelArn`值。

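1. Modify the `body` of the [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) request in the following examples so that it matches the body format of your model, which you can find in [Inference request parameters and response fields for foundation models](model-parameters.md).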

选择与您的首选方法对应的选项卡，然后按照以下步骤操作：

------
#### [ AWS CLI ]

1. Send a [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) request to create a no-commitment Provisioned Throughput called *MyPT* by running the following command in a terminal:

   ```
   aws bedrock create-provisioned-model-throughput \
      --model-units 1 \
      --provisioned-model-name MyPT \
      --model-id amazon.nova-lite-v1:0:24k
   ```

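1. The response returns a `provisioned-model-arn`. Allow some time for the creation to complete. To check its status, send a [GetProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetProvisionedModelThroughput.html) request and provide the name or ARN of the provisioned model as the `provisioned-model-id` by running the following command: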

   ```
   aws bedrock get-provisioned-model-throughput \
       --provisioned-model-id ${provisioned-model-arn}
   ```

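1. Run inference with your provisioned model by sending an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) request. Provide the ARN of the provisioned model that was returned in the `CreateProvisionedModelThroughput` response as the `model-id`. The output is written to a file named *output.txt* in your current folder.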

   ```
   aws bedrock-runtime invoke-model \
       --model-id ${provisioned-model-arn} \
       --body '{
                   "messages": [{
                       "role": "user",
                       "content": [{
                           "text": "Hello"
                       }]
                   }],
                   "inferenceConfig": {
                       "temperature":0.7
                   }
               }' \
       --cli-binary-format raw-in-base64-out \
       output.txt
   ```

1. 使用以下命令发送删除预配置吞吐量的[DeleteProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_DeleteProvisionedModelThroughput.html)请求。您不必再为预调配吞吐量付费。

   ```
   aws bedrock delete-provisioned-model-throughput \
       --provisioned-model-id MyPT
   ```

------
#### [ Python (Boto) ]

以下代码片段将引导您完成创建预配置吞吐量、获取有关它的信息以及调用预配置吞吐量的过程。

1. To create a no-commitment Provisioned Throughput called *MyPT* and assign its ARN to a variable called *provisioned_model_arn*, send the following [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) request:

   ```
   import boto3 
   
   provisioned_model_name = 'MyPT'
   
   bedrock = boto3.client(service_name='bedrock')
   response = bedrock.create_provisioned_model_throughput(
       modelUnits=1,
       provisionedModelName=provisioned_model_name, 
       modelId='amazon.nova-lite-v1:0:24k' 
   )
                           
   provisioned_model_arn = response['provisionedModelArn']
   ```

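1. Allow some time for the creation to complete. You can check its status with the following code snippet. As the `provisionedModelId`, you can provide either the name of the Provisioned Throughput or the ARN returned in the [CreateProvisionedModelThroughput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html) response.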

   ```
   bedrock.get_provisioned_model_throughput(provisionedModelId=provisioned_model_name)
   ```

1. Run inference with your provisioned model by using the following code, with the ARN of the provisioned model as the `modelId`.

   ```
   import json
   import logging
   import boto3
   
   from botocore.exceptions import ClientError
   
   
   class ImageError(Exception):
       "Custom exception for errors returned by the model"
   
       def __init__(self, message):
           self.message = message
   
   
   logger = logging.getLogger(__name__)
   logging.basicConfig(level=logging.INFO)
   
   
   def generate_text(model_id, body):
       """
       Generate text using your provisioned custom model.
       Args:
           model_id (str): The model ID to use.
           body (str) : The request body to use.
       Returns:
           response (json): The response from the model.
       """
   
       logger.info(
           "Generating text with your provisioned custom model %s", model_id)
   
       brt = boto3.client(service_name='bedrock-runtime')
   
       accept = "application/json"
       content_type = "application/json"
   
       response = brt.invoke_model(
           body=body, modelId=model_id, accept=accept, contentType=content_type
       )
       response_body = json.loads(response.get("body").read())
   
       finish_reason = response_body.get("error")
   
       if finish_reason is not None:
           raise ImageError(f"Text generation error. Error is {finish_reason}")
   
       logger.info(
           "Successfully generated text with provisioned custom model %s", model_id)
   
       return response_body
   
   
   def main():
       """
       Entrypoint for example.
       """
       try:
           logging.basicConfig(level=logging.INFO,
                               format="%(levelname)s: %(message)s")
   
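           # ARN returned by create_provisioned_model_throughput in step 1.
           model_id = provisioned_model_arn
   
           # Amazon Nova request body, in the same format as the AWS CLI example.
           body = json.dumps({
               "messages": [{
                   "role": "user",
                   "content": [{"text": "What is AWS?"}]
               }],
               "inferenceConfig": {"temperature": 0.7}
           })
   
           response_body = generate_text(model_id, body)
           usage = response_body["usage"]
           print(f"Input token count: {usage['inputTokens']}")
           print(f"Output token count: {usage['outputTokens']}")
   
           for block in response_body["output"]["message"]["content"]:
               print(f"Output text: {block['text']}")
           print(f"Stop reason: {response_body['stopReason']}")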
   
       except ClientError as err:
           message = err.response["Error"]["Message"]
           logger.error("A client error occurred: %s", message)
           print("A client error occurred: " +
                 format(message))
       except ImageError as err:
           logger.error(err.message)
           print(err.message)
   
       else:
           print(
               f"Finished generating text with your provisioned custom model {model_id}.")
   
   
   if __name__ == "__main__":
       main()
   ```

1. 使用以下代码段删除预调配吞吐量。您不必再为预调配吞吐量付费。

   ```
   bedrock.delete_provisioned_model_throughput(provisionedModelId=provisioned_model_name)
   ```

------

# Amazon Bedrock 的配额
<a name="quotas"></a>

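Your AWS account has default quotas, formerly referred to as limits, for Amazon Bedrock. To view service quotas for Amazon Bedrock, do one of the following:
+ Follow the steps at [Viewing service quotas](https://docs.aws.amazon.com/servicequotas/latest/userguide/gs-request-quota.html) and select **Amazon Bedrock** as the service.
+ Refer to [Amazon Bedrock service quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock) in the *AWS General Reference*.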

Amazon Bedrock 中的模型推理受词元使用配额控制。部分模型使用词元的比率较高。有关这些比率以及如何优化词元使用的更多信息，请参阅 [Amazon Bedrock 中词元的计算方式](quotas-token-burndown.md)。

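To maintain the performance of the service and to ensure appropriate usage of Amazon Bedrock, the default quotas assigned to an account might be updated depending on regional factors, payment history, fraudulent usage, and/or approval of a [quota increase request](quotas-increase.md).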

**Topics**
+ [Amazon Bedrock 中词元的计算方式](quotas-token-burndown.md)
+ [在运行推理之前，通过计算词元来监控您的词元使用情况](count-tokens.md)
+ [请求增加 Amazon Bedrock 配额](quotas-increase.md)

# Amazon Bedrock 中词元的计算方式
<a name="quotas-token-burndown"></a>

当您运行模型推理时，可以处理的词元数量存在配额，具体取决于您使用的 Amazon Bedrock 模型。请查看与词元配额相关的以下术语：


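| Term | Definition | 
| --- | --- | 
| InputTokenCount | The CloudWatch Amazon Bedrock runtime metric that represents the number of tokens in a request provided as input to the model. | 
| OutputTokenCount | The CloudWatch Amazon Bedrock runtime metric that represents the number of tokens that the model generates in response to a request. | 
| CacheReadInputTokens | The CloudWatch Amazon Bedrock runtime metric that represents the number of input tokens that were successfully retrieved from a cache instead of being reprocessed by the model. This value is 0 if you don't use [prompt caching](prompt-caching.md). | 
| CacheWriteInputTokens | The CloudWatch Amazon Bedrock runtime metric that represents the number of input tokens that were successfully written to the cache. This value is 0 if you don't use [prompt caching](prompt-caching.md). | 
| Tokens per minute (TPM) | A quota set by AWS at the model level on the number of tokens (including both input and output) that you can use in one minute. | 
| Tokens per day (TPD) | A quota set by AWS at the model level on the number of tokens (including both input and output) that you can use in one day. By default, this value is TPM x 24 x 60. However, new AWS accounts have reduced quotas. | 
| Requests per minute (RPM) | A quota set by AWS at the model level on the number of requests that you can send in one minute. | 
| max_tokens | A parameter that you provide in your request to set the maximum number of output tokens that the model can generate. | 
| Burndown rate | The rate at which input and output tokens are converted into token quota usage for the throttling system. | 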

For Anthropic Claude models version 3.7 and later, the burndown rate for output tokens is **5x** (1 output token consumes 5 tokens from your quotas).

对于所有其他模型，消耗比率为 **1:1**（在配额中，1 个输出词元消耗 1 个词元）。

**Topics**
+ [了解词元配额管理](#quotas-token-burndown-management)
+ [Understanding the impact of the max_tokens parameter](#quotas-token-burndown-max-tokens)
+ [Optimizing the max_tokens parameter](#quotas-token-burndown-max-tokens-optimize)

## 了解词元配额管理
<a name="quotas-token-burndown-management"></a>

当您发出请求时，词元将从您的 TPM 和 TPD 配额中扣除。计算分为以下阶段：
+ **请求开始时**：假设您没有超出 RPM 配额，将从您的配额中扣除以下总额。如果您超出配额，请求会被节流。

  ```
  Total input tokens + max_tokens
  ```
+ **处理期间**：请求消耗的配额会定期调整，以涵盖生成的实际输出词元数。
+ **在请求结束时**：请求消耗的词元总数将按以下方式计算，所有未使用的词元将补充到您的配额：

  ```
  InputTokenCount + CacheWriteInputTokens + (OutputTokenCount x burndown rate)
  ```

  如果未使用[提示缓存](prompt-caching.md)，则 `CacheWriteInputTokens` 为 0。`CacheReadInputTokens` 不参与此计算。

**注意**  
您只需根据词元的实际使用量付费。  
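For example, if you use Anthropic Claude Sonnet 4 and send a request containing 1,000 input tokens, and the model generates a response equivalent to 100 tokens:  
**1,500 tokens** (1,000 + 100 x 5) are deducted from your TPM and TPD quotas.  
You are billed for only **1,100 tokens**.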

## Understanding the impact of the max_tokens parameter
<a name="quotas-token-burndown-max-tokens"></a>

`max_tokens` 值将在每次请求开始时从您的配额中扣除。如果您比预期更早达到 TPM 配额上限，请尝试降低 `max_tokens`，使其更接近您实际生成内容的规模。

以下场景提供了一些示例，用来说明在使用输出词元消耗比率为 5 倍的模型时，对于已完成的请求，将如何计算配额扣除。

### Scenario 1: max_tokens value is too high
<a name="quotas-token-burndown-max-tokens-too-high"></a>

The parameters are set as follows:
+ **InputTokenCount:** 3,000
+ **CacheReadInputTokens:** 4,000
+ **CacheWriteInputTokens:** 1,000
+ **OutputTokenCount:** 1,000
+ **max_tokens:** 32,000

The quota deductions are as follows:
+ **Initial deduction when the request is made:** 40,000 (= 3,000 + 4,000 + 1,000 + 32,000)
+ **Final adjusted deduction after the response is generated:** 9,000 (= 3,000 + 1,000 + 1,000 x 5)

在本场景中，由于 `max_tokens` 参数设置过高，可发起的并发请求数量减少。这会降低请求并发性、吞吐量和配额利用率，因为很快就会达到 TPM 配额容量。

### Scenario 2: max_tokens value is optimized
<a name="quotas-token-burndown-max-tokens-optimized"></a>

The parameters are set as follows:
+ **InputTokenCount:** 3,000
+ **CacheReadInputTokens:** 4,000
+ **CacheWriteInputTokens:** 1,000
+ **OutputTokenCount:** 1,000
+ **max_tokens:** 1,250

The quota deductions are as follows:
+ **Initial deduction when the request is made:** 9,250 (= 3,000 + 4,000 + 1,000 + 1,250)
+ **Final adjusted deduction after the response is generated:** 9,000 (= 3,000 + 1,000 + 1,000 x 5)

在本场景中，`max_tokens` 参数经过优化，因为初始扣除额仅略高于最终调整后扣除额。这有助于提高请求并发性、吞吐量和配额利用率。
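
The arithmetic in the two scenarios can be reproduced with a few lines of Python. The following is a minimal sketch for illustration only: the function names are hypothetical, and the initial deduction includes the cache read and cache write tokens because that is how the scenario figures above are computed.

```python
def initial_deduction(input_tokens, cache_read_tokens, cache_write_tokens, max_tokens):
    """Tokens reserved against the quota when the request starts."""
    return input_tokens + cache_read_tokens + cache_write_tokens + max_tokens


def final_deduction(input_tokens, cache_write_tokens, output_tokens, burndown_rate=1):
    """Tokens charged against the quota after the response is generated.

    Mirrors: InputTokenCount + CacheWriteInputTokens
             + (OutputTokenCount x burndown rate).
    CacheReadInputTokens is not part of this calculation.
    """
    return input_tokens + cache_write_tokens + output_tokens * burndown_rate


# Scenario 1: max_tokens is set much higher than the real output size.
print(initial_deduction(3000, 4000, 1000, 32000))  # 40000
# Scenario 2: max_tokens is close to the real output size.
print(initial_deduction(3000, 4000, 1000, 1250))   # 9250
# Both scenarios settle at the same final deduction (5x burndown rate).
print(final_deduction(3000, 1000, 1000, burndown_rate=5))  # 9000
```

The gap between the initial and final figures is quota that stays reserved while the request is in flight, which is why a smaller `max_tokens` leaves more room for concurrent requests.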

## Optimizing the max_tokens parameter
<a name="quotas-token-burndown-max-tokens-optimize"></a>

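By optimizing the `max_tokens` parameter, you can make efficient use of your allocated quota capacity. To help inform your decision about this parameter, you can use Amazon CloudWatch, which automatically collects metrics from AWS services, including token usage data in Amazon Bedrock.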

词元记录在 `InputTokenCount` 和 `OutputTokenCount` 运行时指标中（有关更多指标，请参阅 [Amazon Bedrock 运行时指标](monitoring.md#runtime-cloudwatch-metrics)）。

To use CloudWatch monitoring to inform your decision about the `max_tokens` parameter, do the following in the AWS Management Console:

1. Sign in to the Amazon CloudWatch console at [https://console.aws.amazon.com/cloudwatch](https://console.aws.amazon.com/cloudwatch).

1. 从左侧导航窗格中选择**控制面板**。

1. 选择**自动控制面板**选项卡。

1. 选择 **Bedrock**。

1. 在**按模型统计的词元数**控制面板中，选择展开图标。

1. 为指标选择持续时间和范围参数，以便应对峰值使用量。

1. 在标记为**总和**的下拉菜单中，您可以选择不同的指标来观察词元使用情况。查看这些指标可以为设置 `max_tokens` 值提供依据。
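
Once you have a sample of observed output token counts (for example, exported from the `OutputTokenCount` metric), one possible way to turn it into a `max_tokens` value is to take a high percentile and add some headroom. The following sketch is a heuristic and not an AWS recommendation: the function name, the nearest-rank percentile method, and the default percentile and headroom values are all assumptions chosen for illustration.

```python
import math


def suggest_max_tokens(observed_output_tokens, percentile=0.99, headroom=1.25):
    """Suggest a max_tokens value from observed per-request output token counts.

    Uses the nearest-rank percentile of the sample, multiplied by a
    headroom factor and rounded up to a whole number of tokens.
    """
    if not observed_output_tokens:
        raise ValueError("At least one observation is required")
    ordered = sorted(observed_output_tokens)
    # Nearest-rank method: the smallest value with at least
    # percentile * N observations at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return math.ceil(ordered[rank - 1] * headroom)
```

A higher percentile or headroom lowers the risk of truncated responses at the cost of a larger initial quota deduction per request.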

# 在运行推理之前，通过计算词元来监控您的词元使用情况
<a name="count-tokens"></a>

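When you run model inference, the number of tokens that you send in the input contributes to the cost of the request and counts toward the quota of tokens that you can use per minute and per day. The [CountTokens](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_CountTokens.html) API helps you estimate token usage before you send a request to a foundation model by returning the number of tokens that would be used if the same input were sent to the model in an inference request.

**Note**  
There is no charge for using the [CountTokens](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_CountTokens.html) API.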

词元计数特定于模型，因为不同的模型使用不同的词元计算策略。此操作返回的词元计数，与向模型发送相同输入来运行推理时将收费的词元计数相符。

您可以使用 `CountTokens` API 来执行以下操作：
+ 在发送推理请求之前估算成本。
+ 优化提示以适应词元限制。
+ 规划应用程序中的词元使用量。

**Topics**
+ [支持词元计数的模型和区域](#count-tokens-supported)
+ [计算请求中的词元](#count-tokens-use)
+ [尝试示例](#count-tokens-example)

## 支持词元计数的模型和区域
<a name="count-tokens-supported"></a>

The following table shows foundation model support for token counting:


| Provider | Model | Model ID | Single-region model support | 
| --- | --- | --- | --- | 
| Anthropic | Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 |  us-west-2  | 
| Anthropic | Claude 3.5 Sonnet | anthropic.claude-3-5-sonnet-20240620-v1:0 |  ap-northeast-1 ap-southeast-1 eu-central-1 eu-central-2 us-east-1 us-west-2  | 
| Anthropic | Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 |  ap-southeast-2 us-west-2  | 
| Anthropic | Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 |  eu-west-2  | 
| Anthropic | Claude Opus 4 | anthropic.claude-opus-4-20250514-v1:0 |  | 
| Anthropic | Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 |  | 

## 计算请求中的词元
<a name="count-tokens-use"></a>

To count the number of input tokens in an inference request, send a [CountTokens](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_CountTokens.html) request with an [Amazon Bedrock runtime endpoint](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#br-rt). Specify the model in the header and the input to count tokens for in the `body` field. The value of the `body` field depends on whether you're counting input tokens for an [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) or a [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) request:
+ 对于 `InvokeModel` 请求，`body` 的格式是表示一个 JSON 对象的字符串，其格式取决于您指定的模型。
+ 对于 `Converse` 请求，`body` 的格式是一个 JSON 对象，指定对话中包含的 `messages` 和 `system` 提示。

## 尝试示例
<a name="count-tokens-example"></a>

The examples in this section let you count tokens for `InvokeModel` and `Converse` requests with Anthropic Claude 3 Haiku.

**先决条件**
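+ You've downloaded the AWS SDK for Python (Boto3), and your configuration is set up so that your credentials and default AWS Region are automatically recognized.
+ Your IAM identity has permissions for the following actions (for more information, see [Actions, resources, and condition keys for Amazon Bedrock](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonbedrock.html)):
  + bedrock:CountTokens - Allows the usage of `CountTokens`.
  + bedrock:InvokeModel - Allows the usage of `InvokeModel` and `Converse`. Should be scoped, at minimum, to *arn:${Partition}:bedrock:${Region}::foundation-model/anthropic.claude-3-haiku-20240307-v1:0*.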

要尝试计算[InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html)请求的令牌，请运行以下 Python 代码：

```
import boto3
import json

bedrock_runtime = boto3.client("bedrock-runtime")

input_to_count = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 500,
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
})

response = bedrock_runtime.count_tokens(
    modelId="anthropic.claude-3-5-haiku-20241022-v1:0",
    input={
        "invokeModel": {
            "body": input_to_count
        }
    }
)

print(response["inputTokens"])
```

要尝试计算用于 [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) 请求的词元，请运行以下 Python 代码：

```
import boto3
import json 

bedrock_runtime = boto3.client("bedrock-runtime")

input_to_count = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": "What is the capital of France?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "text": "The capital of France is Paris."
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "text": "What is its population?"
                }
            ]
        }
    ],
    "system": [
        {
            "text": "You're an expert in geography."
        }
    ]
}

response = bedrock_runtime.count_tokens(
    modelId="anthropic.claude-3-5-haiku-20241022-v1:0",
    input={
        "converse": input_to_count
    }
)

print(response["inputTokens"])
```

# 请求增加 Amazon Bedrock 配额
<a name="quotas-increase"></a>

申请增加账户配额的步骤取决于 [Amazon Bedrock 服务配额](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock)中配额表的**可调整**列中的值：
+ 如果某个配额被标记为**是**，则可以按照《服务配额用户指南》中[请求增加配额](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html)的步骤进行调整。
+ 对于任何模型，您都可以同时请求增加以下配额：
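  + Cross-region InvokeModel tokens per minute for *${model}*
  + Cross-region InvokeModel requests per minute for *${model}*
  + On-demand InvokeModel tokens per minute for *${model}*
  + On-demand InvokeModel requests per minute for *${model}*
  + Model invocation max tokens per day for *${model}*

  To request an increase for any combination of these quotas, follow the steps at [Requesting a quota increase](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html) in the *Service Quotas User Guide* and request an increase for the **Cross-region InvokeModel tokens per minute for *${model}*** quota. After you submit the request, the support team will reach out and offer you the option to also increase the other four quotas.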
**注意**  
由于需求过大，将优先考虑那些产生的流量消耗了现有配额分配的客户。如果您不符合此条件，将可能拒绝您的请求。

# 用于加快模型推理速度的提示缓存
<a name="prompt-caching"></a>

提示缓存是一项可选功能，可以在 Amazon Bedrock 上与受支持的模型配合使用，来降低推理响应延迟和输入词元成本。通过将部分上下文添加到缓存，模型可以利用缓存来跳过对输入的重新计算过程，从而使 Bedrock 能够共享计算资源来节省成本并降低响应延迟。

当您的工作负载包含冗长而重复的上下文，并且这些上下文频繁用于多个查询时，提示缓存会有所帮助。例如，如果您有一个聊天机器人，用户可以在其中上传文档并询问有关文档的问题，那么每次用户提供输入时，模型都需要处理文档，这可能非常耗时。利用提示缓存，您可以缓存文档，这样后续包含该文档的查询就无需重新处理文档。

使用提示缓存时，从缓存读取的词元将按较低费率计费。根据模型的不同，写入缓存的词元的费率可能高于未缓存输入词元。所有未从缓存读取或写入缓存的词元都按该模型的标准输入词元费率计费。有关更多信息，请参阅 [Amazon Bedrock 定价](https://aws.amazon.com/bedrock/pricing/)页面。

## 工作原理
<a name="prompt-caching-overview"></a>

如果您选择使用提示缓存，Amazon Bedrock 会创建一个由*缓存检查点*组成的缓存。缓存检查点是一些标记，用于定义您希望缓存的提示的连续分段（通常称为提示前缀）。这些提示前缀在不同请求之间应保持静态，如果后续请求中对提示前缀进行了更改，则会导致缓存丢失。

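Cache checkpoints have a minimum and maximum number of tokens, depending on the specific model that you use. You can create a cache checkpoint only if your total prompt prefix meets the minimum number of tokens. For example, the Anthropic Claude 3.7 Sonnet model requires at least 1,024 tokens per cache checkpoint. This means that your first cache checkpoint can be defined after 1,024 tokens and your second cache checkpoint can be defined after 2,048 tokens. If you try to add a cache checkpoint before the minimum number of tokens is met, inference still succeeds, but your prefix isn't cached. The cache has a time to live (TTL), which resets with each successful cache hit. During this period, the context in the cache is preserved. If no cache hits occur within the TTL window, the cache expires. Most models support a 5-minute TTL, while Claude Opus 4.5, Claude Haiku 4.5, and Claude Sonnet 4.5 also support an extended 1-hour TTL option.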

每当您在 Amazon Bedrock 中使用受支持的模型进行模型推理时，您都可以使用提示缓存。以下 Amazon Bedrock 功能支持提示缓存：

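**Converse and ConverseStream APIs**  
You can carry on a conversation with a model and specify cache checkpoints in your prompts.

**InvokeModel and InvokeModelWithResponseStream APIs**  
You can submit single prompt requests in which you enable prompt caching and specify your cache checkpoints.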

**跨区域推理的提示缓存**  
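You can use prompt caching in conjunction with cross-region inference. Cross-region inference automatically selects the optimal AWS Region within your geography to serve your inference request, thereby maximizing available resources and model availability. At times of high demand, these optimizations may lead to increased cache writes.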

**Amazon Bedrock 提示管理器**  
[创建](prompt-management-create.md)或[修改](prompt-management-modify.md)提示时，您可以选择启用提示缓存。根据模型，您可以缓存系统提示、系统指令和消息（用户和助手）。您也可以选择禁用提示缓存。

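The APIs provide you with the most flexibility and granular control over prompt caching. You can set individual cache checkpoints within your prompts. You can add to the cache by creating more cache checkpoints, up to the maximum number of cache checkpoints allowed for the specific model. For more information, see [Supported models, Regions, and limits](#prompt-caching-models).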

## 支持的模型、区域和限制
<a name="prompt-caching-models"></a>

下表列出了支持的模型及其最小词元数、最大缓存检查点数以及可以使用缓存检查点的字段。


| Model name | Model ID | Release type | Minimum number of tokens per cache checkpoint | Maximum number of cache checkpoints per request | Supported TTL | Fields that accept prompt cache checkpoints | 
| --- | --- | --- | --- | --- | --- | --- | 
| Claude Opus 4.5 | anthropic.claude-opus-4-5-20251101-v1:0 | Generally Available | 4,096 | 4 | 5 minutes, 1 hour | `system`, `messages`, and `tools` | 
| Claude Opus 4.1 | anthropic.claude-opus-4-1-20250805-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude Opus 4 | anthropic.claude-opus-4-20250514-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude Sonnet 4.5 | anthropic.claude-sonnet-4-5-20250929-v1:0 | Generally Available | 1,024 | 4 | 5 minutes, 1 hour | `system`, `messages`, and `tools` | 
| Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 | Generally Available | 4,096 | 4 | 5 minutes, 1 hour | `system`, `messages`, and `tools` | 
| Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude 3.7 Sonnet | anthropic.claude-3-7-sonnet-20250219-v1:0 | Generally Available | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 | Generally Available | 2,048 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Claude 3.5 Sonnet v2 | anthropic.claude-3-5-sonnet-20241022-v2:0 | Preview | 1,024 | 4 | 5 minutes | `system`, `messages`, and `tools` | 
| Amazon Nova Micro | amazon.nova-micro-v1:0 | Generally Available | 1,000 ¹ | 4 | 5 minutes | `system` and `messages` | 
| Amazon Nova Lite | amazon.nova-lite-v1:0 | Generally Available | 1,000 ¹ | 4 | 5 minutes | `system` and `messages` ² | 
| Amazon Nova Pro | amazon.nova-pro-v1:0 | Generally Available | 1,000 ¹ | 4 | 5 minutes | `system` and `messages` ² | 
| Amazon Nova Premier | amazon.nova-premier-v1:0 | Generally Available | 1,000 ¹ | 4 | 5 minutes | `system` and `messages` ² | 
| Amazon Nova 2 Lite | amazon.nova-2-lite-v1:0 | Generally Available | 1,000 ¹ | 4 | 5 minutes | `system` and `messages` ² | 

¹ Amazon Nova models support a maximum of 20,000 tokens for prompt caching.

² Prompt caching is primarily for text prompts.

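To use the 1-hour TTL option on supported models (Claude Opus 4.5, Claude Haiku 4.5, and Claude Sonnet 4.5), specify the `ttl` field in your cache checkpoint. In the Converse API, add `"ttl": "1h"` to your `cachePoint` object. In the InvokeModel API for Claude models, add `"ttl": "1h"` to your `cache_control` object. If no `ttl` value is provided, the default 5-minute caching behavior applies. The 1-hour TTL is useful for long-running sessions or batch processing scenarios in which the cache must be maintained over extended periods.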

Amazon Nova 为所有文本提示（包括 `User` 和 `System` 消息）提供了自动提示缓存。这种机制可在提示以重复部分开头时降低延迟，即使没有显式配置也是如此。但是，为了实现成本节省和确保更稳定的性能优势，建议选择使用**显式提示缓存**。

## 简化 Claude 模型的缓存管理
<a name="prompt-caching-simplified"></a>

对于 Claude 模型，Amazon Bedrock 提供了简化的缓存管理方法，可减少手动放置缓存检查点时的复杂性。您无需指定确切的缓存检查点位置，只需在静态内容末尾设置一个断点即可使用自动缓存管理。

启用简化的缓存管理后，系统会自动在之前的内容块边界处检查缓存命中情况，从指定的断点开始回溯最多约 20 个内容块。这样，模型就可以从缓存中找到最长的匹配前缀，无需您预测最佳检查点位置。要使用此功能，请在静态内容的末尾且在任何动态或可变内容之前放置一个缓存检查点。系统将自动查找最佳缓存匹配。

为了实现更精细的控制，您仍然可以使用多个缓存检查点（Claude 模型最多支持 4 个）来指定确切的缓存边界。如果您要缓存的内容片段更新频率不同，或者您希望更精确地控制哪些内容会被缓存时，就应该使用多个缓存检查点。

**重要**  
自动前缀检查只会从缓存检查点回溯大约 20 个内容块。如果您的静态内容超出了此范围，建议使用多个缓存检查点，或者重构提示以将最常重复使用的内容置于此范围内。

## 如何有效使用提示缓存
<a name="prompt-caching-effective-use"></a>

如果您有按常规节奏使用的提示（即系统提示的使用频率高于每 5 分钟一次），请继续使用 5 分钟缓存，因为该缓存将继续刷新，无需额外付费。

1 小时缓存最适合用于以下场景：
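+ When your prompts are likely to be used less often than every 5 minutes but more often than every hour. For example, when an agent's sub-agent takes longer than 5 minutes, or when you store a long chat conversation with a user and you generally expect that the user might not respond within the next 5 minutes.
+ When latency is important and your follow-up prompts might be sent after more than 5 minutes.
+ When you want to improve your rate limit utilization, because cache hits aren't deducted from your rate limit.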

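You can use both 1-hour and 5-minute cache controls in the same request, with one important constraint: cache entries with a longer TTL must appear before those with shorter TTLs (that is, a 1-hour cache entry must appear before any 5-minute cache entries).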
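
The ordering constraint can be checked before a request is sent. The following is a minimal sketch: the function name is hypothetical, and it assumes that you have already collected the `ttl` values of your cache checkpoints in the order in which they appear in the request, with `None` standing for a checkpoint that relies on the 5-minute default.

```python
# TTL values accepted in cachePoint / cache_control, in seconds.
_TTL_SECONDS = {"5m": 300, "1h": 3600}


def ttls_in_valid_order(ttls):
    """Return True if no longer-TTL checkpoint follows a shorter-TTL one.

    ttls: TTL strings ("5m" or "1h") in request order. None means that
    the checkpoint omits `ttl` and therefore uses the 5-minute default.
    """
    seconds = [_TTL_SECONDS[ttl or "5m"] for ttl in ttls]
    # Valid when the sequence never increases from one checkpoint to the next.
    return all(earlier >= later for earlier, later in zip(seconds, seconds[1:]))


print(ttls_in_valid_order(["1h", "5m"]))  # True
print(ttls_in_valid_order(["5m", "1h"]))  # False
```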

## 开始使用
<a name="prompt-caching-get-started"></a>

以下部分针对通过 Amazon Bedrock 与模型进行交互的每种方法，简要概述了如何使用提示缓存功能。

### Converse API
<a name="prompt-caching-converse"></a>

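The [Converse](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html) API provides advanced and flexible options for implementing prompt caching in multi-turn conversations. For more information about the prompt requirements for each model, see the preceding [Supported models, Regions, and limits](#prompt-caching-models) section.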

**示例请求**

以下示例展示在发送到 Converse API 请求的 `messages`、`system` 或 `tools` 字段中设置的缓存检查点。对于给定请求，您可以将检查点放在这些位置中的任何一个。例如，如果要向 Claude 3.5 Sonnet v2 模型发送请求，您可以在 `messages` 中放置两个缓存检查点，以及分别在 `system` 和 `tools` 中放置一个缓存检查点。有关构造和发送 Converse API 请求的更多详细信息以及示例，请参阅[使用 Converse API 操作进行对话](conversation-inference.md)。

按如下方式指定所需的 ttl 值，如果未指定 ttl 值，则适用 5 分钟缓存的默认行为。

```
"cachePoint" : {
    "type": "default",
    "ttl" : "5m | 1h"
}
```

------
#### [ messages checkpoints ]

在此示例中，第一个 `image` 字段为模型提供图像，第二个 `text` 字段要求模型分析图像。只要 `content` 对象中 `cachePoint` 前面的词元数达到了模型的最小词元数，系统就会创建一个缓存检查点。

```
...
"messages": [
   {
        "role": "user",
        "content": [
            {
                "image": {
                    "bytes": "asfb14tscve..."
                }
            },
            {
                "text": "What's in this image?"
            },
            {
                "cachePoint": {
                    "type": "default"
                }
            }
      ]
  }
]
...
```

------
#### [ system checkpoints ]

在此示例中，您将在 `text` 字段中提供系统提示。此外，您还可以添加一个 `cachePoint` 字段来缓存系统提示。

```
...
  "system": [ 
    {
        "text": "You are an app that creates play lists for a radio station that plays rock and pop music. Only return song names and the artist. "
    },
    {
        "cachePoint": {
            "type": "default"
        }
    }
  ],
...
```

------
#### [ tools checkpoints ]

在此示例中，您在 `toolSpec` 字段中提供工具定义。（或者，您也可以调用之前定义的工具。有关更多信息，请参阅[使用工具完成 Amazon Bedrock 模型响应](tool-use.md)。） 之后，您可以添加 `cachePoint` 字段来缓存该工具。

```
...
toolConfig={
    "tools": [
        {
            "toolSpec": {
                "name": "top_song",
                "description": "Get the most popular song played on a radio station.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "sign": {
                                "type": "string",
                                "description": "The call sign for the radio station for which you want the most popular song. Example calls signs are WZPZ and WKRP."
                            }
                        },
                        "required": [
                            "sign"
                        ]
                    }
                }
            }
        },
        {
                "cachePoint": {
                    "type": "default"
                }
        }
    ]
}
...
```

------

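The model response from the Converse API includes three new fields that are specific to prompt caching. The `CacheReadInputTokens` value tells you how many tokens were read from the cache because of your previous requests, and the `CacheWriteInputTokens` value tells you how many tokens were written to the cache. The `CacheDetails` values tell you the TTL that was used for the tokens written to the cache. Amazon Bedrock charges you for these token counts at a rate that is lower than the cost of full model inference.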
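
To see how much of a request was served from the cache, you can summarize these fields from the response. The following sketch rests on two assumptions that you should verify against the API reference: that the fields appear in the response's `usage` object under the camelCase keys `cacheReadInputTokens` and `cacheWriteInputTokens`, and that `inputTokens` counts only the input tokens that were neither read from nor written to the cache. The function name is hypothetical.

```python
def summarize_cache_usage(converse_response):
    """Summarize prompt-cache usage from a Converse API response dict."""
    usage = converse_response.get("usage", {})
    read = usage.get("cacheReadInputTokens", 0)
    written = usage.get("cacheWriteInputTokens", 0)
    uncached = usage.get("inputTokens", 0)
    total = read + written + uncached
    return {
        "read": read,
        "written": written,
        "uncached": uncached,
        # Fraction of all input tokens that were served from the cache.
        "read_fraction": read / total if total else 0.0,
    }
```

A `read_fraction` that stays near zero across repeated requests suggests that the prompt prefix is changing between calls or that the checkpoint falls below the model's minimum token count.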

### InvokeModel API
<a name="prompt-caching-invoke"></a>

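Prompt caching is enabled by default when you call the [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) API. You can set cache checkpoints at any point in your request body, similar to the previous example for the Converse API.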

------
#### [ Anthropic Claude ]

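The following example shows how to structure the body of an InvokeModel request for the Anthropic Claude 3.5 Sonnet v2 model. Note that the exact format and fields of the body for InvokeModel requests may vary depending on the model you choose. To see the format and content of the request and response bodies for different models, see [Inference request parameters and response fields for foundation models](model-parameters.md).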

按如下方式指定所需的 ttl 值，如果未指定 ttl 值，则适用 5 分钟缓存的默认行为。

```
"cache_control" : {
    "type": "ephemeral",
    "ttl" : "5m | 1h"
}
```

```
body = {
    "anthropic_version": "bedrock-2023-05-31",
    "system": "Reply concisely",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe the best way to learn programming."
                },
                {
                    "type": "text",
                    "text": "Add additional context here for the prompt that meets the minimum token requirement for your chosen model.",
                    "cache_control": {
                        "type": "ephemeral"
                    }
                }
            ]
        }
    ],
    "max_tokens": 2048,
    "temperature": 0.5,
    "top_p": 0.8,
    "stop_sequences": [
        "stop"
    ],
    "top_k": 250
}
```
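The two snippets above can be combined programmatically. The following Python sketch builds a request body with a cache checkpoint and an optional `ttl`, rejecting any value other than the two shown above (`5m` and `1h`). The helper names `cache_control` and `build_claude_body` are hypothetical and are not part of any SDK.

```python
import json

ALLOWED_TTLS = {"5m", "1h"}


def cache_control(ttl=None):
    """Build a cache_control object; omit ttl to get the 5-minute default."""
    block = {"type": "ephemeral"}
    if ttl is not None:
        if ttl not in ALLOWED_TTLS:
            raise ValueError(f"ttl must be one of {sorted(ALLOWED_TTLS)}, got {ttl!r}")
        block["ttl"] = ttl
    return block


def build_claude_body(question, cached_context, ttl=None, max_tokens=2048):
    """Assemble a request body with a cache checkpoint on cached_context."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "system": "Reply concisely",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {
                        "type": "text",
                        "text": cached_context,
                        "cache_control": cache_control(ttl),
                    },
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


body = build_claude_body(
    "Describe the best way to learn programming.",
    "Add additional context here for the prompt that meets the minimum "
    "token requirement for your chosen model.",
    ttl="1h",
)
payload = json.dumps(body)  # JSON string to send as the request body
print(payload)
```

Validating the `ttl` locally surfaces a typo such as `"1hr"` before the request is sent, rather than relying on the service to reject it.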

------
#### [ Amazon Nova ]

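The following example shows how to structure the body of an InvokeModel request for the Amazon Nova model. Note that the exact format and fields of the InvokeModel request body can vary depending on the model you choose. To see the format and content of the request and response bodies for different models, see [Inference request parameters and response fields for foundation models](model-parameters.md).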

```
{
    "system": [{
        "text": "Reply Concisely"
    }],
    "messages": [{
        "role": "user",
        "content": [{
            "text": "Describe the best way to learn programming"
        },
        {
            "text": "Add additional context here for the prompt that meets the minimum token requirement for your chosen model.",
            "cachePoint": {
                "type": "default"
            }
        }]
    }],
    "inferenceConfig": {
        "maxTokens": 300,
        "topP": 0.1,
        "topK": 20,
        "temperature": 0.3
    }
}
```
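In the body above, the checkpoint is expressed as a `cachePoint` key on the content block itself. As a small sketch, the following Python helper marks an existing content block as a checkpoint without modifying the original dictionary. The helper name `with_cache_point` is hypothetical.

```python
import json


def with_cache_point(content_block):
    """Return a copy of a content block marked as a cache checkpoint."""
    marked = dict(content_block)  # shallow copy so the caller's dict is untouched
    marked["cachePoint"] = {"type": "default"}
    return marked


context = {
    "text": "Add additional context here for the prompt that meets the "
            "minimum token requirement for your chosen model."
}

nova_body = {
    "system": [{"text": "Reply Concisely"}],
    "messages": [
        {
            "role": "user",
            "content": [
                {"text": "Describe the best way to learn programming"},
                with_cache_point(context),
            ],
        }
    ],
    "inferenceConfig": {"maxTokens": 300, "topP": 0.1, "topK": 20, "temperature": 0.3},
}

print(json.dumps(nova_body, indent=4))
```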

------

For more information about sending an InvokeModel request, see [Submit a single prompt with InvokeModel](inference-invoke.md).

### Playground
<a name="prompt-caching-playground"></a>

In the chat playground in the Amazon Bedrock console, you can turn on the prompt caching option, and Amazon Bedrock automatically creates cache checkpoints for you.

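Follow the instructions in [Generate responses in the console using playgrounds](playgrounds.md) to start prompting in an Amazon Bedrock playground. For supported models, prompt caching is turned on automatically in the playground. If it isn't, do the following to turn it on: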

1. In the left side panel, open the **Configurations** menu.

1. Turn on the **Prompt caching** toggle.

1. Run your prompts.

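After your combined input and model responses reach the minimum number of tokens required for a checkpoint (which varies by model), Amazon Bedrock automatically creates the first cache checkpoint for you. As you continue chatting, a new checkpoint is created each time the minimum number of tokens is reached again, up to the maximum number of checkpoints allowed for the model. You can view your cache checkpoints at any time by choosing **View cache checkpoints** next to the **Prompt caching** toggle, as shown in the following screenshot.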

![\[UI toggle for prompt caching in the Amazon Bedrock text playground.\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/images/prompt-caching/bedrock-prompt-caching-ui-toggle.png)
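The automatic checkpoint behavior described above can be illustrated with a small simulation. This Python sketch is only one interpretation of that description: it assumes that tokens accumulate across turns, that a checkpoint is created whenever the count since the last checkpoint reaches the minimum, and that no checkpoints are created after the maximum is reached. The function name and all numbers are hypothetical; the actual minimum and maximum depend on the model.

```python
def simulate_checkpoints(turn_token_counts, min_tokens, max_checkpoints):
    """Return the 1-based turn numbers at which a checkpoint would be created."""
    checkpoints = []
    pending = 0  # tokens accumulated since the last checkpoint
    for turn, tokens in enumerate(turn_token_counts, start=1):
        pending += tokens
        if pending >= min_tokens and len(checkpoints) < max_checkpoints:
            checkpoints.append(turn)
            pending = 0
    return checkpoints


# Hypothetical conversation: tokens added per turn (input plus response).
turns = [400, 700, 300, 900, 1200, 1100, 1500]
print(simulate_checkpoints(turns, min_tokens=1024, max_checkpoints=4))  # [2, 4, 5, 6]
```

With these made-up numbers, the seventh turn exceeds the minimum but creates no checkpoint because the assumed maximum of four has already been reached.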


You can see how many tokens are read from and written to the cache in each interaction with the model by viewing the **Cache metrics** pop-up (![\[The metrics icon shown in model responses when prompt caching is enabled.\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/images/prompt-caching/bedrock-prompt-caching-metrics-icon.png)) in the playground responses.

![\[Cache metrics box that shows the number of tokens read from and written to the cache.\]](http://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/images/prompt-caching/bedrock-prompt-caching-metrics.png)


If you turn off the prompt caching toggle in the middle of a conversation, you can continue chatting with the model.