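Adaptive thinking

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.6. Instead of requiring you to set a thinking token budget manually, adaptive thinking lets Claude decide dynamically when and how much to think, based on the complexity of each request. Adaptive thinking reliably delivers better performance than extended thinking with a fixed budget_tokens, so we recommend switching to adaptive thinking to get the most intelligent responses from Claude Opus 4.6. No beta header is required.

The following models are supported: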

Model	Model ID
Claude Opus 4.7	`anthropic.claude-opus-4-7`
Claude Mythos Preview	`anthropic.claude-mythos-preview`
Claude Opus 4.6	`anthropic.claude-opus-4-6-v1`
Claude Sonnet 4.6	`anthropic.claude-sonnet-4-6`

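Note

Claude Opus 4.7 and Claude Mythos Preview support adaptive thinking only. These models do not support manual extended thinking (thinking.type: "enabled" with budget_tokens) and return a 400 error if you request it.

thinking.type: "enabled" and budget_tokens are deprecated on Claude Opus 4.6 and Claude Sonnet 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead.

Older models (Claude Sonnet 4.5, Claude Opus 4.5, and so on) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.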
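
The support rules in this note can be captured in a small helper that picks a thinking configuration from a model ID. This is a minimal sketch: the function name `thinking_config_for`, the substring matching, and the default budget of 4096 tokens are illustrative assumptions, not part of the Bedrock API. The model ID fragments come from the supported-models table.

```python
# Model ID fragments taken from the supported-models table.
ADAPTIVE_MODELS = (
    "anthropic.claude-opus-4-7",
    "anthropic.claude-mythos-preview",
    "anthropic.claude-opus-4-6",
    "anthropic.claude-sonnet-4-6",
)


def thinking_config_for(model_id: str, budget_tokens: int = 4096) -> dict:
    """Return a `thinking` object suited to the given model ID.

    Substring matching is an assumption made so that prefixed IDs such as
    "us.anthropic.claude-opus-4-6-v1" are also recognized.
    """
    if any(fragment in model_id for fragment in ADAPTIVE_MODELS):
        return {"type": "adaptive"}
    # Anything else is treated as an older model that needs manual
    # extended thinking with an explicit budget (an assumption of this sketch).
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

Note that the sketch treats every unrecognized ID as an older model, so the tuple would need updating as the supported list changes.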

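How adaptive thinking works

In adaptive mode, Claude evaluates the complexity of each request and decides whether to think and how much. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.

Adaptive thinking also automatically enables interleaved thinking (beta). This means Claude can think between tool calls, which makes it especially effective for agentic workflows.

Set thinking.type to "adaptive" in your API request: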

CLI


aws bedrock-runtime invoke-model \
--model-id "us.anthropic.claude-opus-4-6-v1" \
--body '{
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": 16000,
"thinking": {
"type": "adaptive"
},
"messages": [
{
"role": "user",
"content": "Three players A, B, C play a game. Each has a jar with 100 balls numbered 1-100. Simultaneously, each draws one ball. A beats B if As number > Bs number (mod 100, treating 100 as 0 for comparison). Similarly for B vs C and C vs A. The overall winner is determined by majority of pairwise wins (ties broken randomly). Is there a mixed strategy Nash equilibrium where each player draws uniformly? If not, characterize the equilibrium."
}
]
}' \
--cli-binary-format raw-in-base64-out \
output.json && cat output.json | jq '.content[] | {type, thinking: .thinking[0:200], text}'

Python


import boto3
import json

bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-2'
)

response = bedrock_runtime.invoke_model(
    modelId="us.anthropic.claude-opus-4-6-v1",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 16000,
        "thinking": {
            "type": "adaptive"
        },
        "messages": [{
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }]
    })
)

response_body = json.loads(response["body"].read())

for block in response_body["content"]:
    if block["type"] == "thinking":
        print(f"\nThinking: {block['thinking']}")
    elif block["type"] == "text":
        print(f"\nResponse: {block['text']}")

TypeScript


import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

async function main() {
    const client = new BedrockRuntimeClient({});

    const command = new InvokeModelCommand({
        modelId: "us.anthropic.claude-opus-4-6-v1",
        body: JSON.stringify({
            anthropic_version: "bedrock-2023-05-31",
            max_tokens: 16000,
            thinking: {
                type: "adaptive"
            },
            messages: [{
                role: "user",
                content: "Explain why the sum of two even numbers is always even."
            }]
        })
    });

    const response = await client.send(command);
    const responseBody = JSON.parse(new TextDecoder().decode(response.body));

    for (const block of responseBody.content) {
        if (block.type === "thinking") {
            console.log(`\nThinking: ${block.thinking}`);
        } else if (block.type === "text") {
            console.log(`\nResponse: ${block.text}`);
        }
    }
}

main().catch(console.error);
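
Because Claude may skip thinking on simpler requests, code that consumes the response should not assume a thinking block is present. The following is a minimal sketch of a defensive parser for the InvokeModel response body. The helper name `split_blocks` and the two sample bodies are hand-written for illustration and are not real model output.

```python
from typing import Optional, Tuple


def split_blocks(response_body: dict) -> Tuple[Optional[str], str]:
    """Split a parsed InvokeModel response body into (thinking, text).

    `thinking` is None when the response contains no thinking block.
    """
    thinking_parts = []
    text_parts = []
    for block in response_body.get("content", []):
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
    thinking = "\n".join(thinking_parts) if thinking_parts else None
    return thinking, "".join(text_parts)


# Hand-written sample bodies for illustration only.
with_thinking = {"content": [
    {"type": "thinking", "thinking": "2a + 2b = 2(a + b)"},
    {"type": "text", "text": "The sum is even."},
]}
without_thinking = {"content": [{"type": "text", "text": "4"}]}

print(split_blocks(with_thinking))
print(split_blocks(without_thinking))
```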

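Adaptive thinking with the effort parameter

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for how Claude allocates thinking:

Effort level	Thinking behavior
`max`	Claude always thinks, with no constraint on thinking depth. Claude Opus 4.6 only; requests that use `max` on other models return an error.
`high` (default)	Claude always thinks. Provides deep reasoning on complex tasks.
`medium`	Claude uses moderate thinking. May skip thinking for very simple queries.
`low`	Claude minimizes thinking. Skips thinking for simple tasks where speed matters most.

Important

The effort parameter must be placed in a separate output_config object in the request body, not inside the thinking object. Placing effort inside thinking results in a ValidationException.

The following example shows how to set the effort level when using the InvokeModel API: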


{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 16000,
    "thinking": {
        "type": "adaptive"
    },
    "output_config": {
        "effort": "high"
    },
    "messages": [{
        "role": "user",
        "content": "Your prompt here"
    }]
}
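
The same body can be assembled in code, which makes it harder to put effort in the wrong place. This is a sketch under stated assumptions: the helper name `build_adaptive_body` and its client-side check are illustrative and not part of any SDK; the effort values are the ones listed in the table in this section.

```python
import json

# Effort levels listed in the table in this section.
EFFORT_LEVELS = ("low", "medium", "high", "max")


def build_adaptive_body(prompt: str, effort: str = "high",
                        max_tokens: int = 16000) -> str:
    """Serialize an InvokeModel body with adaptive thinking and an effort level."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},
        # effort goes in output_config, never inside the thinking object.
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)
```

The returned string can be passed as the `body` argument of `invoke_model`, in the same way as the Python example earlier on this page.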

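Using adaptive thinking with the Converse API

When you use the Converse API, pass the thinking and effort parameters inside additionalModelRequestFields. The following example shows adaptive thinking with the default effort level: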


import boto3, json

bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name='us-east-2')

response = bedrock_runtime.converse(
    modelId="us.anthropic.claude-opus-4-6-v1",
    messages=[{
        "role": "user",
        "content": [{"text": "Explain why the sum of two even numbers is always even."}]
    }],
    additionalModelRequestFields={
        "thinking": {
            "type": "adaptive"
        }
    }
)

print(json.dumps(response["output"], indent=2, default=str))

To specify an effort level, add the effort field inside a separate output_config object in additionalModelRequestFields:


response = bedrock_runtime.converse(
    modelId="us.anthropic.claude-opus-4-6-v1",
    messages=[{
        "role": "user",
        "content": [{"text": "What is 2 + 2?"}]
    }],
    additionalModelRequestFields={
        "thinking": {
            "type": "adaptive"
        },
        "output_config": {
            "effort": "low"
        }
    }
)
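
To read the result of a Converse call, thinking and answer text can be separated by walking the content blocks of the output message. This sketch assumes the Converse response shape in which thinking arrives as a `reasoningContent` block containing `reasoningText`; check this against the response you actually receive. The helper name `split_converse_output` and the sample dictionary are illustrative.

```python
from typing import Optional, Tuple


def split_converse_output(response: dict) -> Tuple[Optional[str], str]:
    """Return (reasoning, answer) from a Converse response dictionary.

    `reasoning` is None when the model skipped thinking.
    """
    blocks = response.get("output", {}).get("message", {}).get("content", [])
    reasoning = []
    answer = []
    for block in blocks:
        if "reasoningContent" in block:
            # Assumed shape: {"reasoningContent": {"reasoningText": {"text": ...}}}
            reasoning_text = block["reasoningContent"].get("reasoningText", {})
            if "text" in reasoning_text:
                reasoning.append(reasoning_text["text"])
        elif "text" in block:
            answer.append(block["text"])
    return ("\n".join(reasoning) if reasoning else None, "".join(answer))


# Hand-written sample for illustration, not real model output.
sample = {"output": {"message": {"role": "assistant", "content": [
    {"reasoningContent": {"reasoningText": {"text": "Simple arithmetic."}}},
    {"text": "4"},
]}}}

print(split_converse_output(sample))
```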

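Prompt caching

Consecutive requests that use adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks cache breakpoints for messages. System prompts and tool definitions remain cached regardless of mode changes.

Tuning thinking behavior

If Claude thinks more or less often than you want, you can add guidance to your system prompt: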
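
The caching rule can be expressed as a small check to run before changing thinking modes mid-conversation. This is only a sketch of the rule as stated here: the function name is hypothetical, and transitions that do not involve adaptive mode are reported as unknown because this section does not describe them.

```python
from typing import Optional


def message_cache_preserved(previous_mode: str, next_mode: str) -> Optional[bool]:
    """Say whether message cache breakpoints survive between two requests.

    Modes are thinking.type values: "adaptive", "enabled", or "disabled".
    Returns None when neither request is adaptive, since that case is
    not covered by the rule described in this section.
    """
    adaptive_count = [previous_mode, next_mode].count("adaptive")
    if adaptive_count == 2:
        return True   # consecutive adaptive requests keep breakpoints
    if adaptive_count == 1:
        return False  # switching into or out of adaptive breaks them
    return None


print(message_cache_preserved("adaptive", "adaptive"))
print(message_cache_preserved("adaptive", "enabled"))
```

System prompt and tool definition caches are outside the scope of this check, because they stay cached regardless of the mode change.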


Extended thinking adds latency and should only be used when it
will meaningfully improve answer quality — typically for problems
that require multi-step reasoning. When in doubt, respond directly.

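Warning

Steering Claude to think less often may reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workloads before you deploy prompt-based tuning to production. Consider testing with lower effort levels first.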
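
To compare prompt-based tuning against a lower effort level, it can help to build both request variants from one function. The following is a minimal sketch for the Converse API: `build_converse_kwargs` is a hypothetical helper, and the guidance string is a shortened paraphrase used only for illustration.

```python
from typing import Optional


def build_converse_kwargs(model_id: str, prompt: str,
                          guidance: Optional[str] = None,
                          effort: Optional[str] = None) -> dict:
    """Build keyword arguments for a Converse call with adaptive thinking."""
    extra = {"thinking": {"type": "adaptive"}}
    if effort is not None:
        # effort sits in its own output_config object, not inside thinking.
        extra["output_config"] = {"effort": effort}
    kwargs = {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "additionalModelRequestFields": extra,
    }
    if guidance:
        # Converse takes the system prompt as a list of text blocks.
        kwargs["system"] = [{"text": guidance}]
    return kwargs


MODEL_ID = "us.anthropic.claude-opus-4-6-v1"

# Variant 1: steer thinking through the system prompt.
prompt_tuned = build_converse_kwargs(
    MODEL_ID, "What is 2 + 2?",
    guidance="Respond directly unless the problem needs multi-step reasoning.",
)
# Variant 2: leave the prompt alone and lower the effort level.
effort_tuned = build_converse_kwargs(MODEL_ID, "What is 2 + 2?", effort="low")
```

Either dictionary can then be sent with `bedrock_runtime.converse(**prompt_tuned)` using a client created as in the Converse examples on this page, and the two results compared on your own workload.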
