Adaptive thinking

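Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.6. Instead of manually setting a thinking token budget, adaptive thinking lets Claude dynamically decide when and how much to think based on the complexity of each request. Adaptive thinking reliably delivers better performance than extended thinking with a fixed budget_tokens. We recommend moving to adaptive thinking to get the most intelligent responses from Claude Opus 4.6. No beta header is required.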

The following models are supported:

Model	Model ID
Claude Opus 4.7	`anthropic.claude-opus-4-7`
Claude Mythos Preview	`anthropic.claude-mythos-preview`
Claude Opus 4.6	`anthropic.claude-opus-4-6-v1`
Claude Sonnet 4.6	`anthropic.claude-sonnet-4-6`

Note

Claude Opus 4.7 and Claude Mythos Preview support adaptive thinking only. Manual extended thinking (thinking.type: "enabled" with budget_tokens) is not supported on these models and returns a 400 error.

thinking.type: "enabled" and budget_tokens are deprecated on Claude Opus 4.6 and Claude Sonnet 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead.

Older models (such as Claude Sonnet 4.5 and Claude Opus 4.5) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.
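
The support rules in this note can be expressed as a small selection helper. The following is a minimal sketch, not part of any SDK: the function name `thinking_config`, the substring matching on model IDs, and the default budget of 8000 tokens are all assumptions made for illustration. The model names come from the table of supported models.

```python
# Hypothetical helper: choose the `thinking` object from a Bedrock model ID.
# Substring matching is an assumption so that IDs carrying an inference-profile
# prefix such as "us." still match the names in the supported-models table.
ADAPTIVE_MODELS = (
    "claude-opus-4-7",
    "claude-mythos-preview",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
)


def thinking_config(model_id: str, budget_tokens: int = 8000) -> dict:
    """Return the `thinking` object to place in the request body."""
    if any(name in model_id for name in ADAPTIVE_MODELS):
        return {"type": "adaptive"}
    # Older models: manual extended thinking with an explicit token budget.
    # The 8000-token default is an arbitrary placeholder, not a recommendation.
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("us.anthropic.claude-opus-4-6-v1"))
```

Keeping the choice in one place means a later model upgrade changes a single tuple rather than every call site.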

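How adaptive thinking works

In adaptive mode, Claude evaluates the complexity of each request and decides whether to think and how much. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.

Adaptive thinking also automatically enables interleaved thinking (beta). This means Claude can think between tool calls, which makes it especially effective for agentic workflows.

Set thinking.type to "adaptive" in your API request: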

CLI


aws bedrock-runtime invoke-model \
--model-id "us.anthropic.claude-opus-4-6-v1" \
--body '{
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": 16000,
"thinking": {
"type": "adaptive"
},
"messages": [
{
"role": "user",
"content": "Three players A, B, C play a game. Each has a jar with 100 balls numbered 1-100. Simultaneously, each draws one ball. A beats B if As number > Bs number (mod 100, treating 100 as 0 for comparison). Similarly for B vs C and C vs A. The overall winner is determined by majority of pairwise wins (ties broken randomly). Is there a mixed strategy Nash equilibrium where each player draws uniformly? If not, characterize the equilibrium."
}
]
}' \
--cli-binary-format raw-in-base64-out \
output.json && cat output.json | jq '.content[] | {type, thinking: .thinking[0:200], text}'

Python


import boto3
import json

bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-2'
)

response = bedrock_runtime.invoke_model(
    modelId="us.anthropic.claude-opus-4-6-v1",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 16000,
        "thinking": {
            "type": "adaptive"
        },
        "messages": [{
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }]
    })
)

response_body = json.loads(response["body"].read())

for block in response_body["content"]:
    if block["type"] == "thinking":
        print(f"\nThinking: {block['thinking']}")
    elif block["type"] == "text":
        print(f"\nResponse: {block['text']}")

TypeScript


import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

async function main() {
    const client = new BedrockRuntimeClient({});

    const command = new InvokeModelCommand({
        modelId: "us.anthropic.claude-opus-4-6-v1",
        body: JSON.stringify({
            anthropic_version: "bedrock-2023-05-31",
            max_tokens: 16000,
            thinking: {
                type: "adaptive"
            },
            messages: [{
                role: "user",
                content: "Explain why the sum of two even numbers is always even."
            }]
        })
    });

    const response = await client.send(command);
    const responseBody = JSON.parse(new TextDecoder().decode(response.body));

    for (const block of responseBody.content) {
        if (block.type === "thinking") {
            console.log(`\nThinking: ${block.thinking}`);
        } else if (block.type === "text") {
            console.log(`\nResponse: ${block.text}`);
        }
    }
}

main().catch(console.error);
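
Because Claude may skip thinking for simpler requests, code that reads the response should not assume a thinking block is present. The following is a minimal sketch of a defensive reader for a parsed InvokeModel response body. The helper name `split_response` and the hand-written sample body are hypothetical; the `content`, `type`, `thinking`, and `text` fields are the ones the examples above already read.

```python
def split_response(response_body: dict) -> dict:
    """Separate thinking blocks from text blocks in a parsed response body."""
    blocks = response_body.get("content", [])
    thinking = [b.get("thinking", "") for b in blocks if b.get("type") == "thinking"]
    text = [b.get("text", "") for b in blocks if b.get("type") == "text"]
    return {
        "thought": bool(thinking),  # False when Claude skipped thinking
        "thinking": "\n".join(thinking),
        "text": "\n".join(text),
    }


# Hand-written sample standing in for a real response with no thinking block.
sample = {"content": [{"type": "text", "text": "2 + 2 = 4"}]}
result = split_response(sample)
print(result["thought"], result["text"])
```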

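Adaptive thinking with the effort parameter

You can combine adaptive thinking with the effort parameter to guide how much Claude thinks. The effort level acts as soft guidance for Claude's thinking allocation.

Effort level	Thinking behavior
`max`	Claude always thinks, with no constraints on thinking depth. Claude Opus 4.6 only. Requests that use `max` on other models return an error.
`high` (default)	Claude always thinks. Provides deep reasoning on complex tasks.
`medium`	Claude uses moderate thinking. It may skip thinking for very simple queries.
`low`	Claude minimizes thinking. It skips thinking for simple tasks where speed matters most.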

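Important

The effort parameter must be placed in a separate output_config object in the request body, not inside the thinking object. Placing effort inside thinking results in a ValidationException.

The following example shows how to set the effort level when using the InvokeModel API: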


{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 16000,
    "thinking": {
        "type": "adaptive"
    },
    "output_config": {
        "effort": "high"
    },
    "messages": [{
        "role": "user",
        "content": "Your prompt here"
    }]
}
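
A request body like this one can be assembled programmatically so that effort always lands in output_config and never inside thinking. The following is a minimal sketch under stated assumptions: `build_body` is a hypothetical helper, the client-side checks simply mirror the documented effort levels, and the substring test for Claude Opus 4.6 is an illustrative shortcut rather than an official rule.

```python
import json

VALID_EFFORTS = ("low", "medium", "high", "max")


def build_body(prompt: str, effort: str = "high", model_id: str = "") -> str:
    """Build an InvokeModel request body with adaptive thinking and an effort level."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    # `max` is documented as Claude Opus 4.6 only; fail early instead of
    # waiting for the service to reject the request.
    if effort == "max" and "claude-opus-4-6" not in model_id:
        raise ValueError("effort 'max' is only supported on Claude Opus 4.6")
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        # effort goes in its own output_config object, not inside thinking.
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)


print(build_body("Your prompt here", effort="medium"))
```

The returned string can be passed as the `body` argument of `invoke_model`, as in the Python example earlier on this page.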

Using adaptive thinking with the Converse API

When you use the Converse API, pass the thinking and effort parameters inside additionalModelRequestFields. The following example shows adaptive thinking at the default effort level:


import boto3, json

bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name='us-east-2')

response = bedrock_runtime.converse(
    modelId="us.anthropic.claude-opus-4-6-v1",
    messages=[{
        "role": "user",
        "content": [{"text": "Explain why the sum of two even numbers is always even."}]
    }],
    additionalModelRequestFields={
        "thinking": {
            "type": "adaptive"
        }
    }
)

print(json.dumps(response["output"], indent=2, default=str))

To specify an effort level, add the effort field inside a separate output_config object in additionalModelRequestFields:


response = bedrock_runtime.converse(
    modelId="us.anthropic.claude-opus-4-6-v1",
    messages=[{
        "role": "user",
        "content": [{"text": "What is 2 + 2?"}]
    }],
    additionalModelRequestFields={
        "thinking": {
            "type": "adaptive"
        },
        "output_config": {
            "effort": "low"
        }
    }
)
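
The Converse examples above print or discard the raw output. The following is a minimal sketch of separating reasoning from the final answer in a Converse response. It assumes reasoning is returned in content blocks keyed `reasoningContent` with a nested `reasoningText.text` field; verify that shape against the Converse API reference before relying on it. The helper name `extract_converse_output` and the sample response are hypothetical.

```python
def extract_converse_output(response: dict) -> dict:
    """Collect reasoning text and answer text from a Converse response dict."""
    reasoning, answer = [], []
    for block in response["output"]["message"]["content"]:
        if "reasoningContent" in block:
            # Assumed shape: {"reasoningContent": {"reasoningText": {"text": ...}}}
            reasoning_text = block["reasoningContent"].get("reasoningText", {})
            reasoning.append(reasoning_text.get("text", ""))
        elif "text" in block:
            answer.append(block["text"])
    return {"reasoning": "\n".join(reasoning), "text": "\n".join(answer)}


# Hand-written sample standing in for a real Converse response.
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"reasoningContent": {"reasoningText": {"text": "Add the numbers."}}},
                {"text": "4"},
            ],
        }
    }
}
print(extract_converse_output(sample_response))
```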

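Prompt caching

Consecutive requests that use adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks cache breakpoints for messages. System prompts and tool definitions remain cached regardless of mode changes.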
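
One way to avoid accidental cache misses is to check for a mode switch before sending the next request in a conversation. The following is a minimal sketch: `crosses_adaptive_boundary` is a hypothetical helper that covers only the switch described above (between adaptive and enabled/disabled) and makes no claim about other transitions.

```python
THINKING_MODES = ("adaptive", "enabled", "disabled")


def crosses_adaptive_boundary(previous_mode: str, next_mode: str) -> bool:
    """Return True when a request switches between adaptive and a non-adaptive mode."""
    for mode in (previous_mode, next_mode):
        if mode not in THINKING_MODES:
            raise ValueError(f"unknown thinking mode: {mode!r}")
    # True only when exactly one of the two modes is "adaptive".
    return (previous_mode == "adaptive") != (next_mode == "adaptive")


if crosses_adaptive_boundary("adaptive", "disabled"):
    print("Message cache breakpoints will be invalidated by this request.")
```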

Tuning thinking behavior

If Claude is thinking more often than you want, you can add guidance to your system prompt:


Extended thinking adds latency and should only be used when it
will meaningfully improve answer quality — typically for problems
that require multi-step reasoning. When in doubt, respond directly.

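Warning

Steering Claude to think less often can reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workloads before you deploy prompt-based tuning to production. Consider testing with lower effort levels first.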
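
To compare prompt-based tuning against a lower effort level, it helps to build both request variants from the same function. The following is a minimal sketch for the Converse API. `build_converse_request` and `GUIDANCE` are hypothetical names, and the guidance string is an abbreviated form of the system prompt shown above. The `system` field uses the Converse API's list of text blocks.

```python
# Abbreviated version of the system prompt guidance shown above.
GUIDANCE = (
    "Extended thinking adds latency and should only be used when it will "
    "meaningfully improve answer quality. When in doubt, respond directly."
)


def build_converse_request(model_id, prompt, system_guidance=None, effort=None) -> dict:
    """Build keyword arguments for a Converse call with adaptive thinking."""
    request = {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "additionalModelRequestFields": {"thinking": {"type": "adaptive"}},
    }
    if system_guidance:
        request["system"] = [{"text": system_guidance}]
    if effort:
        # effort goes in a separate output_config object, not inside thinking.
        request["additionalModelRequestFields"]["output_config"] = {"effort": effort}
    return request


model = "us.anthropic.claude-opus-4-6-v1"
prompt_tuned = build_converse_request(model, "What is 2 + 2?", system_guidance=GUIDANCE)
low_effort = build_converse_request(model, "What is 2 + 2?", effort="low")
print(sorted(prompt_tuned), sorted(low_effort))
```

Each dictionary can be sent with `bedrock_runtime.converse(**request)`, so the two variants can be run against the same workload and compared on latency and answer quality.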
