View a markdown version of this page

Inference using Chat Completions API - Amazon Bedrock

Inference using Chat Completions API

The OpenAI Chat Completions API generates conversational responses using Amazon Bedrock models. You can use the Chat Completions API on both the bedrock-mantle and bedrock-runtime endpoints. We recommend using the bedrock-mantle endpoint whenever possible. For complete API details, see the OpenAI Chat Completions documentation.

Endpoint Base URL Authentication
bedrock-mantle (recommended) https://bedrock-mantle.{region}.api.aws/v1/chat/completions Amazon Bedrock API key or AWS credentials
bedrock-runtime https://bedrock-runtime.{region}.amazonaws.com/v1/chat/completions AWS credentials (SigV4) or Amazon Bedrock API key

Chat Completions with the bedrock-mantle endpoint

The bedrock-mantle endpoint supports Amazon Bedrock API key authentication and the OpenAI SDK.

List available models

To list models available on the bedrock-mantle endpoint, choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)
# List all available models using the OpenAI SDK # Requires OPENAI_API_KEY and OPENAI_BASE_URL environment variables from openai import OpenAI client = OpenAI() models = client.models.list() for model in models.data: print(model.id)
HTTP request
# List all available models # Requires OPENAI_API_KEY and OPENAI_BASE_URL environment variables curl -X GET $OPENAI_BASE_URL/models \ -H "Authorization: Bearer $OPENAI_API_KEY"

Create a chat completion

Choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)

Configure the OpenAI client using environment variables:

# Create a chat completion using the OpenAI SDK # Requires OPENAI_API_KEY and OPENAI_BASE_URL environment variables from openai import OpenAI client = OpenAI() completion = client.chat.completions.create( model="openai.gpt-oss-120b", messages=[ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"} ] ) print(completion.choices[0].message)
HTTP request

Make a POST request to /v1/chat/completions:

# Create a chat completion # Requires OPENAI_API_KEY and OPENAI_BASE_URL environment variables curl -X POST $OPENAI_BASE_URL/chat/completions \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $OPENAI_API_KEY" \ -d '{ "model": "openai.gpt-oss-120b", "messages": [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"} ] }'
Streaming

To receive responses incrementally, choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)
# Stream chat completion responses incrementally using the OpenAI SDK # Requires OPENAI_API_KEY and OPENAI_BASE_URL environment variables from openai import OpenAI client = OpenAI() stream = client.chat.completions.create( model="openai.gpt-oss-120b", messages=[{"role": "user", "content": "Tell me a story"}], stream=True ) for chunk in stream: if chunk.choices[0].delta.content is not None: print(chunk.choices[0].delta.content, end="")
HTTP request

Make a POST request to /v1/chat/completions with stream set to true:

# Stream chat completion responses incrementally # Requires OPENAI_API_KEY and OPENAI_BASE_URL environment variables curl -X POST $OPENAI_BASE_URL/chat/completions \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $OPENAI_API_KEY" \ -d '{ "model": "openai.gpt-oss-120b", "messages": [ {"role": "user", "content": "Tell me a story"} ], "stream": true }'

Chat Completions with the bedrock-runtime endpoint

The bedrock-runtime endpoint supports AWS SigV4 authentication and Amazon Bedrock API key authentication.

List available models

To list models available on the bedrock-runtime endpoint, choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)
from openai import OpenAI import os client = OpenAI( base_url="https://bedrock-runtime.us-east-1.amazonaws.com/v1", api_key=os.environ.get("AWS_BEARER_TOKEN_BEDROCK") ) models = client.models.list() for model in models.data: print(model.id)
HTTP request
curl -X GET "https://bedrock-runtime.us-east-1.amazonaws.com/v1/models" \ -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK"

Create a chat completion

Choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)

Configure the OpenAI client to point to the bedrock-runtime endpoint:

from openai import OpenAI import os client = OpenAI( base_url="https://bedrock-runtime.us-east-1.amazonaws.com/v1", api_key=os.environ.get("AWS_BEARER_TOKEN_BEDROCK") ) response = client.chat.completions.create( model="us.anthropic.claude-sonnet-4-6", messages=[{"role": "user", "content": "Hello"}] ) print(response.choices[0].message.content)
HTTP request (API key)
curl -X POST "https://bedrock-runtime.us-east-1.amazonaws.com/v1/chat/completions" \ -H "Content-Type: application/json" \ -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK" \ -d '{ "model": "us.anthropic.claude-sonnet-4-6", "messages": [{"role": "user", "content": "Hello"}] }'
HTTP request (SigV4)
curl -X POST "https://bedrock-runtime.us-east-1.amazonaws.com/v1/chat/completions" \ -H "Content-Type: application/json" \ --aws-sigv4 "aws:amz:us-east-1:bedrock" \ --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \ -d '{ "model": "us.anthropic.claude-sonnet-4-6", "messages": [{"role": "user", "content": "Hello"}] }'

For more details on supported models, Regions, and advanced features with the bedrock-runtime endpoint, see Chat Completions API (legacy reference).

Include a guardrail in a chat completion

To include safeguards in model input and responses, apply a guardrail when running model invocation by including the following extra parameters as fields in the request body:

  • extra_headers – Maps to an object containing the following fields, which specify extra headers in the request:

    • X-Amzn-Bedrock-GuardrailIdentifier (required) – The ID of the guardrail.

    • X-Amzn-Bedrock-GuardrailVersion (required) – The version of the guardrail.

    • X-Amzn-Bedrock-Trace (optional) – Whether or not to enable the guardrail trace.

  • extra_body – Maps to an object. In that object, you can include the amazon-bedrock-guardrailConfig field, which maps to an object containing the following fields:

For more information about these parameters in Amazon Bedrock Guardrails, see Test your guardrail.

To see examples of using guardrails with OpenAI chat completions, choose the tab for your preferred method, and then follow the steps:

OpenAI SDK (Python)
import openai from openai import OpenAIError # Endpoint for Amazon Bedrock Runtime bedrock_endpoint = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1" # Model ID model_id = "openai.gpt-oss-20b-1:0" # Replace with actual values bedrock_api_key = "$AWS_BEARER_TOKEN_BEDROCK" guardrail_id = "GR12345" guardrail_version = "DRAFT" client = openai.OpenAI( api_key=bedrock_api_key, base_url=bedrock_endpoint, ) try: response = client.chat.completions.create( model=model_id, # Specify guardrail information in the header extra_headers={ "X-Amzn-Bedrock-GuardrailIdentifier": guardrail_id, "X-Amzn-Bedrock-GuardrailVersion": guardrail_version, "X-Amzn-Bedrock-Trace": "ENABLED", }, # Additional guardrail information can be specified in the body extra_body={ "amazon-bedrock-guardrailConfig": { "tagSuffix": "xyz" # Used for input tagging } }, messages=[ { "role": "system", "content": "You are a helpful assistant." }, { "role": "assistant", "content": "Hello! How can I help you today?" }, { "role": "user", "content": "What is the weather like today?" } ] ) request_id = response._request_id print(f"Request ID: {request_id}") print(response) except OpenAIError as e: print(f"An error occurred: {e}") if hasattr(e, 'response') and e.response is not None: request_id = e.response.headers.get("x-request-id") print(f"Request ID: {request_id}")
OpenAI SDK (Java)
import com.openai.client.OpenAIClient; import com.openai.client.okhttp.OpenAIOkHttpClient; import com.openai.core.http.HttpResponseFor; import com.openai.models.chat.completions.ChatCompletion; import com.openai.models.chat.completions.ChatCompletionCreateParams; // Endpoint for Amazon Bedrock Runtime String bedrockEndpoint = "http://bedrock-runtime.us-west-2.amazonaws.com/openai/v1" // Model ID String modelId = "openai.gpt-oss-20b-1:0" // Replace with actual values String bedrockApiKey = "$AWS_BEARER_TOKEN_BEDROCK" String guardrailId = "GR12345" String guardrailVersion = "DRAFT" OpenAIClient client = OpenAIOkHttpClient.builder() .apiKey(bedrockApiKey) .baseUrl(bedrockEndpoint) .build() ChatCompletionCreateParams request = ChatCompletionCreateParams.builder() .addUserMessage("What is the temperature in Seattle?") .model(modelId) // Specify additional headers for the guardrail .putAdditionalHeader("X-Amzn-Bedrock-GuardrailIdentifier", guardrailId) .putAdditionalHeader("X-Amzn-Bedrock-GuardrailVersion", guardrailVersion) // Specify additional body parameters for the guardrail .putAdditionalBodyProperty( "amazon-bedrock-guardrailConfig", JsonValue.from(Map.of("tagSuffix", JsonValue.of("xyz"))) // Allows input tagging ) .build(); HttpResponseFor<ChatCompletion> rawChatCompletionResponse = client.chat().completions().withRawResponse().create(request); final ChatCompletion chatCompletion = rawChatCompletionResponse.parse(); System.out.println(chatCompletion);