Monitor bedrock-runtime inference using CloudWatch metrics
The Amazon Bedrock bedrock-runtime.region.amazonaws.com endpoint publishes metrics to
Amazon CloudWatch under the AWS/Bedrock namespace. Use these
metrics to monitor invocation volume, latency, token consumption, error rates, and model
invocation logging delivery.
If your application calls inference through bedrock-mantle.region.api.aws,
see Monitor bedrock-mantle inference using CloudWatch metrics instead.
Amazon Bedrock runtime metrics
The following table describes runtime metrics provided by Amazon Bedrock.
| Metric name | Unit | Description |
|---|---|---|
| Invocations | SampleCount | Number of successful requests to the Converse, ConverseStream, InvokeModel, and InvokeModelWithResponseStream API operations. |
| InvocationLatency | MilliSeconds | The time from when a request is sent to when the last token is received. To distinguish latency increases caused by service-side throughput changes from increases caused by longer model responses, see Diagnose InvocationLatency increases using output tokens per second (OTPS). |
| InvocationClientErrors | SampleCount | Number of invocations that result in client-side errors. |
| InvocationServerErrors | SampleCount | Number of invocations that result in AWS server-side errors. |
| InvocationThrottles | SampleCount | Number of invocations that the system throttled. Throttled requests and other invocation errors don't count as either Invocations or Errors. The number of throttles you see will depend on your retry settings in the SDK. For more information, see Retry behavior in the AWS SDKs and Tools Reference Guide. |
| InputTokenCount | SampleCount | Number of tokens in the input. |
| LegacyModelInvocations | SampleCount | Number of invocations using Legacy models. |
| OutputTokenCount | SampleCount | Number of tokens in the output. |
| OutputImageCount | SampleCount | Number of images in the output (only applicable for image generation models). |
| TimeToFirstToken | MilliSeconds | Time from when a request is sent to when the first token is received, for the ConverseStream and InvokeModelWithResponseStream streaming API operations. |
| EstimatedTPMQuotaUsage | SampleCount | Estimated Tokens Per Minute (TPM) quota consumption across the Converse, ConverseStream, InvokeModel, and InvokeModelWithResponseStream API operations. This metric is an approximation and does not reflect the reservation-based token consumption that drives throttling decisions. Throttling is based on the upfront reservation of input tokens plus |
| CacheReadInputTokens | SampleCount | Number of input tokens read from the prompt cache. These tokens are charged at a reduced rate and don't count toward your TPM quota. |
| CacheWriteInputTokens | SampleCount | Number of input tokens written to the prompt cache. These tokens count toward your TPM quota. |
There are also metrics for Amazon Bedrock Guardrails and Amazon Bedrock Agents.
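Because Invocations counts only successful requests and throttled or failed requests are reported separately, a throttle or error rate has to be derived from several metrics. The following Python sketch is one way to do that. It assumes you have already summed the four metrics over the same time window, and that the four counters are disjoint; the class and function names are illustrative, not part of any AWS SDK.

```python
from dataclasses import dataclass


@dataclass
class InvocationCounts:
    """Sums of four AWS/Bedrock metrics over the same time window."""
    invocations: float     # Invocations (successful requests)
    client_errors: float   # InvocationClientErrors
    server_errors: float   # InvocationServerErrors
    throttles: float       # InvocationThrottles


def outcome_shares(counts: InvocationCounts) -> dict[str, float]:
    """Return each outcome as a fraction of all attempted requests.

    Assumption: the four counters do not overlap, so their sum
    approximates the total number of attempted requests.
    """
    parts = {
        "success": counts.invocations,
        "client_error": counts.client_errors,
        "server_error": counts.server_errors,
        "throttled": counts.throttles,
    }
    total = sum(parts.values())
    if total == 0:
        # No traffic in the window: report zero shares instead of dividing by zero.
        return {name: 0.0 for name in parts}
    return {name: value / total for name, value in parts.items()}


if __name__ == "__main__":
    window = InvocationCounts(invocations=900, client_errors=50,
                              server_errors=10, throttles=40)
    for name, share in outcome_shares(window).items():
        print(f"{name}: {share:.1%}")
```

Keep in mind that SDK retries can inflate the throttle count, so the throttled share describes attempts rather than distinct application requests.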
Model invocation logging CloudWatch metrics
For each delivery attempt, whether it succeeds or fails, the following Amazon CloudWatch metrics are emitted
under the AWS/Bedrock namespace and the Across all model IDs dimension:
- ModelInvocationLogsCloudWatchDeliverySuccess
- ModelInvocationLogsCloudWatchDeliveryFailure
- ModelInvocationLogsS3DeliverySuccess
- ModelInvocationLogsS3DeliveryFailure
- ModelInvocationLargeDataS3DeliverySuccess
- ModelInvocationLargeDataS3DeliveryFailure
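A common use of the delivery failure metrics is an alarm that fires on any failed delivery. The sketch below only builds the parameters for the CloudWatch PutMetricAlarm operation. It assumes that the Across all model IDs dimension corresponds to a metric with no dimensions, so no Dimensions key is set; verify this against your account before relying on it. The helper name, alarm naming scheme, and threshold are illustrative choices.

```python
def delivery_failure_alarm(metric_name: str, period_seconds: int = 300) -> dict:
    """Build PutMetricAlarm parameters for one delivery failure metric.

    Assumption: the metric is published without dimensions, so the
    "Dimensions" key is omitted.
    """
    if not metric_name.endswith("DeliveryFailure"):
        raise ValueError(f"not a delivery failure metric: {metric_name}")
    return {
        "AlarmName": f"bedrock-{metric_name}",  # hypothetical naming scheme
        "Namespace": "AWS/Bedrock",
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": 0.0,
        # Alarm as soon as at least one failure is recorded in a period.
        "ComparisonOperator": "GreaterThanThreshold",
        # Periods with no delivery attempts should not trigger the alarm.
        "TreatMissingData": "notBreaching",
    }


if __name__ == "__main__":
    print(delivery_failure_alarm("ModelInvocationLogsS3DeliveryFailure"))
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.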
To retrieve metrics for your Amazon Bedrock operations, you specify the following information:
- The metric dimension. A dimension is a set of name-value pairs that you use to identify a metric. Amazon Bedrock supports the following dimensions:
  - ModelId – all metrics
  - ModelId + ImageSize + BucketedStepSize – OutputImageCount
- The metric name, such as InvocationClientErrors.
You can get metrics for Amazon Bedrock with the AWS Management Console, the AWS CLI, or the CloudWatch API. You can use the CloudWatch API through one of the AWS Software Development Kits (SDKs) or the CloudWatch API tools.
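As an illustration of the SDK route, the following Python sketch assembles a CloudWatch GetMetricStatistics request from a metric name and a set of dimensions, then sends it with boto3. The helper names and the model ID `example-model-id` are placeholders, and the one-hour window, 300-second period, and Sum statistic are arbitrary defaults. boto3 is imported only inside the function that calls AWS, so the request builder can be used and tested without credentials.

```python
from datetime import datetime, timedelta, timezone


def build_metric_request(metric_name, dimensions, end_time,
                         hours=1, period_seconds=300, statistic="Sum"):
    """Build keyword arguments for CloudWatch GetMetricStatistics.

    `dimensions` maps dimension names (for example "ModelId") to values.
    """
    return {
        "Namespace": "AWS/Bedrock",
        "MetricName": metric_name,
        "Dimensions": [{"Name": name, "Value": value}
                       for name, value in dimensions.items()],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": period_seconds,
        "Statistics": [statistic],
    }


def fetch_datapoints(request):
    """Send the request with boto3 and return datapoints in time order."""
    import boto3  # imported lazily; requires boto3 and AWS credentials

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


if __name__ == "__main__":
    request = build_metric_request(
        "InvocationClientErrors",
        {"ModelId": "example-model-id"},  # placeholder model ID
        end_time=datetime.now(timezone.utc),
    )
    for point in fetch_datapoints(request):
        print(point["Timestamp"], point["Sum"])
```

For OutputImageCount, the same builder would take all three dimensions, for example `{"ModelId": ..., "ImageSize": ..., "BucketedStepSize": ...}`, with values that match what your account actually publishes.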
To view Amazon Bedrock metrics in the CloudWatch console, choose Metrics in the navigation pane, choose All metrics, and then search for the model ID.
You must have the appropriate CloudWatch permissions to monitor Amazon Bedrock with CloudWatch. For more information, see Authentication and Access Control for Amazon CloudWatch in the Amazon CloudWatch User Guide.