

# Monitor the operational health of your applications with Application Signals
<a name="Services"></a>

Use Application Signals within the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/) to monitor and troubleshoot the operational health of your applications:
+ **Monitor your application services** — As part of daily operational monitoring, use the [Services](Services-page.md) page to see a summary of all your services. See services with the highest fault rate or latency, and see which services have unhealthy [service level indicators (SLIs)](CloudWatch-ServiceLevelObjectives.md). Select a service to open the [Service detail](ServiceDetail.md) page and see detailed metrics, service operations, Synthetics canaries, and client requests. This can help you troubleshoot and identify the root cause of operational issues. 
+ **Inspect your application topology** — Use the [Application Map](ServiceMap.md) to understand and monitor your application topology over time, including the relationships between clients, Synthetics canaries, services, and dependencies. Instantly see service level indicator (SLI) health and view key metrics such as call volume, fault rate, and latency. Drill down to see more detailed information in the [Service detail](ServiceDetail.md) page.

Explore an [example scenario](Services-example-scenario.md) that demonstrates how these pages can be used to quickly troubleshoot an operational service health issue, from initial detection to identifying root cause.

**How Application Signals enables operational health monitoring**

After you [enable your application](CloudWatch-Application-Signals-Enable.md) for Application Signals, your application services, APIs, and their dependencies are automatically discovered and displayed in the **Services**, **Service detail**, and **Application Map** pages. Application Signals collects information from multiple sources to enable service discovery and operational health monitoring: 
+ [AWS Distro for OpenTelemetry (ADOT)](CloudWatch-Application-Signals-supportmatrix.md) — As part of enabling Application Signals, OpenTelemetry Java and Python auto-instrumentation libraries are configured to emit metrics and traces that are collected by the CloudWatch agent. The metrics and traces are used to enable discovery of services, operations, dependencies, and other service information.
+ [Service-level objectives (SLOs)](CloudWatch-ServiceLevelObjectives.md) — After you create service level objectives for your services, the Services, Service detail, and Application Map pages display service level indicator (SLI) health. SLIs can monitor latency, availability, and other operational metrics.
+ [CloudWatch Synthetics canaries](CloudWatch_Synthetics_Canaries.md) — When you configure X-Ray tracing on your canaries, calls to your services from your canary scripts are associated with your service and displayed within the Service detail page.
+ [CloudWatch Real user monitoring (RUM)](CloudWatch-RUM.md) — When X-Ray tracing is enabled on your CloudWatch RUM web client, requests to your services are automatically associated and displayed within the service detail page.
+ [AWS Service Catalog AppRegistry](https://docs.aws.amazon.com/servicecatalog/latest/arguide/intro-app-registry.html) — Application Signals auto-discovers AWS resources within your account and allows you to group them into logical applications created in AppRegistry. The application name displayed in the Services page is based on the underlying compute resource that your services are running on.

**Note**  
Application Signals displays your services and operations based on metrics and traces emitted within the current time filter that you chose. (By default, this is the past three hours.) If there is no activity within the current time filter for a service, operation, dependency, Synthetics canary, or client page, it won't be displayed.   
Up to 1,000 services can be displayed. Discovery of your services and service topology might be delayed up to 10 minutes. Evaluation of your service level indicator (SLI) health might be delayed up to 15 minutes. 

**Note**  
The Application Signals console currently supports selecting a time range of at most one day within the past 30 days.
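
The time range limit in the preceding note can be pictured as a simple check. The following Python sketch is illustrative only: the function name `is_selectable_range` is hypothetical, and it assumes the note means a range no longer than one day that starts no earlier than 30 days ago. It is not the console's implementation.

```python
from datetime import datetime, timedelta, timezone

MAX_WINDOW = timedelta(days=1)     # assumed longest selectable range
MAX_LOOKBACK = timedelta(days=30)  # assumed earliest allowed start, relative to now


def is_selectable_range(start, end, now):
    """Return True if [start, end] is at most one day long and within the past 30 days."""
    if end <= start or end > now:
        return False
    if end - start > MAX_WINDOW:
        return False
    return now - start <= MAX_LOOKBACK


now = datetime(2024, 6, 30, 12, 0, tzinfo=timezone.utc)
print(is_selectable_range(now - timedelta(hours=3), now, now))  # default three-hour filter
print(is_selectable_range(now - timedelta(days=2), now, now))   # longer than one day
```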

# View overall service activity and operational health with the Services page
<a name="Services-page"></a>

Use the Services page to see a list of your services that are [enabled for Application Signals](CloudWatch-Application-Signals-Enable.md). You can also view operational metrics and quickly see which services have unhealthy service level indicators (SLIs). Drill down to look for performance anomalies as you identify the root cause of operational issues. To view this page, open the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/) and choose **Services** under the **Application Signals** section in the left navigation pane.

For un-instrumented services, the Service overview page displays limited information with prominent calls-to-action to enable Application Signals instrumentation.
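
If you want a similar list outside the console, the following Python sketch shows one possible approach. It assumes the AWS SDK for Python (Boto3) `application-signals` client and its `list_services` operation with `StartTime`, `EndTime`, and `NextToken` parameters; verify the parameter and response field names against the current API reference before relying on them. The helper name `list_service_names` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def list_service_names(client, hours=3):
    """Collect service names seen in the last `hours` hours, following NextToken pagination."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)  # three hours mirrors the console default
    names, token = [], None
    while True:
        kwargs = {"StartTime": start, "EndTime": end}
        if token:
            kwargs["NextToken"] = token
        page = client.list_services(**kwargs)
        for summary in page.get("ServiceSummaries", []):
            # KeyAttributes is assumed to carry the service Name.
            names.append(summary.get("KeyAttributes", {}).get("Name", "<unnamed>"))
        token = page.get("NextToken")
        if not token:
            return names
```

To try it against your account, pass a real client, for example `list_service_names(boto3.client("application-signals"))` after `import boto3`. The client is passed in as a parameter so that the function can also be exercised with a stub.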

## Explore operational health metrics for your services
<a name="services-top-graphs"></a>

The top of the Services page includes an overall service operational health graph and tables that display the top services and service dependencies by fault rate, followed by the list of services. The Services graph on the left displays a breakdown of the number of services that have healthy or unhealthy service level indicators (SLIs) during the current page-level time filter. SLIs can monitor latency, availability, and other operational metrics. View the top services by fault rate in the two tables next to the graph. Select a service name in either table to open its [service detail page](ServiceDetail.md), which displays detailed service operation information. Select a dependency path to view service dependency details on its detail page.

Both tables display information for up to the past three hours, even if a longer time period filter is chosen at the top right of the page.

When you use dynamic service grouping, the operational health metrics automatically aggregate data across all services within each group. This provides:
+ Consolidated fault rates for service groups
+ Group-level SLI health status
+ Aggregated performance metrics that help identify problematic service clusters
+ Quick identification of which groups require immediate attention during incidents

![\[CloudWatch Services top graphs\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/services-top-graphs.png)


## Monitor operational health with the Services table
<a name="services-table"></a>

The Services table displays a list of your services that have been enabled for Application Signals. Choose **Enable Application Signals** to open a setup page and start configuring your services. For more information, see [Enable Application Signals](CloudWatch-Application-Signals-Enable.md). 

Filter the Services table to make it easier to find what you're looking for, by choosing one or more properties from the filter text box. As you choose each property, you are guided through filter criteria. You will see the complete filter below the filter text box. Choose **Clear filters** at any time to remove the table filter. 

The advanced filtering options allow you to:
+ Filter by service groups (both default and custom groupings)
+ Filter by recent deployment activity
+ Filter by Platform
+ Filter by SLI Health
+ Filter by Account ID (in cross-account observability setups)
+ Filter by instrumentation status (instrumented vs un-instrumented)
+ Filter by environment
+ Filter by service health status

![\[CloudWatch Services table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/services-table-healthy-updated.png)


Un-instrumented services appear in the Services table even when they haven't been configured with Application Signals, helping you identify gaps in your observability coverage and prioritize which services to instrument next based on their position in your architecture.

Choose the name of any service in the table to view a [service detail page](ServiceDetail.md) containing service-level metrics, operations, and additional details. If you have associated the service's underlying compute resource with an application in AppRegistry or the Applications card on the AWS Management Console home page, choose the application name to display the application details in the [myApplications](https://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/aws-myApplications.html) console page. For services hosted in Amazon EKS, choose any link within the **Hosted in** column to view Cluster, Namespace, or Workload within CloudWatch Container Insights. For services running on Amazon ECS or Amazon EC2, the Environment value is shown. 

[Service level indicator (SLI)](CloudWatch-ServiceLevelObjectives.md#CloudWatch-ServiceLevelObjectives-concepts) status is displayed for each service in the table. Choose the SLI status for a service to display a pop-up containing a link to any unhealthy SLIs, and a link to see all SLOs for the service. 

![\[Service with unhealthy SLI\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/services-unhealthy-sli.png)


If no SLOs have been created for a service, choose the **Create SLO** button within the **SLI Status** column. To create additional SLOs for any service, select the option button next to the service name, and then choose **Create SLO** at the top-right of the table. When you create SLOs, you can see at a glance which of your services and operations are performing well and which are unhealthy. See [service level objectives (SLOs)](CloudWatch-ServiceLevelObjectives.md) for more information. 

## Service overview
<a name="services-overview"></a>

After you select a service from the Services table, the Service overview page opens. This page provides a comprehensive view of your service's operational health and performance metrics. The overview displays these summary metrics:
+ Total operations
+ Service dependencies
+ Canary monitoring status
+ RUM client data

These metrics give you immediate insight into your service's current state.

You can visualize key operational performance indicators over time using a series of charts. To analyze trends and identify potential issues affecting your service health, adjust the time filter. All charts automatically update to reflect data for the selected time period.

The Audit findings section automatically detects and shows critical problems in your service's behavior, so you don't need to investigate manually. Application Signals analyzes your applications to report significant observations and potential problems, simplifying root cause analysis. These automated findings consolidate relevant traces, eliminating the need to navigate through multiple clicks. The audit system helps teams quickly identify issues and their underlying causes, enabling faster problem resolution.

You can use the Change events section to identify how recent deployments or configuration changes affect your service behavior. Application Signals automatically processes CloudTrail events to track change events across your application. Monitor configuration and deployment events for services and their dependencies to get immediate context for operational analysis and troubleshooting. Application Signals automatically correlates deployment times with performance changes, helping you quickly identify whether recent deployments contributed to service issues.

![\[Service overview\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/Service_detail.png)


# View detailed service activity and operational health with the service detail page
<a name="ServiceDetail"></a>

When you instrument your application, [Amazon CloudWatch Application Signals](CloudWatch-Application-Monitoring-Sections.md) maps all of the services that your application discovers. Use the service detail page to see an overview of your services, operations, dependencies, canaries, and client requests for a single service. To view the service detail page, do the following:
+ Open the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/).
+ Choose **Services** under the **Application Signals** section in the left navigation pane.
+ Choose the name of any service from the **Services**, **Top services**, or dependency tables.

Under **schedule-visits**, you will see the account label and ID under the service name.

The service detail page is organized into the following tabs:
+  [Overview](#ServiceDetail-overview) — Use this tab to see an overview of a single service, including the number of operations, dependencies, synthetics, and client pages. The tab shows key metrics for your entire service, top operations and dependencies. These metrics include time series data on latency, faults, and errors across all service operations for that service.
+  [Service operations](#ServiceDetail-operations) — Use this tab to see a list of the operations that your service calls and interactive graphs with key metrics that measure the health of each operation. You can select a data point in a graph to obtain information about traces, logs, or metrics associated with that data point.
+  [Dependencies](#ServiceDetail-dependencies) — Use this tab to see a list of dependencies that your service calls, and a list of metrics for those dependencies.
+  [Synthetics canaries](#ServiceDetail-canaries) — Use this tab to see a list of synthetics canaries that simulate user calls to your service, and key performance metrics for those canaries. 
+  [Client pages](#ServiceDetail-clientpages) — Use this tab to see a list of client pages that call your service, and metrics that measure the quality of client interactions with your application. 
+  [Related metrics](#ServiceDetail-relatedmetrics) — Use this tab to correlate related metrics, such as standard metrics, runtime metrics, and custom metrics for a service, its operations, or its dependencies.

## View your service overview
<a name="ServiceDetail-overview"></a>

Use the service overview page to view a high-level summary of metrics for all service operations in a single location. Check the performance of all the operations, dependencies, client pages and synthetics canaries that interact with your application. Use this information to help you determine where to focus efforts to identify issues, troubleshoot errors, and find opportunities for optimization.

Choose any link in **Service Details** to view information that is related to a specific service. For example, for services hosted in Amazon EKS, the service details page shows **Cluster**, **Namespace**, and **Workload** information. For services hosted in Amazon ECS or Amazon EC2, the service details page shows the **Environment** value.

Under **Services**, the **Overview** tab displays a summary of the following:
+ Operations – Use this tab to see the health of your service operations. The health status is determined by service level indicators (SLI) that are defined as a part of a [service level objective](CloudWatch-ServiceLevelObjectives.md) (SLO).
+ Dependencies – Use this tab to see the top dependencies of the services called by your application, listed by fault rate and to see the health of your service dependencies. The health status is determined by service level indicators (SLI) that are defined as a part of a service level objective (SLO).
+ Synthetics canaries – Use this tab to see the result of simulated calls to endpoints or APIs associated with your service, and the number of failed canaries.
+ Client pages – Use this tab to see top pages called by clients that have asynchronous JavaScript and XML (AJAX) errors.

The following illustration shows an overview of your services:

![\[Service overview widgets\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-detail-widgets.png)


The **Overview** tab also displays a graph of dependencies with the highest latency across all services. Use the **p99**, **p90** and **p50** latency metrics to quickly assess which dependencies are contributing to your total service latency, as follows:

![\[Service operations latency graph\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-detail-latency.png)


For example, the previous graph shows that 99% of the requests made to the **customer-service** dependency were completed in approximately 4,950 milliseconds. The other dependencies took less time.
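
To make the percentile labels concrete, the following Python sketch computes p50, p90, and p99 from a small set of latency samples by using the nearest-rank method. The sample values and the `percentile` helper are hypothetical, and CloudWatch might calculate percentiles differently. The sketch only illustrates what "99% of requests completed within this time" means.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies, in milliseconds, for one dependency.
latencies_ms = [120, 135, 150, 180, 210, 260, 400, 900, 2500, 4950]

for p in (50, 90, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

A large gap between p50 and p99, as in this sample, suggests that most requests are fast but a small share are very slow.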

Graphs displaying the top four service operations by latency show the volume of requests, availability, fault rate, and error rate for those services, as shown in the following image:

![\[Service operations volume, availability, fault rate, and error rate graphs\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-detail-operations-graphs.png)


The **Service details** section displays the details of the service including the **Account ID** and **Account label**.

## View your service operations
<a name="ServiceDetail-operations"></a>

When you instrument your application, [Application Signals](CloudWatch-Application-Monitoring-Sections.md) discovers all of the service operations that your application calls. Use the **Service operations** tab to see a table that contains the service operations and a set of metrics that measure the performance of a selected operation. These metrics include SLI status, number of dependencies, latency, volume, faults, errors, and availability, as shown in the following image:

![\[Service operations table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-operations-table.png)


Filter the table to make it easier to find a service operation by choosing one or more properties from the filter text box. As you choose each property, you are guided through filter criteria and will see the complete filter below the filter text box. Choose **Clear filters** at any time to remove the table filter. 

Choose the SLI status for an operation to display a popup containing a link to any unhealthy SLI, and a link to see all SLOs for the operation, as shown in the following image:

![\[Service operation SLI status\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-operation-unhealthy-slo.png)


The service operations table lists the SLI status, the number of healthy or unhealthy SLIs, and the total number of SLOs for each operation.

Use SLIs to monitor latency, availability, and other operational metrics that measure the operational health of a service. Use an SLO to check the performance and health status of your services and operations.

To create an SLO, do the following:
+ If an operation does not have an SLO, choose the **Create SLO** button within the **SLI Status** column.
+ If an operation already has an SLO, do the following:
  + Select the radio button next to the operation name.
  + Choose **Create SLO** from the **Actions** down arrow at the top right of the table.

For more information, see [service level objectives (SLOs)](CloudWatch-ServiceLevelObjectives.md).

The **Dependencies** column shows the number of dependencies this operation calls. Choose this number to open the **Dependencies** tab filtered to the selected operation.

### View service operations metrics, correlated traces, and application logs
<a name="ServiceDetail-traces"></a>

Application Signals correlates service operation metrics with AWS X-Ray traces, CloudWatch [Container Insights](ContainerInsights.md), and application logs. Use these metrics to troubleshoot operational health issues. To view metrics as graphical information, do the following:

1. Select a service operation in the **Service operations** table to see a set of graphs for the selected operation above the table with metrics for **Volume and Availability**, **Latency**, and **Faults and Errors**.

1. Hover over a point in a graph to view more information.

1. Select a point to open a diagnostic pane that shows correlated traces, metrics, and application logs for the selected point in the graph.

The following image shows the tooltip that appears after hovering over a point in the graph, and the diagnostic pane which appears after clicking on a point. The tooltip contains information about the associated data point in the **Faults and Errors** graph. The pane contains **Correlated traces**, **Top contributors**, and **Application logs** associated with the selected point.

![\[Correlated traces for faults and errors\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-detail-correlated-traces.png)


#### Correlated traces
<a name="ServiceDetail-traces-correlated"></a>

Look at related traces to understand an underlying issue with a trace. You can check to see if correlated traces or any service nodes associated with them behave similarly. To examine correlated traces, choose a **Trace ID** from the **Correlated traces** table to open the [X-Ray trace details](https://docs.aws.amazon.com/xray/latest/devguide/xray-console-traces.html) page for the chosen trace. The trace details page contains a map of service nodes that are associated with the selected trace and a timeline of trace segments.

#### Top contributors
<a name="ServiceDetail-traces-top-contributors"></a>

View the top contributors to find main input sources to a metric. Group contributors by different components to look for similarities within the group and understand how trace behavior differs between them.

The **Top contributors** tab gives metrics for **Call volume**, **Availability**, **Avg latency**, **Errors**, and **Faults** for each group. The following example image shows top contributors to a suite of metrics for an application deployed on an Amazon EKS platform:

![\[Service operation top contributors\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-operations-top-contributors.png)


The **Top contributors** tab contains the following metrics:
+ **Call volume** - Use the call volume to understand the number of requests per time interval for a group.
+ **Availability** - Use availability to see the percentage of time during which no faults were detected for a group.
+ **Avg latency** - Use latency to check the average time that requests ran for a group over a time interval that depends on how long ago the requests that you are investigating were made. Requests that were made less than 15 days prior are evaluated over 1 minute intervals. Requests that were made between 15 and 30 days prior, inclusive, are evaluated over 5 minute intervals. For example, if you are investigating requests that caused a fault 15 days ago, the call volume metric is equal to the number of requests per 5 minute interval.
+ **Errors** - The number of errors per group measured over a time interval.
+ **Faults** - The number of faults per group over a time interval.
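
The interval rule described for **Avg latency** can be summarized in a few lines of Python. This is a minimal sketch of the rule as stated in the preceding list. The function name `evaluation_interval` is hypothetical, and the treatment of requests older than 30 days is an assumption.

```python
from datetime import timedelta


def evaluation_interval(request_age):
    """Return the evaluation interval for requests made `request_age` ago."""
    if request_age < timedelta(days=15):
        return timedelta(minutes=1)
    if request_age <= timedelta(days=30):  # 15 to 30 days, inclusive
        return timedelta(minutes=5)
    # Assumption: data older than 30 days is outside the supported range.
    raise ValueError("requests older than 30 days are out of range")


print(evaluation_interval(timedelta(days=2)))   # 1-minute intervals
print(evaluation_interval(timedelta(days=15)))  # 5-minute intervals
```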

**Top contributors using Amazon EKS or Kubernetes**

Use information about the top contributors for applications deployed on Amazon EKS or Kubernetes to see operational health metrics grouped by **Node**, **Pod**, and **PodTemplateHash**. The following definitions apply:
+ A **pod** is a group of one or more Docker containers that share storage and resources. A pod is the smallest unit that can be deployed on a Kubernetes platform. Group by pods to check if errors are related to pod-specific limitations.
+ A **node** is a server that runs pods. Group by nodes to check if errors are related to node-specific limitations.
+ A **pod template hash** is used to find a particular version of a deployment. Group by pod template hash to check if errors are related to a particular deployment.
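
Conceptually, grouping by pod or node is a group-by aggregation over individual requests. The following Python sketch illustrates the idea with hypothetical request records and a hypothetical `top_contributors` helper. It is not how Application Signals computes these metrics. Availability is omitted because its exact formula is not specified here.

```python
from collections import defaultdict

# Hypothetical request records; the field names are illustrative only.
sample_requests = [
    {"pod": "visits-7d9f-abc", "node": "node-1", "latency_ms": 120, "error": False, "fault": False},
    {"pod": "visits-7d9f-abc", "node": "node-1", "latency_ms": 480, "error": False, "fault": True},
    {"pod": "visits-7d9f-xyz", "node": "node-2", "latency_ms": 100, "error": True, "fault": False},
    {"pod": "visits-7d9f-abc", "node": "node-1", "latency_ms": 300, "error": False, "fault": True},
]


def top_contributors(requests, group_key):
    """Aggregate call volume, errors, faults, and average latency per group, sorted by faults."""
    totals = defaultdict(lambda: {"call_volume": 0, "errors": 0, "faults": 0, "latency_ms": 0.0})
    for request in requests:
        group = totals[request[group_key]]
        group["call_volume"] += 1
        group["errors"] += int(request["error"])
        group["faults"] += int(request["fault"])
        group["latency_ms"] += request["latency_ms"]
    rows = [
        {
            "group": name,
            "call_volume": g["call_volume"],
            "errors": g["errors"],
            "faults": g["faults"],
            "avg_latency_ms": g["latency_ms"] / g["call_volume"],
        }
        for name, g in totals.items()
    ]
    return sorted(rows, key=lambda row: row["faults"], reverse=True)


for row in top_contributors(sample_requests, "pod"):
    print(row)
```

Switching `group_key` from `"pod"` to `"node"` regroups the same requests, which mirrors how changing the grouping in the console can show whether faults cluster on one pod or on one node.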

**Top contributors using Amazon EC2**

Use information about the top contributors for applications deployed on Amazon EC2 to see operational health metrics grouped by instance ID and auto scaling group. The following definitions apply:
+ An **Instance ID** is a unique identifier for the Amazon EC2 instance that your service runs on. Group by instance ID to check if errors are related to a specific Amazon EC2 instance.
+ An [auto scaling group](https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-groups.html) is a collection of Amazon EC2 instances that allows you to scale up or down the resources you need to serve your application requests. Group by auto scaling group if you want to check if errors are limited in scope to the instances inside the group.

**Top contributors using a custom platform**

Use information about the top contributors for applications deployed using custom instrumentation to see operational health metrics grouped by **Host name**. The following definitions apply:
+ A host name identifies a device such as an endpoint or Amazon EC2 instance that is connected to a network. Group by host name to check if your errors are related to a specific physical or virtual device.

**View top contributors in Log Insights and Container Insights**

View and modify the automatic query that generated metrics for your top contributors in [Log Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html). View infrastructure performance metrics by specific groups such as pods or nodes in [Container Insights](ContainerInsights.md). You can sort clusters, nodes, or workloads by resource consumption to quickly identify anomalies and mitigate risks proactively before the end user experience is impacted. The following image shows how to select these options:

![\[Top contributors table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-operations-top-contributors-insights.png)


In **Container Insights**, you can view metrics for your Amazon EKS or Amazon ECS container that are specific to the grouping of your top contributors. For example, if you grouped by pod for an EKS container to generate top contributors, Container Insights shows metrics and statistics filtered for your pod.

In **Log Insights**, you can modify the query that generated the metrics under **Top contributors** using the following steps:

1. Select **View in Log Insights**. The **Logs Insights** page that opens contains a query that is automatically generated and contains the following information:
   + The log cluster group name.
   + The operation that you were investigating with CloudWatch.
   + The aggregate of the operational health metric interacted with on the graph.

   The log results are automatically filtered to show data from the last five minutes before you selected the data point on the service graph.

1. To edit the query, replace the generated text with your changes. You can also use the **Query generator** to help you generate a new query, or update the existing query.
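
If you prefer to run a comparable query programmatically, the following Python sketch builds arguments for the CloudWatch Logs `StartQuery` operation with the same five-minute window that ends at the selected data point. The log group name, the query string, and the helper name `logs_insights_query_args` are placeholders, not the query that the console generates.

```python
from datetime import datetime, timedelta, timezone


def logs_insights_query_args(log_group, selected_point, query_string):
    """Build StartQuery arguments covering the five minutes before `selected_point`."""
    start = selected_point - timedelta(minutes=5)
    return {
        "logGroupName": log_group,
        "startTime": int(start.timestamp()),        # epoch seconds
        "endTime": int(selected_point.timestamp()),
        "queryString": query_string,
    }


point = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
args = logs_insights_query_args(
    "/hypothetical/application/log-group",  # placeholder log group name
    point,
    "fields @timestamp, @message | sort @timestamp desc | limit 20",
)
print(args)
```

With Boto3, you could pass these arguments to `boto3.client("logs").start_query(**args)` and then poll `get_query_results` with the returned `queryId`.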

#### Application logs
<a name="ServiceDetail-traces-application-logs"></a>

Use the query in the **Application logs** tab to generate logged information for your current log group and service, and to insert a timestamp. A log group is a group of log streams that you can define when you configure your application.

Use a log group to organize logs with similar characteristics including the following:
+ Capture logs from a specific organization, source or function.
+ Capture logs that are accessed by a particular user.
+ Capture logs for a specific time period.

Use these log streams to track specific groups or time frames. You can also set up monitoring rules, alarms and notifications for these log groups. For more information about log groups, see [Working with log groups and log streams](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html).

The application logs query returns the logs, recurring text patterns and graphical visualizations for your log groups.

To run the query, select **Run query in Logs Insights** to either run the automatically generated query or modify the query. To edit the query, replace the automatically generated text with your changes. You can also use the **Query generator** to help you generate a new query or update the existing query.

The following image shows the sample query that is automatically generated based on the selected point in the service operations graph:

![\[Application logs table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-operations-application-logs.png)


In the preceding image, CloudWatch has automatically detected the log group that is associated with your selected point, and included it in a generated query.

## View your service dependencies
<a name="ServiceDetail-dependencies"></a>

Choose the **Dependencies** tab to display the **Dependencies** table and a set of metrics for the dependencies of all service operations or a single operation. The table contains a list of dependencies discovered by Application Signals, including metrics for SLI status, latency, call volume, fault rate, error rate, and availability.

At the top of the page, choose an operation from the down arrow list to view its dependencies, or choose **All** to see dependencies for all operations. 

Filter the table to make it easier to find what you're looking for, by choosing one or more properties from the filter text box. As you choose each property, you are guided through filter criteria and will see the complete filter below the filter text box. Choose **Clear filters** at any time to remove the table filter. Select **Group by Dependency** at the top right of the table to group dependencies by service and operation name. When grouping is turned on, expand or collapse a group of dependencies with the **+** icon next to the dependency name. 

![\[Dependencies table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-dependencies-table.png)


The **Dependency** column displays the dependency service name, while the **Remote Operation** column displays the service operation name. The **SLI status** column displays the number of healthy or unhealthy SLIs along with the total number of SLIs for each dependency. When calling AWS services, the **Target** column displays the AWS resource, such as a DynamoDB table or an Amazon SNS topic.

To select a dependency, select the option next to a dependency in the **Dependencies** table. This shows a set of graphs that display detailed metrics for call volume, availability, faults, and errors. Hover over a point in a graph to see a popup containing more information. Select a point in a graph to open a diagnostic pane that shows correlated traces for the selected point in the graph. Choose a trace ID from the **Correlated traces** table to open the [X-Ray Trace details](https://docs.aws.amazon.com/xray/latest/devguide/xray-console-traces.html) page for the selected trace.

![\[Dependency graphs and correlated traces\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-dependency-graph-traces.jpg)


## View your Synthetics canaries
<a name="ServiceDetail-canaries"></a>

Choose the **Synthetics Canaries** tab to display the **Synthetics Canaries** table, and a set of metrics for each canary in the table. The table includes metrics for success percentage, average duration, runs, and failure rate. Only canaries that are [enabled for AWS X-Ray tracing](CloudWatch_Synthetics_Canaries_tracing.md) are displayed.

Use the filter text box in the synthetics canaries table to find the canary that you are interested in. Each filter that you create appears below the filter text box. Choose **Clear filters** at any time to remove the table filter. 

![\[Synthetics canaries table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-canaries-table.png)


Select the radio button next to the name of the canary to see a set of tabs containing graphs of detailed metrics, including success percentage, errors, and duration. Hover over a point in a graph to see a popup containing more information. Select a point in a graph to open a diagnostic pane that shows canary runs that correlate to the selected point. Select a canary run and choose the **Run time** to see artifacts for your selected canary run including logs, HTTP Archive (HAR) files, screenshots, and suggested steps to help you troubleshoot problems. Choose **Learn more** next to **Canary runs** to open the [CloudWatch Synthetics Canaries](CloudWatch_Synthetics_Canaries.md) page.

![\[Synthetics canary graphs and runs\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-canary-graphs-runs.jpg)


## View your client pages
<a name="ServiceDetail-clientpages"></a>

Choose the **Client pages** tab to display a list of client web pages that call your service. Use the set of metrics for the selected client page to measure the quality of your client's experience when interacting with a service or application. These metrics include page loads, web vitals, and errors.

To display your client pages in the table, you must [configure your CloudWatch RUM web client for X-Ray tracing](CloudWatch-RUM-get-started-create-app-monitor.md) and turn on Application Signals metrics for your client pages. Choose **Manage pages** to select which pages are enabled for Application Signals metrics.

Use the filter text box to find the client page or application monitor that you are interested in. Each filter that you create appears below the filter text box. Choose **Clear filters** to remove the table filter. Select **Group by Client** to group client pages by client. When grouped, choose the **+** icon next to a client name to expand the row and see all pages for that client.

![\[Client pages table\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-client-pages-table.png)


To select a client page, select the option next to a client page in the **Client pages** table. You will see a set of graphs that display detailed metrics. Hover over a point in a graph to see a popup containing more information. Select a point in a graph to open a diagnostic pane that shows correlated performance navigation events for the selected point in the graph. Choose an event ID from the list of navigation events to open the [CloudWatch RUM Page view](CloudWatch-RUM-view-data.md) for the chosen event.

![\[CloudWatch RUM client page requests\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-client-page-graphs-events.jpg)


**Note**  
To see AJAX errors within your client pages, use the [CloudWatch RUM web client](CloudWatch-RUM-configure-client.md) version 1.15 or newer.  
 Up to 100 operations, canaries, and client pages, and up to 250 dependencies, can be shown per service. 

## View Related metrics
<a name="ServiceDetail-relatedmetrics"></a>

Use the Related metrics tab to visualize multiple metrics, identify correlation patterns, and determine root causes of issues.

The metrics table shows three types of metrics:
+ Standard metrics – Application Signals collects standard application metrics from the services that it discovers. For more information, see [Standard application metrics collected](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AppSignals-MetricsCollected.html#AppSignals-StandardMetrics).
+ Runtime metrics – Application Signals uses the AWS Distro for OpenTelemetry SDK to automatically collect OpenTelemetry-compatible metrics from your Java and Python applications. For more information, see [Runtime metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AppSignals-MetricsCollected.html#AppSignals-RuntimeMetrics).
+ Custom metrics – Application Signals enables you to generate custom metrics from your application. For more information, see [Custom metrics with Application Signals](AppSignals-CustomMetrics.md).

You can access the Related metrics tab from Service Overview, Service Operations, Dependencies, Synthetics canaries, or RUM tabs.

![\[View related metrics\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/Custom_metrics.png)

+ The left navigation panel starts with all operations and dependencies unselected
+ The graph initially shows the Fault metric from the operation with the highest fault rate

Before you begin correlation analysis, make sure you have data points visible in Service Operations or Dependencies. To analyze correlations:

1. Open the Service Operations or Dependencies page.

1. Select a data point on any graph.

1. In the right panel, choose **Correlate with Other Metrics**.

1. On the **Related metrics** tab that opens, you'll see:
   + Your selected operation or dependency in the left navigation
   + Your selected metric graphed in the *Browse metrics* table
   + Correlated spans when you select a data point

To graph multiple metrics, select one or more metrics from the **Browse** view in the **Related metrics** tab. Choose **Graphed Metrics** to view all graphed metrics.

To filter metrics, use the left panel filters to focus on specific operations or dependencies and use the table header filter bar to search by name, type, or other attributes. These filtering options help you detect patterns and troubleshoot issues more efficiently.

To analyze related metrics in detail, select a data point in the **Related metrics** tab. You can then view:
+ Top Contributors – Analyzes metrics by running CloudWatch Logs Insights queries. These queries process embedded metric format (EMF) records that contain key attributes for detailed analysis of the following:
  + Latency measurements
  + Fault occurrences
  + Service availability metrics

  The following metrics do not support Top Contributors:
  + OTEL Metrics
  + Server-side Span Metrics

  You can view Top Contributors for RED Metrics and Client-side Span Metrics.
+ Correlated Spans – The Correlated Spans section works consistently with the Service Operations tab. To help you identify related traces and metrics, the correlation mechanism works by:
  + Comparing metric names with span attributes
  + Identifying matching patterns during the selected time period
  + Displaying relevant trace information

  To effectively analyze your metrics and spans together, you need to understand how different metric types correlate. Here are the key limitations:
  + OTEL Metrics don't correlate with spans because they use independent naming systems.
  + To correlate Server-side or Client-side Span Metrics with spans, include a Service dimension field in your configuration. Without this Service dimension, you cannot correlate these metrics with spans.
+ Log Applications – For information about application logs, see [Application logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ServiceDetail.html#ServiceDetail-operations).

# View your application topology and monitor operational health with the CloudWatch application map
<a name="ServiceMap"></a>

**Note**  
The CloudWatch application map replaces the Service Map. To see a map of your application based on AWS X-Ray traces, open the [X-Ray Trace Map](https://docs.aws.amazon.com/xray/latest/devguide/xray-console-servicemap.html). Choose **Trace Map** under the **X-Ray** section in the left navigation pane of the CloudWatch console. 

After enabling your application for Application Signals, the application map displays nodes representing your groups. You then drill down in these groups to view your services and their dependencies. Use the application map to view the topology of your application clients, synthetics canaries, services and dependencies, and monitor operational health. To view the application map, open the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/) and choose **Application Map** under the **Application Signals** section in the left navigation pane.



After you [enable your application for Application Signals](CloudWatch-Application-Signals-Enable.md), use the application map to make it easier to monitor your application's operational health:
+ View connections between client, canary, service, and dependency nodes to help you understand your application topology and execution flow. This is especially helpful if your service operators are not your development team. 
+ See which services are meeting or not meeting your [service level objectives (SLOs)](CloudWatch-ServiceLevelObjectives.md). When a service is not meeting your SLOs, you can quickly identify whether a downstream service or dependency might be contributing to the issue or impacting multiple upstream services. 
+ Select an individual client, synthetics canary, service, or dependency node to see related metrics. The [Service details](ServiceDetail.md) page shows more detailed information about operations, dependencies, synthetics canaries, and client pages. 
+ Filter and zoom the application map to make it easier to focus on a part of your application topology, or see the entire map. Create a filter by choosing one or more properties from the filter text box. As you choose each property, you are guided through filter criteria. You will see the complete filter below the filter text box. Choose **Clear filters** at any time to remove the filter. 
+ Monitor services across multiple AWS accounts in a single unified application map. Services from different accounts are clearly identified with account information, enabling unified observability for distributed applications.
+ Identify services not yet instrumented in your application. Application Signals automatically detects and displays services that haven't been instrumented yet, helping you achieve complete observability coverage. Un-instrumented services are visually distinguished on the map to help you prioritize instrumentation efforts.
+ Group and filter services to create customized views that match your workflows. This organization helps you quickly find and access the services you use most frequently.
+ Save your filtered and grouped views to quickly return to frequently used configurations.

## Explore the application map
<a name="Service-map-exploring"></a>

When you visit the application map, by default it shows services grouped by **Related services**. Related services group services based on their dependencies. For example, if Service A calls Service B, which calls Service C, they're grouped under Service A. You can view SLI health, metrics and service count for all services in each group.

![\[CloudWatch default application map grouped by related services.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-overview.png)


Choose a tab for information about exploring each kind of node and the edges (connections) between them.

### Dynamic grouping and filtering
<a name="Application-Map-Grouping"></a>

You can click the **Group by** dropdown to use different grouping options. By default, Application Map provides two groupings:
+ **Related services** - Groups services based on their dependencies
+ **Environment** - Groups services by their environment

If you want to define your own custom grouping, click **Manage groups** to define custom groups and then tag your services or add OTEL Resource Attributes with the group key.

**Note**  
To enable grouping via OTEL resource attributes, the CloudWatch agent version must be v1.300056.0 or later. 

![\[Create custom grouping panel\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-create-custom-grouping.png)


Default grouping in Application Signals automatically organizes services based on their downstream dependencies. The system analyzes the service dependency graph and creates groups where the root node (a service that no other service calls) becomes the group name. All services that the root service calls, either directly or indirectly, are automatically included in the group. For example, if Service A calls Service B, which in turn calls Service C, all three services are grouped together with Service A as the group name because it is the root of the dependency chain. This automatic grouping mechanism provides a natural way to visualize and manage related services based on their actual runtime interactions and dependencies.

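The grouping rule described above can be illustrated with a short sketch. This is not the Application Signals implementation; it is a minimal Python illustration that assumes the dependency graph is available as a dictionary mapping each service to the services it calls. The function name `group_by_root` and the sample service names are hypothetical. Cycles that have no root are ignored in this sketch.

```python
def group_by_root(calls):
    """Group services under each root, where a root is a service that no other service calls.

    `calls` maps a service name to the list of services it calls.
    Returns {root: sorted list of the root plus every service reachable from it}.
    """
    services = set(calls)
    called = set()
    for targets in calls.values():
        services.update(targets)
        called.update(targets)

    groups = {}
    for root in sorted(services - called):
        seen = set()
        stack = [root]
        # Depth-first walk over direct and indirect downstream calls.
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(calls.get(node, []))
        groups[root] = sorted(seen)
    return groups


if __name__ == "__main__":
    # Service A calls Service B, which calls Service C; Service D stands alone.
    example = {
        "Service A": ["Service B"],
        "Service B": ["Service C"],
        "Service D": [],
    }
    for root, members in group_by_root(example).items():
        print(root, "->", members)
```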
### Group actions and insights
<a name="Application-Map-Group-Actions"></a>

For each group, you can perform the following actions:
+ Click **View more** to view metrics charts, the last two change events, and last deployment time for the group  
![\[View more drawer for group in application map\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-view-more.png)
+ Click **View dashboard** to view metrics dashboard, change events table, and service list for the group  
![\[View application dashboard for group\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-team-overview.png)  
![\[View application dashboard for group with metrics graphs\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-team-overview-2.png)

Use **Group and filter** on the left bar to filter groups by the deployment time, SLI health status, or compute platform type of the services that they contain.

![\[Grouping and filter services on the application dashboard\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-grouping-filter.png)


You can also filter by account to view services from specific AWS accounts in your cross-account observability setup.

![\[Filter services by account on the application dashboard\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-account-filter.png)


Use the **Search and filter** bar to search for groups by name, or to find groups that contain a specific service environment or dependency. Filter by account ID to focus on services from specific accounts.

![\[Search and filter services in application map\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-search-and-filter.png)


### Configuring custom groups
<a name="Application-Map-Configure-Custom-Groups"></a>

Custom grouping allows you to organize your services logically based on your business requirements and operational priorities. This feature enables you to view and save defined views prioritized by your specific needs, create groups based on team ownership, and assemble groups of services needed for critical business transactions.

Create the custom group names (the group names you will see in the UI) and the corresponding group key names. Complete this step either from the Application Signals UI or using the [PutGroupingConfiguration](https://docs.aws.amazon.com/applicationsignals/latest/APIReference/API_PutGroupingConfiguration.html) API.

Group key names can be either an AWS tag key or an OTEL resource attribute for your service. When deciding between tags and OTEL resource attributes, consider your compute platform:
+ For single-service platforms (for example, Lambda or Auto Scaling Group) – Use AWS tags
+ For multi-service platforms (for example, Amazon EKS cluster) – Use OTEL resource attributes for more granular grouping

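If you prefer the API over the console, the request can be sketched in Python as follows. This is a hedged sketch, not a verified reference: the field names `GroupingAttributeDefinitions`, `GroupingName`, and `GroupingSourceKeys` and the boto3 method `put_grouping_configuration` reflect one reading of the PutGroupingConfiguration API reference, so confirm them there before use. The helper names and the sample group names are hypothetical.

```python
def build_grouping_definitions(groups):
    """Build the request list from {group name shown in the UI: [group key names]}.

    Each group key name is an AWS tag key or an OTEL resource attribute.
    The field names are assumptions; check them against the API reference.
    """
    return [
        {"GroupingName": name, "GroupingSourceKeys": list(keys)}
        for name, keys in groups.items()
    ]


def put_grouping_configuration(client, groups):
    """Send the grouping configuration using an Application Signals client.

    `client` is expected to be boto3.client("application-signals").
    """
    return client.put_grouping_configuration(
        GroupingAttributeDefinitions=build_grouping_definitions(groups)
    )


if __name__ == "__main__":
    # Print the payload only, so the sketch runs without AWS credentials.
    print(build_grouping_definitions({"Application": ["Application"], "Owner": ["Owner"]}))
```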
**Adding AWS tags**

Add an AWS tag to an Amazon EKS cluster, using the custom group key as the tag key and setting a tag value. When multiple services run in one Amazon EKS cluster, all of them are tagged with the same custom group key. For example, when Amazon EKS Cluster A has Service 1, Service 2, and Service 3 running, adding an AWS tag with key *Team X* to the cluster adds all three services to *Team X*. To add only specific services to *Team X*, add OTEL resource attributes for the services as shown below.

**Adding OTEL resource attributes**

To add an OTEL resource attribute, see the configuration below:

**General configuration**

Configure the `OTEL_RESOURCE_ATTRIBUTES` environment variable in your application using the custom group key-value pairs. The keys are listed under `aws.application_signals.metric_resource_keys` separated by `&`.

For example, to create custom groups using `Application=PetClinic` and `Owner=Test`, use the following:

```
OTEL_RESOURCE_ATTRIBUTES=Application=PetClinic,Owner=Test,aws.application_signals.metric_resource_keys=Application&Owner
```
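
The value above follows a fixed pattern, so it can be generated rather than typed by hand. The following Python sketch is one way to do that. The helper name `build_otel_resource_attributes` is hypothetical, and the rejection of keys or values that contain separator characters is a conservative assumption of this sketch, not a documented Application Signals rule.

```python
def build_otel_resource_attributes(group_attrs, extra_attrs=None):
    """Return an OTEL_RESOURCE_ATTRIBUTES value for custom group key-value pairs.

    group_attrs: custom group keys and values, for example {"Application": "PetClinic"}.
    extra_attrs: other resource attributes to emit first, such as service.name.
    """
    pairs = dict(extra_attrs or {})
    pairs.update(group_attrs)

    for key, value in pairs.items():
        # Separators inside a key or value would corrupt the comma-separated list.
        if any(ch in key for ch in ",=&") or any(ch in value for ch in ",="):
            raise ValueError(f"unsupported character in {key!r}={value!r}")

    parts = [f"{key}={value}" for key, value in pairs.items()]
    # The group keys are listed once more, joined by '&', so they can be recognized.
    parts.append(
        "aws.application_signals.metric_resource_keys=" + "&".join(group_attrs)
    )
    return ",".join(parts)


if __name__ == "__main__":
    print(build_otel_resource_attributes({"Application": "PetClinic", "Owner": "Test"}))
```

Passing `extra_attrs={"service.name": "my-service"}` (a hypothetical service name) places `service.name` first, which matches the ordering used in the Amazon EC2 and Amazon ECS examples.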

**Platform-specific configuration**

The following are the deployment specifications.

**Amazon EKS and native Kubernetes**

```
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  replicas: 1
  ...
  template:
    spec:
      containers:
      - name: your-app
        image: your-app-image
        env:
          ...
          - name: OTEL_RESOURCE_ATTRIBUTES
            value: Application=PetClinic,Owner=Test,aws.application_signals.metric_resource_keys=Application&Owner
```

**Amazon EC2**

Add `OTEL_RESOURCE_ATTRIBUTES` to your application start script. For the complete example, see [Adding `OTEL_RESOURCE_ATTRIBUTES`](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Application-Signals-Enable-EC2Main.html#CloudWatch-Application-Signals-Monitor-EC2).

```
...
OTEL_RESOURCE_ATTRIBUTES="service.name=$YOUR_SVC_NAME,Application=PetClinic,Owner=Test,aws.application_signals.metric_resource_keys=Application&Owner" \
java -jar $MY_JAVA_APP.jar
```

**Amazon ECS**

Add `OTEL_RESOURCE_ATTRIBUTES` to the TaskDefinition. For the complete example, see [Enable on Amazon ECS](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Application-Signals-Enable-ECSMain.html).

```
{
  "name": "my-app",
   ...
  "environment": [
    {
      "name": "OTEL_RESOURCE_ATTRIBUTES",
      "value": "service.name=$YOUR_SVC_NAME,Application=PetClinic,Owner=Test,aws.application_signals.metric_resource_keys=Application&Owner"
    }, 
    ...
  ]
}
```

**Lambda**

Add `OTEL_RESOURCE_ATTRIBUTES` to the environment variables of your Lambda function.

```
OTEL_RESOURCE_ATTRIBUTES="Application=PetClinic,Owner=Test,aws.application_signals.metric_resource_keys=Application&Owner"
```
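
If you set the variable programmatically, note that the Lambda `UpdateFunctionConfiguration` operation replaces the whole set of environment variables, so existing variables need to be read and merged first. The following Python sketch shows that merge. It assumes a client that exposes the boto3 Lambda methods `get_function_configuration` and `update_function_configuration`. The `InMemoryLambdaClient` class, the helper name `set_group_attributes`, and the function name `my-function` are hypothetical and exist only so that the sketch runs without AWS access.

```python
class InMemoryLambdaClient:
    """Stand-in for boto3.client("lambda") so this sketch runs without AWS access."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def get_function_configuration(self, FunctionName):
        return {
            "FunctionName": FunctionName,
            "Environment": {"Variables": dict(self.variables)},
        }

    def update_function_configuration(self, FunctionName, Environment):
        self.variables = dict(Environment["Variables"])
        return {"FunctionName": FunctionName, "Environment": Environment}


def set_group_attributes(lambda_client, function_name, otel_value):
    """Merge OTEL_RESOURCE_ATTRIBUTES into the function's existing environment variables."""
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    # "Environment" is absent when the function has no variables yet.
    variables = dict(config.get("Environment", {}).get("Variables", {}))
    variables["OTEL_RESOURCE_ATTRIBUTES"] = otel_value
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": variables},
    )
    return variables


if __name__ == "__main__":
    client = InMemoryLambdaClient({"LOG_LEVEL": "info"})
    value = (
        "Application=PetClinic,Owner=Test,"
        "aws.application_signals.metric_resource_keys=Application&Owner"
    )
    print(set_group_attributes(client, "my-function", value))
```

To run this against a real function, pass `boto3.client("lambda")` in place of the stand-in client.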

### Viewing services within groups
<a name="Application-Map-Service-View"></a>

To view services and their dependencies in a group, click the group name. A map of the services inside the group appears. Each service node shows SLI health, metrics, and platform details. Services with an SLI breach are highlighted so that they are easy to recognize.

![\[CloudWatch application map services within group.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/View-services-groups.png)


Un-instrumented services are displayed with a distinctive visual indicator (such as a dashed border or different color) to differentiate them from instrumented services. Hover over an un-instrumented service node to see instrumentation guidance and links to setup documentation.

![\[Filter by uninstrumented services on application map\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-uninstrumented-filter.png)


All canaries, RUM clients, and AWS service nodes are collapsed by default. If services in this group call services that are not part of this group, those services are also collapsed by default.

![\[Canary nodes are collapsed into a group in application map\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-canary-collapse.png)


If your map is still too large to investigate effectively, you can apply nested grouping to narrow down your investigation. For example, after grouping services by **Business Unit**, if you still have too many services in a group, use the Group by dropdown to select **Team**, creating a nested grouping structure.

![\[Nested grouping in application map\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-nested-grouping.png)


### Service insights and details
<a name="Application-Map-Service-Details"></a>

While on this page, you can also click **Save view** next to the search bar to save your view, so that you don't have to apply the same grouping and filtering again next time.

![\[Save grouping configuration\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-save-view.png)


Click **View more** in a service node to view Service Audit, Change events, SLI health, and Metrics graphs.

![\[CloudWatch application map service insights.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-service-view-more.png)


To view service operations and other service details, click **View dashboard** to go to the service overview page.

![\[CloudWatch application map service overview.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-service-overview.png)


Alternatively, click an edge to view metrics for a specific dependency call of a service.

![\[CloudWatch application map node edge drawer\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-edge.png)


### Change Events
<a name="Application-Map-Change-Events"></a>

Track change events across your application with Application Signals' automatic processing of CloudTrail events. Monitor configuration and deployment events for services and their dependencies, providing immediate context for operational analysis and troubleshooting. Change event detection is enabled alongside service discovery enablement through the CloudWatch Console or StartDiscovery API. For EKS services, deployment detection requires that the EKS services are instrumented with the Application Signals instrumentation SDK. Application Signals automatically correlates deployment times with performance changes, helping you quickly identify if recent deployments contributed to service issues. View change event history and impact across your services without additional configuration or setup requirements.

### Audit findings
<a name="Application-Map-Audit-Findings"></a>

Discover critical insights through Application Signals' audit findings. The service analyzes your applications to report significant observations and potential problems, simplifying root cause analysis. These automated findings consolidate relevant traces, eliminating the need to navigate through multiple clicks. The audit system helps teams quickly identify issues and their underlying causes, enabling faster problem resolution. 

For services running on Amazon Bedrock, Application Signals automatically monitors GenAI token usage patterns. The audit system detects anomalies in input and output token consumption, comparing current usage against historical baselines. When token usage exceeds normal patterns, audit findings provide detailed analysis including token consumption trends, cost implications, and recommendations for optimization. This helps teams identify inefficient prompts, unexpected token spikes, and opportunities to reduce GenAI operational costs.

### Cross-Account Observability on Application Map
<a name="Application-Map-Cross-Account"></a>

Application Signals supports cross-account observability, allowing you to monitor and visualize services distributed across multiple AWS accounts in a single unified application map. This capability is essential for organizations with multi-account architectures following AWS best practices.

**Key Capabilities:**
+ *Unified View*: View services from multiple AWS accounts in a single application map, providing a complete picture of your distributed application architecture.
+ *Account Identification*: Each service node clearly displays its account ID and region, making it easy to identify service ownership and location.
+ *Centralized Monitoring*: Monitor the health, performance, and SLO status of services across all connected accounts from a single monitoring account.
+ *Cross-Account Filtering*: Filter and group services by account ID to focus on specific accounts or view cross-account interactions.

**How It Works:**

Application Signals uses AWS Organizations and cross-account sharing to enable observability across multiple accounts. To set up cross-account observability, see [CloudWatch cross-account observability](CloudWatch-Unified-Cross-Account.md).

------
#### [ View your application services ]

**Service (Instrumented)**

You can view your application services and the status of their SLOs and service level indicators (SLIs) in the **Application Map**. If you didn't create SLOs for a service, choose the **Create SLO** button below the service node.

The **Application Map** displays all of your services. It also shows the clients and canaries that consume each service and the dependencies that your services call, as shown in the following image:

![\[A CloudWatch application map displaying healthy and unhealthy service.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-map-service-healthy-unhealthy.png)


When you select a service node, a pane opens displaying detailed service information: 
+ Total error and fault rate.
+ The number of SLIs and SLOs that are `healthy` or `unhealthy`. 
+ The option to view more information about an SLO.
+ The `Cluster`, `Namespace`, and `Workload` for services hosted in Amazon EKS, or Environment for services hosted in Amazon ECS or Amazon EC2. For Amazon EKS-hosted services, choose any link to open CloudWatch Container Insights.
+ AccountId and region.
+ The **Change** section showing recent change events and the last deployment time.
+ The **Operational Audit** tab providing automated audit findings and recommendations.
+ Service metrics charts for availability, latency, faults, and errors.

Select an edge or connection between a service node and a downstream service or dependency node. This opens a pane containing top paths by fault rate, latency, and error rate, as shown in the following example image. Choose any link in the pane to open the [Service details](ServiceDetail.md) page and see detailed information for the chosen service or dependency.

![\[A CloudWatch application map service edge\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/App-signals-service-edge.png)


When you select an edge, a pane opens displaying the following information:
+ Total request count, latency, error rate and fault rate
+ Top path by fault rate
+ Top path by latency
+ Top path by error rate

**Service (Un-instrumented)**

Un-instrumented services appear on the Application Map even when they haven't been configured with Application Signals. These services are automatically discovered by leveraging Resource Explorer using application names and tags. The system can automatically detect up to 3,000 resources in your AWS account.

When you select an un-instrumented service node, a pane opens displaying:
+ Service name and identification information
+ AccountId and region where the service is detected
+ Instrumentation status and guidance
+ Call to action button "Enable Application Signals" that provides setup instructions
+ Compute platform type (if detectable)

Un-instrumented services help you:
+ Identify gaps in your observability coverage
+ Prioritize which services to instrument next based on their position in your architecture
+ Understand the complete application topology even before full instrumentation
+ Plan instrumentation rollout across your organization

**Note**  
Un-instrumented services display limited telemetry data since they don't actively send metrics or traces.

![\[CloudWatch application map instrumentation filter\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/explore-application-map-instrumentation-filter.png)


------
#### [ View dependencies ]

Your application dependencies are displayed on the application map, connected to the services that call them.

Choose a dependency node to open a pane containing the error rate and fault rate, along with metrics charts for requests, availability, latency, fault rate, and error rate.

If the dependency node is a service or resource, the pane also displays change events for the requested time range.

![\[A CloudWatch application map displaying an expandable AWS service dependency node.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-map-dependency.png)


------
#### [ View clients ]

After you [turn on X-Ray tracing](CloudWatch-RUM-get-started-create-app-monitor.md) for your CloudWatch RUM web clients, they display on the application map connected to services they call.

Choose a client node to open a pane displaying detailed client information:
+ Metrics for page loads, average load time, errors, and average web vitals
+ A graph displaying a breakdown of errors
+ A link to display the client details in CloudWatch RUM

![\[A CloudWatch application map displaying an expandable client node.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-map-client.png)


Choose **View dashboard** to open the client details.

------
#### [ View synthetics canaries ]

To view canaries on your application map, [turn on X-Ray tracing](CloudWatch-RUM-get-started-create-app-monitor.md) for your CloudWatch Synthetics canaries. Once tracing is enabled, canaries appear on the application map connected to the services that they call.

The system groups canaries together by default into a single expandable icon. The detailed canary information pane displays metrics, traces, and status information.

Choose a canary node to open a pane displaying detailed canary information, as shown in the following image:

![\[A CloudWatch application map displaying an expandable synthetics canary node.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/service-map-canary.png)


Choose **View dashboard** to open the canary details.

------

# Application Observability for AWS GitHub Action
<a name="Service-Application-Observability-for-AWS-GitHub-Action"></a>

The Application Observability for AWS GitHub Action provides an end-to-end application observability investigation workflow that connects your source code and live production telemetry data to an AI agent. It uses CloudWatch MCP servers and generates custom prompts to provide the context that AI agents need for troubleshooting and applying code fixes.

The action sets up and configures [CloudWatch Application Signals MCP Server](https://awslabs.github.io/mcp/servers/cloudwatch-applicationsignals-mcp-server) and [CloudWatch MCP Server](https://awslabs.github.io/mcp/servers/cloudwatch-mcp-server), enabling them to access live telemetry data as troubleshooting context. You can use your preferred AI model - whether through your own API key, a third-party model, or Amazon Bedrock - for application performance investigations.

To get started, mention `@awsapm` in your GitHub issues to trigger the AI agent. The agent will troubleshoot production issues, implement fixes, and enhance observability coverage based on your live application data.

This action itself does not incur any direct costs. However, using this action may result in charges for AWS services and AI model usage. Please refer to the [cost considerations documentation](https://github.com/marketplace/actions/application-observability-for-aws#-cost-considerations) for detailed information about potential costs.

## Getting Started
<a name="Service-Application-Observability-for-AWS-GitHub-Action-getting-started"></a>

This action configures AI agents within your GitHub workflow by generating AWS-specific MCP configurations and custom observability prompts. You only need to provide an IAM role to assume and either a [Bedrock Model ID](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) that you want to use or an API token from your existing LLM subscription. The example below demonstrates a workflow template that integrates this action with [Anthropic's claude-code-base-action](https://github.com/anthropics/claude-code-base-action) to run automated investigations.

### Prerequisites
<a name="Service-Application-Observability-for-AWS-GitHub-Action-prerequisites"></a>

Before you begin, ensure you have the following:
+ **GitHub Repository Permissions**: Write access or higher to the repository (required to trigger the action)
+ **AWS IAM Role**: An IAM role configured with OpenID Connect (OIDC) for GitHub Actions with permissions for:
  + CloudWatch Application Signals and CloudWatch access
  + Amazon Bedrock model access (if using Bedrock models)
+ **GitHub Token**: The workflow automatically uses `GITHUB_TOKEN` with the required permissions

### Setup Steps
<a name="Service-Application-Observability-for-AWS-GitHub-Action-setup-steps"></a>

#### Step 1: Set up AWS Credentials
<a name="Service-Application-Observability-for-AWS-GitHub-Action-step1"></a>

This action relies on the [aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials) action to set up AWS authentication in your GitHub Actions Environment. We recommend using OpenID Connect (OIDC) to authenticate with AWS. OIDC allows your GitHub Actions workflows to access AWS resources using short-lived AWS credentials so you do not have to store long-term credentials in your repository.

1. **Create an IAM Identity Provider**

   First, create an IAM Identity Provider that trusts GitHub's OIDC endpoint in the AWS Management Console:

   1. Open the IAM console

   1. Click **Identity providers** under **Access management**

   1. Click the **Add provider** button to add the GitHub identity provider, if it does not already exist

   1. Select **OpenID Connect** as the identity provider type

   1. Enter `https://token.actions.githubusercontent.com` for the **Provider URL** input box

   1. Enter `sts.amazonaws.com` for the **Audience** input box

   1. Click the **Add provider** button

1. **Create an IAM Policy**

   Create an IAM policy with the required permissions for this action. See the [Required Permissions](#Service-Application-Observability-for-AWS-GitHub-Action-required-permissions) section below for details.

1. **Create an IAM Role**

   Create an IAM role (for example, `AWS_IAM_ROLE_ARN`) in the AWS Management Console with the following trust policy template. This allows authorized GitHub repositories to assume the role:

   ```
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
           "Federated": "arn:aws:iam::<AWS_ACCOUNT_ID>:oidc-provider/token.actions.githubusercontent.com"
         },
         "Action": "sts:AssumeRoleWithWebIdentity",
         "Condition": {
           "StringEquals": {
             "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
           },
           "StringLike": {
             "token.actions.githubusercontent.com:sub": "repo:<GITHUB_ORG>/<GITHUB_REPOSITORY>:ref:refs/heads/<GITHUB_BRANCH>"
           }
         }
       }
     ]
   }
   ```

   Replace the following placeholders in the template:
   + `<AWS_ACCOUNT_ID>` - Your AWS account ID
   + `<GITHUB_ORG>` - Your GitHub organization name
   + `<GITHUB_REPOSITORY>` - Your repository name
   + `<GITHUB_BRANCH>` - Your branch name (e.g., main)

1. **Attach the IAM Policy**

   In the role's Permissions tab, attach the IAM policy you created in step 2.
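
If you prefer to generate the trust policy instead of editing the template by hand, the placeholder substitution can be sketched in Python as follows. This is a minimal illustration, not part of the action: the function name `render_trust_policy` and the example account, organization, and repository values are assumptions made for this sketch.

```python
import json


def render_trust_policy(account_id, org, repo, branch):
    """Fill the trust policy template from this section with concrete values."""
    oidc_host = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_host}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{oidc_host}:aud": "sts.amazonaws.com"},
                    "StringLike": {
                        f"{oidc_host}:sub": f"repo:{org}/{repo}:ref:refs/heads/{branch}"
                    },
                },
            }
        ],
    }


# Hypothetical example values; replace them with your own.
policy = render_trust_policy("111122223333", "example-org", "example-repo", "main")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the trust relationship editor for the role. Generating the document this way keeps the OIDC host name consistent across the `Federated` principal and both condition keys.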

For more information about configuring OIDC with AWS, see the [configure-aws-credentials OIDC Quick Start Guide](https://github.com/aws-actions/configure-aws-credentials/tree/main?tab=readme-ov-file#quick-start-oidc-recommended).
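
As an alternative to the console steps for creating the identity provider, the same setup can be sketched with the AWS SDK for Python (Boto3). This is a hedged sketch rather than an official procedure: the helper name `ensure_github_oidc_provider` is hypothetical, and the code assumes that you pass in a Boto3 IAM client whose credentials are allowed to list and create OIDC providers.

```python
GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"
AUDIENCE = "sts.amazonaws.com"


def ensure_github_oidc_provider(iam_client):
    """Return the ARN of the GitHub OIDC provider, creating it if it is missing."""
    suffix = ":oidc-provider/" + GITHUB_OIDC_HOST
    listing = iam_client.list_open_id_connect_providers()
    for provider in listing["OpenIDConnectProviderList"]:
        if provider["Arn"].endswith(suffix):
            return provider["Arn"]  # Already present, nothing to create.
    created = iam_client.create_open_id_connect_provider(
        Url="https://" + GITHUB_OIDC_HOST,
        ClientIDList=[AUDIENCE],
    )
    return created["OpenIDConnectProviderArn"]


# Usage (requires boto3 and suitable IAM permissions):
#   import boto3
#   print(ensure_github_oidc_provider(boto3.client("iam")))
```

Checking for an existing provider first mirrors the "if it does not already exist" condition in the console steps, because an account can have only one provider for a given URL.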

#### Step 2: Configure Secrets and Add Workflow
<a name="Service-Application-Observability-for-AWS-GitHub-Action-step2"></a>

1. **Configure Repository Secrets**

   Go to your repository → Settings → Secrets and variables → Actions.
   + Create a new repository secret named `AWS_IAM_ROLE_ARN` and set its value to the ARN of the IAM role you created in Step 1.
   + (Optional) Create a repository variable named `AWS_REGION` to specify your AWS region (defaults to `us-east-1` if not set)

1. **Add the Workflow File**

   The following example workflow uses this action with Amazon Bedrock models. Create the Application Observability Investigation workflow from this template in the `.github/workflows` directory of your GitHub repository.

   ```
   name: Application observability for AWS
   
   on:
     issue_comment:
       types: [created, edited]
     issues:
       types: [opened, assigned, edited]
   
   jobs:
     awsapm-investigation:
       if: |
         (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@awsapm')) ||
         (github.event_name == 'issues' && (contains(github.event.issue.body, '@awsapm') || contains(github.event.issue.title, '@awsapm')))
       runs-on: ubuntu-latest
   
       permissions:
         contents: write        # To create branches for PRs
         pull-requests: write   # To post comments on PRs
         issues: write          # To post comments on issues
         id-token: write        # Required for AWS OIDC authentication
   
       steps:
         - name: Checkout repository
           uses: actions/checkout@v4
   
         - name: Configure AWS credentials
           uses: aws-actions/configure-aws-credentials@v4
           with:
             role-to-assume: ${{ secrets.AWS_IAM_ROLE_ARN }}
             aws-region: ${{ vars.AWS_REGION || 'us-east-1' }}
   
         # Step 1: Prepare AWS MCP configuration and investigation prompt
         - name: Prepare Investigation Context
           id: prepare
           uses: aws-actions/application-observability-for-aws@v1
           with:
             bot_name: "@awsapm"
             cli_tool: "claude_code"
   
         # Step 2: Execute investigation with Claude Code
         - name: Run Claude Investigation
           id: claude
           uses: anthropics/claude-code-base-action@beta
           with:
             use_bedrock: "true"
             # Set to any Bedrock Model ID
             model: "us.anthropic.claude-sonnet-4-5-20250929-v1:0"
             prompt_file: ${{ steps.prepare.outputs.prompt_file }}
             mcp_config: ${{ steps.prepare.outputs.mcp_config_file }}
             allowed_tools: ${{ steps.prepare.outputs.allowed_tools }}
   
         # Step 3: Post results back to GitHub issue/PR (reuse the same action)
         - name: Post Investigation Results
           if: always()
           uses: aws-actions/application-observability-for-aws@v1
           with:
             cli_tool: "claude_code"
             comment_id: ${{ steps.prepare.outputs.awsapm_comment_id }}
             output_file: ${{ steps.claude.outputs.execution_file }}
             output_status: ${{ steps.claude.outputs.conclusion }}
   ```

   **Configuration Note:**
   + This workflow triggers automatically when `@awsapm` is mentioned in an issue or comment
   + The workflow uses the `AWS_IAM_ROLE_ARN` secret configured in the previous step
   + Update the model parameter in Step 2 to specify your preferred Amazon Bedrock model ID
   + You can customize the bot name (e.g., `@awsapm-prod`, `@awsapm-staging`) in the `bot_name` parameter to support different environments
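
To reason about when the job runs, the `if:` condition of the workflow above can be modeled in Python. This is an illustrative sketch only: the function names `mentions_bot` and `should_run` are hypothetical, and the event dictionaries are simplified stand-ins for the GitHub webhook payloads. The sketch lowercases both strings because the GitHub Actions `contains()` function is not case sensitive.

```python
def mentions_bot(text, bot_name="@awsapm"):
    """Mimic the GitHub Actions contains() function, which ignores case."""
    return bot_name.lower() in (text or "").lower()


def should_run(event_name, event, bot_name="@awsapm"):
    """Mirror the `if:` condition of the awsapm-investigation job."""
    if event_name == "issue_comment":
        return mentions_bot(event.get("comment", {}).get("body"), bot_name)
    if event_name == "issues":
        issue = event.get("issue", {})
        return mentions_bot(issue.get("body"), bot_name) or mentions_bot(
            issue.get("title"), bot_name
        )
    return False


# Simplified, made-up payloads for illustration.
print(should_run("issue_comment", {"comment": {"body": "@awsapm, can you post a fix?"}}))
print(should_run("issues", {"issue": {"title": "Latency spike", "body": None}}))
```

Because the check is a substring match, a condition that looks for `@awsapm` also matches mentions of `@awsapm-prod` or `@awsapm-staging`. If you change `bot_name`, consider updating the string in the `if:` condition to match.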

#### Step 3: Start Using the Action
<a name="Service-Application-Observability-for-AWS-GitHub-Action-step3"></a>

Once the workflow is configured, mention `@awsapm` in any GitHub issue to trigger an AI-powered investigation. The action will analyze your request, access live telemetry data, and provide recommendations or implement fixes automatically.

**Example Use Cases:**

1. Investigate performance issues and post a fix:

   `@awsapm, can you help me investigate availability issues in my appointment service?`  
![\[Investigation of an availability issue in a GitHub issue\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/github-availability-issue-investigate.png)

   `@awsapm, can you post a fix?`  
![\[Pull request with a fix for the availability issue\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/github-availability-issue-pr-fix.png)

1. Enable instrumentation:

   `@awsapm, please enable Application Signals for lambda-audit-service and create a PR with the required changes.`

1. Query telemetry data:

   `@awsapm, how many GenAI tokens have been consumed by my services in the past 24 hours?`

**What Happens Next:**

1. The workflow detects the `@awsapm` mention and triggers the investigation

1. The AI agent accesses your live AWS telemetry data through the configured MCP servers

1. The agent analyzes the issue and either:
   + Posts findings and recommendations directly in the issue
   + Creates a pull request with code changes (for instrumentation or fixes)

1. You can review the results and continue the conversation by mentioning `@awsapm` again with follow-up questions

## Security
<a name="Service-Application-Observability-for-AWS-GitHub-Action-security"></a>

This action prioritizes security with strict access controls, OIDC-based AWS authentication, and built-in protections against prompt injection attacks. Only users with write access or higher can trigger the action, and all operations are scoped to the specific repository.

For detailed security information, see the [Security Documentation](https://github.com/aws-actions/application-observability-for-aws/blob/main/docs/security.md), which covers the following:
+ Access control and permission requirements
+ AWS IAM permissions and OIDC configuration
+ Prompt injection risks and mitigations
+ Security best practices

## Configuration
<a name="Service-Application-Observability-for-AWS-GitHub-Action-configuration"></a>

### Required Permissions
<a name="Service-Application-Observability-for-AWS-GitHub-Action-required-permissions"></a>

The IAM role assumed by GitHub Actions must have the following permissions.

**Note**: `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` are required only if you use Amazon Bedrock models.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "application-signals:ListServices",
                "application-signals:GetService",
                "application-signals:ListServiceOperations",
                "application-signals:ListServiceLevelObjectives",
                "application-signals:GetServiceLevelObjective",
                "application-signals:ListAuditFindings",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:DescribeAlarmHistory",
                "cloudwatch:ListMetrics",
                "cloudwatch:GetMetricData",
                "cloudwatch:GetMetricStatistics",
                "logs:DescribeLogGroups",
                "logs:DescribeQueryDefinitions",
                "logs:ListLogAnomalyDetectors",
                "logs:ListAnomalies",
                "logs:StartQuery",
                "logs:StopQuery",
                "logs:GetQueryResults",
                "logs:FilterLogEvents",
                "xray:GetTraceSummaries",
                "xray:GetTraceSegmentDestination",
                "xray:BatchGetTraces",
                "xray:ListRetrievedTraces",
                "xray:StartTraceRetrieval",
                "servicequotas:GetServiceQuota",
                "synthetics:GetCanary",
                "synthetics:GetCanaryRuns",
                "s3:GetObject",
                "s3:ListBucket",
                "iam:GetRole",
                "iam:ListAttachedRolePolicies",
                "iam:GetPolicy",
                "iam:GetPolicyVersion",
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream"
            ],
            "Resource": "*"
        }
    ]
}
```
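
If you do not use Amazon Bedrock models, the two Bedrock actions can be removed from the policy. The following Python sketch shows one way to do that programmatically. The function name `without_bedrock_actions` is hypothetical, and the sample policy is an abbreviated version of the policy above, used only for illustration.

```python
import copy
import json


def without_bedrock_actions(policy):
    """Return a copy of an IAM policy with all bedrock:* actions removed."""
    trimmed = copy.deepcopy(policy)
    for statement in trimmed["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]  # IAM allows a single action as a string.
        statement["Action"] = [
            a for a in actions if not a.lower().startswith("bedrock:")
        ]
    # Drop statements that no longer grant any action.
    trimmed["Statement"] = [s for s in trimmed["Statement"] if s["Action"]]
    return trimmed


# Abbreviated sample of the policy above, for illustration only.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "application-signals:ListServices",
                "cloudwatch:GetMetricData",
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "*",
        }
    ],
}

print(json.dumps(without_bedrock_actions(sample_policy), indent=4))
```

The function works on a deep copy, so the original policy document stays unchanged and can still be used for roles that do need Bedrock access.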

## Documentation
<a name="Service-Application-Observability-for-AWS-GitHub-Action-documentation"></a>

For more information, check out:
+ [CloudWatch Application Signals Documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Application-Monitoring-Intro.html) - Learn about CloudWatch Application Signals features and capabilities
+ [Application observability for AWS Action Public Documentation](https://github.com/marketplace/actions/application-observability-for-aws) - Detailed guides and tutorials

# Example: Use Application Signals to resolve an operational health issue
<a name="Services-example-scenario"></a>

The following scenario provides an example of how Application Signals can be used to monitor your services and identify service quality issues. Drill down to identify potential root causes and take action to resolve the issue. This example is focused on a pet clinic application composed of several microservices that call AWS services such as DynamoDB. 

Jane is part of a DevOps team that oversees the operational health of a pet clinic application. Jane's team is committed to ensuring that the application is highly available and responsive. They use [service level objectives (SLOs)](CloudWatch-ServiceLevelObjectives.md) to measure application performance against these business commitments. She receives an alert about several unhealthy service level indicators (SLIs). She opens the CloudWatch console and navigates to the Services page, where she sees several services in an unhealthy state.

![\[Services with unhealthy SLIs\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/example-scenario-services-page.jpg)


At the top of the page, Jane sees that the `visits-service` is the top service by fault rate. She selects the link in the graph, which opens the Service detail page for the service. She sees that there is an unhealthy operation in the Service operations table. She selects this operation and sees in the Volume and Availability graph that there are periodic call volume spikes that seem to correlate to dips in availability. 

![\[Service operation volume and availability\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/example-scenario-unhealthy-operation.png)


In order to look closer at the dips in service availability, Jane selects one of the availability data points in the graph. A drawer opens showing X-Ray traces that are correlated to the selected data point. She sees that there are multiple traces containing faults. 

![\[Service availability and correlated traces\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/example-scenario-correlated-traces.jpg)


Jane selects one of the correlated traces with a fault status, which opens the X-Ray Trace detail page for the selected trace. Jane scrolls down to the Segments Timeline section and follows the call path until she sees that calls to a DynamoDB table are returning errors. She selects the DynamoDB segment and navigates to the Exceptions tab of the right-side drawer. 

![\[Trace segment with DynamoDB errors\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/example-scenario-DDB-segment.jpg)


Jane sees that a DynamoDB resource is misconfigured, resulting in errors during spikes in customer requests. The DynamoDB table's level of provisioned throughput is periodically exceeded, resulting in service availability issues and unhealthy SLIs. Based on this information, her team is able to configure a higher level of provisioned throughput and ensure high availability of the application. 

# Example: Use Application Signals to troubleshoot generative AI applications interacting with Amazon Bedrock models
<a name="Services-example-scenario-GenerativeAI"></a>

You can use Application Signals to troubleshoot your generative AI applications that interact with Amazon Bedrock models. Application Signals streamlines this process by providing out-of-the-box telemetry data that gives deeper insight into your application's interactions with large language models (LLMs). It helps address key use cases such as:
+ Model configuration issues
+ Model usage costs
+ Model latency
+ Reasons that model response generation stopped

[Enabling Application Signals](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Application-Signals-Enable.html) with LLM/GenAI Observability provides real-time visibility into your application's interactions with Amazon Bedrock services. Application Signals automatically generates and correlates performance metrics and traces for Amazon Bedrock API calls.

Application Signals currently supports the following LLMs from Amazon Bedrock:
+ AI21 Jamba
+ Amazon Titan
+ Anthropic Claude
+ Cohere Command
+ Meta Llama
+ Mistral AI
+ Amazon Nova

## Fine-grained metrics and traces
<a name="Services-example-scenario-GenerativeAI-metricandtraces"></a>

For each Amazon Bedrock API call, Application Signals generates detailed performance metrics at the resource level, including:
+ Model ID
+ Guardrails ID
+ Knowledge Base ID
+ Bedrock Agent ID

Additionally, correlated trace spans at the same level help provide a comprehensive view of request execution and dependencies.

![\[Performance metrics using Application Signals.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/AppSignalsAIExample.png)


## OpenTelemetry GenAI attributes support
<a name="Services-example-scenario-GenerativeAI-OpenTelemetryAISupport"></a>

Application Signals generates the following GenAI attributes for Amazon Bedrock API calls, following the OpenTelemetry semantic conventions. These attributes help you analyze model usage, cost, and response quality, and you can query them through [Transaction Search](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Transaction-Search.html) for deeper insights.
+ `gen_ai.system`
+ `gen_ai.request.model`
+ `gen_ai.request.max_tokens`
+ `gen_ai.request.temperature`
+ `gen_ai.request.top_p`
+ `gen_ai.usage.input_tokens`
+ `gen_ai.usage.output_tokens`
+ `gen_ai.response.finish_reasons`

![\[GenAI attributes using Application Signals.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/AppSignalsAIExample_1.png)


For example, you can use the analytics capability of Transaction Search to compare token usage and cost across different LLMs for the same prompt, which helps you select a cost-efficient model.

![\[GenAI attributes using Application Signals.\]](http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/images/AppSignalsAIExample_2.png)
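
The kind of per-model comparison described above can also be sketched offline in Python, for example on span attributes that you have exported. This is a minimal sketch under stated assumptions: the function name `summarize_token_usage` is hypothetical, each span is assumed to be a flat dictionary keyed by the attribute names listed earlier, and the model names and token counts in the sample data are made up.

```python
from collections import defaultdict


def summarize_token_usage(spans):
    """Total input and output tokens per model from span attribute dicts."""
    totals = defaultdict(lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0})
    for attributes in spans:
        model = attributes.get("gen_ai.request.model")
        if model is None:
            continue  # Not a GenAI span.
        entry = totals[model]
        entry["calls"] += 1
        entry["input_tokens"] += int(attributes.get("gen_ai.usage.input_tokens", 0))
        entry["output_tokens"] += int(attributes.get("gen_ai.usage.output_tokens", 0))
    return dict(totals)


# Hypothetical span attributes; the model names and token counts are made up.
sample_spans = [
    {"gen_ai.request.model": "example-model-a", "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 300},
    {"gen_ai.request.model": "example-model-a", "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 280},
    {"gen_ai.request.model": "example-model-b", "gen_ai.usage.input_tokens": 120, "gen_ai.usage.output_tokens": 450},
    {"http.method": "GET"},
]

usage = summarize_token_usage(sample_spans)
for model, entry in sorted(usage.items()):
    print(model, entry)
```

To turn the totals into a cost estimate, multiply the input and output token counts by the per-token prices that apply to each model in your account. Prices are intentionally left out of this sketch.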


For more information, see [Improve Amazon Bedrock Observability with CloudWatch Application Signals](https://aws.amazon.com/blogs/mt/improve-amazon-bedrock-observability-with-amazon-cloudwatch-appsignals/).