

# (Optional) Configuring Application Signals
<a name="CloudWatch-Application-Signals-Configure"></a>

This section contains information about configuring CloudWatch Application Signals.

**Topics**
+ [Trace sampling rate](Application-Signals-SampleRate.md)
+ [Enable trace to log correlation](Application-Signals-TraceLogCorrelation.md)
+ [Enable metric to log correlation](Application-Signals-MetricLogCorrelation.md)
+ [Manage high-cardinality operations](Application-Signals-Cardinality.md)

# Trace sampling rate
<a name="Application-Signals-SampleRate"></a>

By default, when you enable Application Signals, X-Ray centralized sampling is enabled using the default sampling rate settings of `reservoir=1/s` and `fixed_rate=5%`. The environment variables for the AWS Distro for OpenTelemetry (ADOT) SDK agent are set as follows.


| Environment variable | Value | Note | 
| --- | --- | --- | 
| `OTEL_TRACES_SAMPLER` | `xray` |  | 
| `OTEL_TRACES_SAMPLER_ARG` | `endpoint=http://cloudwatch-agent.amazon-cloudwatch:2000` | Endpoint of the CloudWatch agent | 
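
If you need to set these values yourself (for example, when starting your application directly on a host), the equivalent shell exports might look like the following sketch. In most managed setups, these variables are injected for you.

```
# Sketch: the default centralized sampling settings, taken from the table above
export OTEL_TRACES_SAMPLER=xray
export OTEL_TRACES_SAMPLER_ARG="endpoint=http://cloudwatch-agent.amazon-cloudwatch:2000"
```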

For information about changing the sampling configuration, see the following:
+ To change X-Ray sampling, see [ Configure sampling rules](https://docs.aws.amazon.com/xray/latest/devguide/aws-xray-interface-console.html#xray-console)
+ To change ADOT sampling, see [ Configuring the OpenTelemetry Collector for X-Ray remote sampling](https://aws-otel.github.io/docs/getting-started/remote-sampling)

If you want to disable X-Ray centralized sampling and use local sampling instead, set the following values for the ADOT SDK Java agent. The following example sets the sampling rate to 5%.


| Environment variable | Value | 
| --- | --- | 
| `OTEL_TRACES_SAMPLER` | `parentbased_traceidratio` | 
| `OTEL_TRACES_SAMPLER_ARG` | `0.05` | 
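
For example, a sketch of exporting these values in the shell that launches your application:

```
# Sketch: local parent-based sampling at a 5% rate
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.05
```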

For information about more advanced sampling settings, see [ OTEL\_TRACES\_SAMPLER](https://opentelemetry.io/docs/concepts/sdk-configuration/general-sdk-configuration/#otel_traces_sampler) in the OpenTelemetry documentation.

# Enable trace to log correlation
<a name="Application-Signals-TraceLogCorrelation"></a>

You can enable *trace to log correlation* in Application Signals. This automatically injects trace IDs and span IDs into the relevant application logs. Then, when you open a trace detail page in the Application Signals console, the relevant log entries (if any) that correlate with the current trace automatically appear at the bottom of the page.

For example, suppose you notice a spike in a latency graph. You can choose the point on the graph to load the diagnostics information for that point in time. You then choose the relevant trace to get more information. When you view the trace information, you can scroll down to see the logs associated with the trace. These logs might reveal patterns or error codes associated with the issues causing the latency spike.

To achieve trace log correlation, Application Signals relies on the following:
+ [ Logger MDC auto-instrumentation](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/logger-mdc-instrumentation.md) for Java.
+ [ OpenTelemetry Logging Instrumentation](https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/logging/logging.html) for Python.
+ The [ Pino](https://www.npmjs.com/package/@opentelemetry/instrumentation-pino), [ Winston](https://www.npmjs.com/package/@opentelemetry/instrumentation-winston), or [ Bunyan](https://www.npmjs.com/package/@opentelemetry/instrumentation-bunyan) auto-instrumentations for Node.js.

All of these instrumentations are provided by the OpenTelemetry community. Application Signals uses them to inject trace contexts, such as the trace ID and span ID, into application logs. To enable this, you must manually change your logging configuration to turn on the auto-instrumentation. 

Depending on the architecture that your application runs on, you might have to also set an environment variable to enable trace log correlation, in addition to following the steps in this section.
+ On Amazon EKS, no further steps are needed.
+ On Amazon ECS, no further steps are needed.
+ On Amazon EC2, see step 4 in the procedure in [Step 3: Instrument your application and start it](CloudWatch-Application-Signals-Enable-EC2Main.md#CloudWatch-Application-Signals-Enable-Other-instrument).

After you enable trace log correlation, the correlated log entries automatically appear at the bottom of the trace detail pages in the Application Signals console, as described earlier in this section.

## Trace log correlation setup examples
<a name="Application-Signals-TraceLogCorrelation-Examples"></a>

This section contains examples of setting up trace log correlation in several environments.

**Spring Boot for Java**

Suppose you have a Spring Boot application in a folder called `custom-app`. The application configuration is usually a YAML file named `custom-app/src/main/resources/application.yml` that might look like this: 

```
spring:
  application:
    name: custom-app
  config:
    import: optional:configserver:${CONFIG_SERVER_URL:http://localhost:8888/}
    
...
```

To enable trace log correlation, add the following logging configuration.

```
spring:
  application:
    name: custom-app
  config:
    import: optional:configserver:${CONFIG_SERVER_URL:http://localhost:8888/}
    
...    

logging:
  pattern:
    level: trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p
```
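
With this pattern in place, the level portion of each log line carries the trace context, so a log line might contain a fragment similar to the following (the IDs shown here are hypothetical); the rest of the line follows Spring Boot's usual log format.

```
trace_id=651d682f2d7a7361b253c313206ae0f2 span_id=3c2f2e0a4d8e1b5a trace_flags=01  INFO
```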

**Logback for Java**

In the logging configuration (such as logback.xml), insert the trace context `trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p` into the `pattern` element of the encoder. For example, the following configuration prepends the trace context to the log message.

```
<appender name="FILE" class="ch.qos.logback.core.FileAppender">
  <file>app.log</file>
  <append>true</append>
  <encoder> 
    <pattern>trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p - %m%n</pattern> 
  </encoder>
</appender>
```

For more information about encoders in Logback, see [ Encoders](https://logback.qos.ch/manual/encoders.html) in the Logback documentation.

**Log4j2 for Java**

In the logging configuration (such as log4j2.xml), insert the trace context `trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p` into the `PatternLayout`. For example, the following configuration prepends the trace context to the log message.

```
<Appenders>
  <File name="FILE" fileName="app.log">
    <PatternLayout pattern="trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p - %m%n"/>
  </File>
</Appenders>
```

For more information about pattern layouts in Log4j2, see [ Pattern Layout](https://logging.apache.org/log4j/2.x/manual/layouts.html#Pattern_Layout) in the Log4j2 documentation.

**Log4j for Java**

In the logging configuration (such as log4j.xml), insert the trace context `trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p` into the `PatternLayout`. For example, the following configuration prepends the trace context to the log message.

```
<appender name="FILE" class="org.apache.log4j.FileAppender">;
  <param name="File" value="app.log"/>;
  <param name="Append" value="true"/>;
  <layout class="org.apache.log4j.PatternLayout">;
    <param name="ConversionPattern" value="trace_id=%mdc{trace_id} span_id=%mdc{span_id} trace_flags=%mdc{trace_flags} %5p - %m%n"/>;
  </layout>;
</appender>;
```

For more information about pattern layouts in Log4j, see [ Class Pattern Layout](https://logging.apache.org/log4j/1.x/apidocs/org/apache/log4j/PatternLayout.html) in the Log4j documentation.

**Python**

Set the environment variable `OTEL_PYTHON_LOG_CORRELATION` to `true` while running your application. For more information, see [ Enable trace context injection](https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/logging/logging.html#enable-trace-context-injection) in the Python OpenTelemetry documentation.
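
For example, a sketch of launching a Python application with the variable set (`app.py` is a placeholder for your own entry point):

```
# Sketch: enable trace context injection for Python logging
export OTEL_PYTHON_LOG_CORRELATION=true
python app.py
```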

**Node.js**

For information about enabling trace context injection in Node.js for the logging libraries that support it, see the npm usage documentation for the [ Pino](https://www.npmjs.com/package/@opentelemetry/instrumentation-pino), [ Winston](https://www.npmjs.com/package/@opentelemetry/instrumentation-winston), or [ Bunyan](https://www.npmjs.com/package/@opentelemetry/instrumentation-bunyan) auto-instrumentations for Node.js.
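
As an illustration only, the following is a minimal sketch of registering the Winston instrumentation manually. The exact setup depends on how your OpenTelemetry SDK is initialized, so treat this as an assumption and defer to the npm documentation linked above.

```
// Sketch: manually register the Winston auto-instrumentation.
// Assumes the @opentelemetry/instrumentation and
// @opentelemetry/instrumentation-winston packages are installed.
const { registerInstrumentations } = require('@opentelemetry/instrumentation');
const { WinstonInstrumentation } = require('@opentelemetry/instrumentation-winston');

registerInstrumentations({
  instrumentations: [new WinstonInstrumentation()],
});
```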

# Enable metric to log correlation
<a name="Application-Signals-MetricLogCorrelation"></a>

If you publish application logs to log groups in CloudWatch Logs, you can enable *metric to application log correlation* in Application Signals. With metric log correlation, the Application Signals console automatically displays the relevant log groups associated with a metric.

For example, suppose you notice a spike in a latency graph. You can choose a point on the graph to load the diagnostics information for that point in time. The diagnostics information will show the relevant application log groups that are associated with the current service and metric. Then you can choose a button to run a CloudWatch Logs Insights query on those log groups. Depending on the information contained in the application logs, this might help you to investigate the cause of the latency spike.

Depending on the architecture that your application runs on, you might have to also set an environment variable to enable metric to application log correlation.
+ On Amazon EKS, no further steps are needed.
+ On Amazon ECS, no further steps are needed.
+ On Amazon EC2, see step 4 in the procedure in [Step 3: Instrument your application and start it](CloudWatch-Application-Signals-Enable-EC2Main.md#CloudWatch-Application-Signals-Enable-Other-instrument).

# Manage high-cardinality operations
<a name="Application-Signals-Cardinality"></a>

Application Signals includes settings in the CloudWatch agent that you can use to manage the cardinality of your operations and control metric export to optimize costs. By default, the metric limiting function becomes active when the number of distinct operations for a service over time exceeds the default threshold of 500. You can tune this behavior by adjusting the configuration settings. 

## Determine if metric limiting is activated
<a name="Limiting-Activated"></a>

You can use the following methods to determine whether default metric limiting is occurring. If it is, you should consider optimizing the cardinality control by following the steps in the next section.
+ In the CloudWatch console, choose **Application Signals**, **Services**. If you see an **Operation** named **AllOtherOperations** or a **RemoteOperation** named **AllOtherRemoteOperations**, then metric limiting is happening.
+ If any metrics collected by Application Signals have the value `AllOtherOperations` for their `Operation` dimension, then metric limiting is happening. You can check for this from the command line, as in the sketch after this list.
+ If any metrics collected by Application Signals have the value `AllOtherRemoteOperations` for their `RemoteOperation` dimension, then metric limiting is happening.
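
The following AWS CLI sketch looks for metrics that carry the rolled-up `Operation` value. The `ApplicationSignals` namespace shown here is an assumption; verify the namespace that your Application Signals metrics are published to.

```
# Sketch: list metrics whose Operation dimension was rolled up into
# AllOtherOperations. The "ApplicationSignals" namespace is an assumption;
# adjust it to match where your metrics are published.
aws cloudwatch list-metrics \
    --namespace ApplicationSignals \
    --dimensions Name=Operation,Value=AllOtherOperations
```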

### Optimize cardinality control
<a name="Optimize-Cardinality"></a>

To optimize your cardinality control, you can do the following:
+ Create custom rules to aggregate operations.
+ Configure your metric limiting policy.

#### Create custom rules to aggregate operations
<a name="Optimize-Cardinality-Custom-Rules"></a>

High-cardinality operations can sometimes be caused by inappropriate unique values extracted from the context. For example, sending out HTTP/S requests that include user IDs or session IDs in the path can lead to hundreds of disparate operations. To resolve such issues, we recommend that you configure the CloudWatch agent with customization rules to rewrite these operations.

In cases where numerous different metrics are generated through individual `RemoteOperation` calls, such as `PUT /api/customer/owners/123`, `PUT /api/customer/owners/456`, and similar requests, we recommend that you consolidate them into a single `RemoteOperation`. One approach is to standardize all `RemoteOperation` calls that start with `PUT /api/customer/owners/` to a uniform format, specifically `PUT /api/customer/owners/{ownerId}`. The following example illustrates this. For information about other customization rules, see [Enable CloudWatch Application Signals](CloudWatch-Agent-Application_Signals.md).

```
{
   "logs":{
      "metrics_collected":{
         "application_signals":{
            "rules":[
               {
                  "selectors":[
                     {
                        "dimension":"RemoteOperation",
                        "match":"PUT /api/customer/owners/*"
                     }
                  ],
                  "replacements":[
                     {
                        "target_dimension":"RemoteOperation",
                        "value":"PUT /api/customer/owners/{ownerId}"
                     }
                  ],
                  "action":"replace"
               }
            ]
         }
      }
   }
}
```

In other cases, high-cardinality metrics might have been aggregated into `AllOtherRemoteOperations`, and it might be unclear which specific metrics are included. The CloudWatch agent can log the dropped operations. To identify dropped operations, use the configuration in the following example to activate logging until the problem resurfaces. Then inspect the CloudWatch agent logs (accessible through the container's `stdout` or the EC2 log files) and search for the keyword `drop metric data`.

```
{
  "agent": {
    "config": {
      "agent": {
        "debug": true
      },
      "traces": {
        "traces_collected": {
          "application_signals": {
          }
        }
      },
      "logs": {
        "metrics_collected": {
          "application_signals": {
            "limiter": {
              "log_dropped_metrics": true
            }
          }
        }
      }
    }
  }
}
```

#### Configure your metric limiting policy
<a name="Optimize-Cardinality-Metric-Limiting"></a>

If the default metric limiting configuration doesn’t address the cardinality for your service, you can customize the metric limiter configuration. To do this, add a `limiter` section under the `logs/metrics_collected/application_signals` section in the CloudWatch agent configuration file.

The following example lowers the metric limiting threshold from the default of 500 distinct operations to 100.

```
{
  "logs": {
    "metrics_collected": {
      "application_signals": {
        "limiter": {
          "drop_threshold": 100
        }
      }
    }
  }
}
```