# Monitoring AWS Elemental MediaTailor with Amazon CloudWatch metrics
<a name="monitoring-cloudwatch-metrics"></a>

You can monitor AWS Elemental MediaTailor metrics using CloudWatch. CloudWatch collects raw data about the performance of the service and processes that data into readable, near real-time metrics. These statistics are kept for 15 months, so that you can access historical information and gain a better perspective on how your web application or service is performing. You can also set alarms that watch for certain thresholds, and send notifications or take actions when those thresholds are met. For more information, see the [Amazon CloudWatch User Guide](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/).

Metrics can be useful when you investigate stale manifests. For more information, see [Using metrics to diagnose stale manifests](stale-manifest-diagnose.md).

Metrics are grouped first by the service namespace, and then by the various dimension combinations within each namespace.

**To view metrics using the CloudWatch console**

1. Open the CloudWatch console at [https://console.aws.amazon.com/cloudwatch/](https://console.aws.amazon.com/cloudwatch/).

1. In the navigation pane, choose **Metrics**.

1. Under **All metrics**, choose the **MediaTailor** namespace. 

1. Select the metric dimension to view the metrics (for example, **originID**).

1. Specify the time period that you want to view. 

**To view metrics using the AWS Command Line Interface (AWS CLI)**
+ At a command prompt, use the following command:

  ```
  aws cloudwatch list-metrics --namespace "AWS/MediaTailor"
  ```
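
After you know which metrics exist, you can retrieve data points for one of them with `get-metric-statistics`. The following is a minimal sketch, not a definitive procedure: `mt_metric_stats` is a hypothetical helper name, the five-minute period and the chosen statistics are illustrative, and the `ConfigurationName` dimension key is an assumption that you should confirm against your own `list-metrics` output.

```shell
# Sketch: wrap `aws cloudwatch get-metric-statistics` for one MediaTailor metric.
# mt_metric_stats is a hypothetical helper; the ConfigurationName dimension key
# is an assumption. Check the Dimensions field in your `list-metrics` output.
mt_metric_stats() {
  # $1 = configuration name to filter on
  # $2 = metric name, for example GetManifest.Latency
  # $3 = start time (ISO 8601, for example 2024-01-01T00:00:00Z)
  # $4 = end time (ISO 8601)
  aws cloudwatch get-metric-statistics \
    --namespace "AWS/MediaTailor" \
    --metric-name "$2" \
    --dimensions "Name=ConfigurationName,Value=$1" \
    --start-time "$3" \
    --end-time "$4" \
    --period 300 \
    --statistics Average Maximum
}
```

For example, `mt_metric_stats my-config GetManifest.Latency 2024-01-01T00:00:00Z 2024-01-01T01:00:00Z` would request one hour of data in five-minute buckets, where `my-config` is a placeholder configuration name.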

## AWS Elemental MediaTailor CloudWatch metrics
<a name="metrics"></a>

The AWS Elemental MediaTailor namespace includes the following metrics. These metrics are published by default to your account. 

### Channel Assembly (CA) metrics
<a name="metrics.channel-assembly"></a>

In the following table, all metrics are available by channel or by channel output.


| Metric | Description | 
| --- | --- | 
|  4xxErrorCount  |  The number of `4xx` errors.  | 
|  5xxErrorCount  |  The number of `5xx` errors.  | 
|  RequestCount  |  The total number of requests. The transaction count depends largely on how often players request updated manifests, and the number of players. Each player request counts as a transaction.  | 
|  TotalTime  |  The amount of time that the application server took to process the request, including the time used to receive bytes from and write bytes to the client and network.   | 

### Server-side Ad-insertion (SSAI) metrics
<a name="metrics.server-side-ad-insertion"></a>

The following table lists server-side ad-insertion metrics.


| Metric | Description | 
| --- | --- | 
|  AdDecisionServer.Ads  |  The count of ads included in ad decision server (ADS) responses within the CloudWatch time period that you specified.  | 
|  AdDecisionServer.Duration  |  The total duration, in milliseconds, of all ads that MediaTailor received from the ADS within the CloudWatch time period that you specified. This duration can be greater than the `Avail.Duration` that you specified.  | 
|  AdDecisionServer.Errors  |  The number of non-HTTP 200 status code responses, empty responses, and timed-out responses that MediaTailor received from the ADS within the CloudWatch time period that you specified.  | 
|  AdDecisionServer.FillRate  |  The simple average of the rates at which the responses from the ADS filled the corresponding individual ad avails for the time period that you specified. To get the weighted average, calculate the `AdDecisionServer.Duration` as a percentage of the `Avail.Duration`. For more information about simple and weighted averages, see [Simple and weighted averages](#metrics-simple-average).  | 
|  AdDecisionServer.Latency  |  The response time in milliseconds for requests made by MediaTailor to the ADS.  | 
|  AdDecisionServer.Timeouts  |  The number of timed-out requests to the ADS in the CloudWatch time period that you specified.  | 
|  AdNotReady  |  The number of times that the ADS pointed at an ad that wasn't yet transcoded by the internal transcoder service in the time period that you specified. A high value for this metric might contribute to a low overall `Avail.FillRate`.  | 
|  AdsBilled  |  The number of ads for which MediaTailor bills customers based on insertion.  | 
|  Avail.Duration  |  The planned total number of milliseconds of ad avails within the CloudWatch time period. The planned total is based on the ad avail durations in the origin manifest.  | 
|  Avail.FilledDuration  |  The planned number of milliseconds of ad avail time that MediaTailor will fill with ads within the CloudWatch time period.  | 
|  Avail.FillRate  |  The planned simple average of the rates at which MediaTailor will fill individual ad avails within the CloudWatch time period. To get the weighted average, calculate the `Avail.FilledDuration` as a percentage of the `Avail.Duration`. For more information about simple and weighted averages, see [Simple and weighted averages](#metrics-simple-average). The maximum `Avail.FillRate` that MediaTailor can attain is bounded by the `AdDecisionServer.FillRate`. If the `Avail.FillRate` is low, compare it to the `AdDecisionServer.FillRate`. If the `AdDecisionServer.FillRate` is low, your ADS might not be returning enough ads for the avail durations.   | 
|  Avail.Impression  |  The number of ads with impression tracking events that MediaTailor sees during server-side beaconing (not the number of impressions).  | 
|  Avail.ObservedDuration  |  The observed total number of milliseconds of ad avails that occurred within the CloudWatch time period. `Avail.ObservedDuration` is emitted at the end of the ad avail, and is based on the duration of the segments reported in the manifest during the ad avail.  | 
|  Avail.ObservedFilledDuration  |  The observed number of milliseconds of ad avail time that MediaTailor filled with ads within the CloudWatch time period.  | 
|  Avail.ObservedFillRate  |  The observed simple average of the rates at which MediaTailor filled individual ad avails within the CloudWatch time period. Emitted only for HLS manifests, at the first `CUE-IN` tag. If there is no `CUE-IN` tag, MediaTailor doesn't emit this metric.   | 
|  Avail.ObservedSlateDuration  |  The observed total number of milliseconds of slate that was inserted within the CloudWatch period.  | 
|  GetManifest.Age  |  The total age of the manifest in milliseconds. Measured from when the origin creates the manifest, to when MediaTailor sends the personalized manifest.  For more information about metrics for measuring manifest age, see [Using metrics to diagnose stale manifests](stale-manifest-diagnose.md).  | 
|  GetManifest.Errors  |  The number of errors received while MediaTailor was generating manifests in the CloudWatch time period that you specified.  | 
|  GetManifest.Latency  |  The MediaTailor response time in milliseconds for the request to generate manifests. For more information about metrics for measuring manifest age, see [Using metrics to diagnose stale manifests](stale-manifest-diagnose.md).  | 
|  GetManifest.MediaTailorAge  |  The amount of time that the manifest has been stored in MediaTailor in milliseconds. Measured from when MediaTailor receives an origin response, to when MediaTailor sends the personalized manifest.  For more information about metrics for measuring manifest age, see [Using metrics to diagnose stale manifests](stale-manifest-diagnose.md).  | 
|  Origin.Age  |  The amount of time that the origin has the manifest in milliseconds. Measured from when the origin creates the manifest, to when MediaTailor sends the origin request.  All `Origin.*` metrics are emitted for requests that are fulfilled directly from the origin. They are not emitted for cached origin responses. For more information about metrics for measuring manifest age, see [Using metrics to diagnose stale manifests](stale-manifest-diagnose.md).  | 
|  Origin.Errors  |  The number of non-HTTP 200 status code responses and timed-out responses that MediaTailor received from the origin server in the CloudWatch time period that you specified. All `Origin.*` metrics are emitted for requests that are fulfilled directly from the origin. They are not emitted for cached origin responses.  | 
|  Origin.ManifestFileSizeBytes  |  The file size of the origin manifest in bytes for both HLS and DASH. Typically this metric is used in conjunction with `Origin.ManifestFileSizeTooLarge`. All `Origin.*` metrics are emitted for requests that are fulfilled directly from the origin. They are not emitted for cached origin responses.  | 
|  Origin.ManifestFileSizeTooLarge  |  The number of responses from the origin that have a manifest size larger than the configured amount. Typically this metric is used in conjunction with `Origin.ManifestFileSizeBytes`. All `Origin.*` metrics are emitted for requests that are fulfilled directly from the origin. They are not emitted for cached origin responses.  | 
|  Origin.Timeouts  |  The number of timed-out requests to the origin server in the CloudWatch time period that you specified. All `Origin.*` metrics are emitted for requests that are fulfilled directly from the origin. They are not emitted for cached origin responses.  | 
|  Requests  |  The number of concurrent transactions per second across all request types. The transaction count depends mainly on the number of players and how often the players request updated manifests. Each player request counts as a transaction.  | 
|  SkippedReason.DurationExceeded  |  The number of ads that were not inserted into an avail because the ADS returned a duration of ads that was greater than the specified avail duration. A high value for this metric might contribute to a discrepancy between the `Avail.Ads` and `AdDecisionServer.Ads` metric. For more information about ad skipped reasons, see [Ad skipping troubleshooting](troubleshooting-ad-skipping-overview.md).  | 
|  SkippedReason.EarlyCueIn  |  The number of ads skipped due to an early `CUE-IN`.  | 
|  SkippedReason.ImportError  |  The number of ads skipped due to an error in the import job.  | 
|  SkippedReason.ImportInProgress  |  The number of ads skipped due to an existing active import job.  | 
|  SkippedReason.InternalError  |  The number of ads skipped due to a MediaTailor internal error.  | 
|  SkippedReason.NewCreative  |  The number of ads that were not inserted into an avail because it was the first time the asset had been requested by a client. A high value for this metric might temporarily contribute to a low overall `Avail.FillRate`, until assets can be successfully transcoded.  | 
|  SkippedReason.NoVariantMatch  |  The number of ads skipped due to there being no variant match between the ad and content.  | 
|  SkippedReason.PersonalizationThresholdExceeded  |  The duration of ads exceeding the **Personalization Threshold** setting in this configuration.  | 
|  SkippedReason.ProfileNotFound  |  The number of ads skipped due to the transcoding profile not being found.  | 
|  SkippedReason.TranscodeError  |  The number of ads skipped due to a transcode error.  | 
|  SkippedReason.TranscodeInProgress  |  The number of ads that were not inserted into an avail because the ad had not yet been transcoded. A high value for this metric might temporarily contribute to a low overall `Avail.FillRate`, until the assets can be successfully transcoded.  | 
|  GetAssets.Requests  |  The number of Asset List requests received for HLS Interstitials sessions within the CloudWatch time period. Use this metric to monitor late-binding ad decisioning volume and understand the scale of HLS Interstitials usage.  | 
|  GetAssets.Latency  |  The response time for Asset List requests in milliseconds for HLS Interstitials sessions. Monitor this metric to ensure optimal ad decisioning performance and identify potential bottlenecks in the late-binding workflow.  | 

**Note**  
For HLS Interstitials sessions, some metrics behave differently due to the late-binding nature of ad decisioning:
+ `Avail.ObservedFilledDuration` matches `Avail.FilledDuration` since MediaTailor cannot observe actual client-side playback behavior.
+ `Avail.ObservedSlateDuration` reports planned slate duration from Asset List responses rather than observed playback.
+ Metrics prefixed with "Observed" provide estimated values for HLS Interstitials sessions.

### Simple and weighted averages
<a name="metrics-simple-average"></a>

You can retrieve the simple average and the weighted average for the responses from the ADS to ad requests from MediaTailor and for how MediaTailor fills ad avails: 
+ The *simple averages* are provided in the `AdDecisionServer.FillRate` and the `Avail.FillRate`. These are the averages of the fill rate percentages for the individual avails for the time period. The simple averages don't take into account any differences between the durations of the individual avails.
+ The *weighted averages* are the fill rate percentages for the sum of all avail durations. These are calculated as (`AdDecisionServer.Duration` \* 100) / `Avail.Duration` and (`Avail.FilledDuration` \* 100) / `Avail.Duration`. These averages reflect the differences in duration of each ad avail, giving more weight to those with longer duration. 

For a time period that contains just a single ad avail, the simple average provided by the `AdDecisionServer.FillRate` is equal to the weighted average provided by (`AdDecisionServer.Duration` \* 100) / `Avail.Duration`. The simple average provided by the `Avail.FillRate` is equal to the weighted average provided by (`Avail.FilledDuration` \* 100) / `Avail.Duration`. 

**Example**

Assume the time period that you specified has the following two ad avails:
+ The first ad avail has 90 seconds duration:
  + The ADS response for the avail provides 45 seconds of ads (50% filled). 
  + MediaTailor fills 45 seconds worth of the ad time available (50% filled).
+ The second ad avail has 120 seconds duration: 
  + The ADS response for the avail provides 120 seconds of ads (100% filled). 
  + MediaTailor fills 90 seconds worth of the ad time available (75% filled).

The metrics are as follows: 
+ `Avail.Duration` is 210, the sum of the two ad avail durations: 90 + 120.
+ `AdDecisionServer.Duration` is 165, the sum of the two response durations: 45 + 120.
+ `Avail.FilledDuration` is 135, the sum of the two filled durations: 45 + 90. 
+ `AdDecisionServer.FillRate` is 75%, the average of the percentages filled for each avail: (50% + 100%) / 2. This is the simple average.
+ The weighted average for the ADS fill rates is 78.57%, which is `AdDecisionServer.Duration` as a percentage of the `Avail.Duration`: (165 \* 100) / 210. This calculation accounts for the differences in the durations. 
+ `Avail.FillRate` is 62.5%, the average of the filled percentages for each avail: (50% + 75%) / 2. This is the simple average.
+ The weighted average for the MediaTailor avail fill rates is 64.29%, which is the `Avail.FilledDuration` as a percentage of the `Avail.Duration`: (135 \* 100) / 210. This calculation accounts for the differences in the durations. 

The highest `Avail.FillRate` that MediaTailor can attain for any ad avail is 100%. The ADS might return more ad time than is available in the avail, but MediaTailor can only fill the time available. 
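
The arithmetic in the preceding example can be reproduced with a short script. This is an illustrative sketch only: `fill_rates` is a hypothetical helper, not part of MediaTailor or the AWS CLI, and it reads one avail per line as two numbers (the avail duration, then the filled or returned duration) in the same units.

```shell
# Sketch: compute the simple and weighted fill-rate averages for a set of avails.
# fill_rates is a hypothetical helper. Input: one avail per line,
# "<avail duration> <filled duration>", both in the same units.
fill_rates() {
  awk '
    $1 > 0 {
      avail    += $1              # running total of avail durations
      filled   += $2              # running total of filled durations
      rate_sum += 100 * $2 / $1   # per-avail fill rate, as a percentage
      n++
    }
    END {
      if (n == 0) { print "no avails"; exit }
      # Simple average: mean of the per-avail rates.
      # Weighted average: total filled time as a percentage of total avail time.
      printf "simple=%.2f weighted=%.2f\n", rate_sum / n, 100 * filled / avail
    }'
}

# ADS responses from the example: 45 of 90 seconds, then 120 of 120 seconds.
printf '90 45\n120 120\n' | fill_rates   # prints simple=75.00 weighted=78.57

# MediaTailor fills from the example: 45 of 90 seconds, then 90 of 120 seconds.
printf '90 45\n120 90\n' | fill_rates    # prints simple=62.50 weighted=64.29
```

The two results differ because the weighted average gives the longer 120-second avail more influence than the 90-second avail.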

## AWS Elemental MediaTailor CloudWatch dimensions
<a name="dimensions"></a>

You can filter the AWS Elemental MediaTailor data using the following dimension.


| Dimension | Description | 
| --- | --- | 
|  `Configuration Name`  |  Indicates the configuration that the metric belongs to.  | 
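
As a rough illustration of filtering by this dimension from the AWS CLI, the following sketch lists the metric names reported for a single configuration. `mt_list_config_metrics` is a hypothetical helper name, and the dimension key `ConfigurationName` (written without the space shown in the table) is an assumption; run `list-metrics` without the `--dimensions` option first to see the exact key in your account.

```shell
# Sketch: list the MediaTailor metric names that carry data for one configuration.
# mt_list_config_metrics is a hypothetical helper; the ConfigurationName
# dimension key is an assumption to verify in your own account.
mt_list_config_metrics() {
  # $1 = configuration name to filter on
  aws cloudwatch list-metrics \
    --namespace "AWS/MediaTailor" \
    --dimensions "Name=ConfigurationName,Value=$1" \
    --query 'Metrics[].MetricName' \
    --output text
}
```

Calling `mt_list_config_metrics my-config`, where `my-config` is a placeholder, would print only the metric names rather than the full JSON response.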

# Using metrics to diagnose stale manifests from AWS Elemental MediaTailor
<a name="stale-manifest-diagnose"></a>

A stale manifest is one that hasn't been recently updated. Different ad insertion workflows could have varying tolerance to how long must pass before a manifest is considered stale, based on a variety of factors (such as requirements for downstream systems). You can use Amazon CloudWatch metrics to identify manifests that exceed the staleness tolerance for your workflow, and help identify what could be causing the delays in manifest updates. 

The following metrics help identify stale manifests and their causes.

For information about all metrics that MediaTailor emits, see [AWS Elemental MediaTailor CloudWatch metrics](monitoring-cloudwatch-metrics.md#metrics).


| Metric | Definition | Use | 
| --- | --- | --- | 
| GetManifest.Age |  Measures the total age of the manifest, including both `GetManifest.MediaTailorAge` and `Origin.Age` for this configuration.   |  You can use this metric to identify manifests that are past your update threshold and are stale.  Set alarms on this metric so that you're alerted when stale manifests are being served. For information about alarms, see [Alarming on metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ConsoleAlarms.html) in the *Amazon CloudWatch User Guide.* When you receive an alarm, use `Origin.Age` and `GetManifest.MediaTailorAge` to identify if MediaTailor or the origin is causing the staleness.   | 
| Origin.Age | Measures how long the origin has the manifest before sending it to MediaTailor for this configuration. This metric is not emitted when the response comes from a content delivery network (CDN). The response must come from the origin for `Origin.Age` to be emitted.  |  When you identify stale manifests with `GetManifest.Age`, you can analyze the `Origin.Age` metric and the `GetManifest.MediaTailorAge` metric to determine which is contributing to manifest staleness.  If you find that `Origin.Age` is longer than your typical processing times at the origin, it's likely that the upstream system is causing the issue and you should focus diagnostics there.   | 
| GetManifest.MediaTailorAge | Measures how long MediaTailor has stored this manifest for this configuration. |  When you identify stale manifests with `GetManifest.Age`, you can analyze the `GetManifest.MediaTailorAge` metric and the `Origin.Age` metric to determine which is contributing to manifest staleness.  If the `GetManifest.MediaTailorAge` is longer than your typical manifest personalization times in MediaTailor, it's likely that MediaTailor is causing the issue and you should focus diagnostics there.  `GetManifest.Latency` can further identify how long it takes for MediaTailor to create a personalized manifest.  | 
| GetManifest.Latency | Measures the amount of time it takes for MediaTailor to process the request and create a personalized manifest for this configuration.  |  When you compare `Origin.Age` and `GetManifest.MediaTailorAge` and determine that MediaTailor is the cause of delayed manifest delivery, you can analyze the `GetManifest.Latency` metric to determine if the manifest personalization process is contributing to manifest staleness.  `GetManifest.MediaTailorAge` measures the total time that the manifest is stored in MediaTailor. `GetManifest.Latency` measures how much of that storage time is MediaTailor personalizing the manifest in response to a request.  |
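
The alarm on `GetManifest.Age` described in the table could be created from the AWS CLI along the following lines. Treat this as a sketch under stated assumptions: `mt_stale_manifest_alarm` is a hypothetical helper, the `ConfigurationName` dimension key is an assumption, and the statistic, period, evaluation periods, and missing-data handling are illustrative values that you should tune to your own staleness tolerance.

```shell
# Sketch: alarm when the manifest age for one configuration stays above a threshold.
# mt_stale_manifest_alarm is a hypothetical helper; the dimension key and the
# period, statistic, and evaluation settings are assumptions to adjust.
mt_stale_manifest_alarm() {
  # $1 = configuration name
  # $2 = staleness threshold in milliseconds (GetManifest.Age is in milliseconds)
  # $3 = ARN of the action to run, for example an Amazon SNS topic
  aws cloudwatch put-metric-alarm \
    --alarm-name "mediatailor-stale-manifest-$1" \
    --namespace "AWS/MediaTailor" \
    --metric-name "GetManifest.Age" \
    --dimensions "Name=ConfigurationName,Value=$1" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 3 \
    --threshold "$2" \
    --comparison-operator GreaterThanThreshold \
    --treat-missing-data notBreaching \
    --alarm-actions "$3"
}
```

With these settings the alarm would fire only after the maximum manifest age exceeds the threshold for three consecutive one-minute periods, which avoids alerting on a single slow manifest. When it fires, compare `Origin.Age` and `GetManifest.MediaTailorAge` as described in the table.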