

# Configuring the CloudWatch agent for EC2 instances and on-premises servers
<a name="configure-cloudwatch-ec2-on-premises"></a>

Many organizations run workloads on both physical servers and virtual machines (VMs). These workloads typically run on different OSs that each have unique installation and configuration requirements for capturing and ingesting metrics. 

If you choose to use EC2 instances, you can have a high level of control over your instance and OS configuration. However, this higher level of control and responsibility requires you to monitor and adjust configurations to achieve more efficient usage. You can improve your operational effectiveness by establishing standards for logging and monitoring, and applying a standard installation and configuration approach for capturing and ingesting logs and metrics. 

Organizations that migrate or extend their IT investments to the AWS Cloud can use CloudWatch to achieve a unified logging and monitoring solution. CloudWatch pricing is usage-based, so you pay incrementally for the metrics and logs that you choose to capture. You can also capture logs and metrics for on-premises servers by using a CloudWatch agent installation process that is similar to the one for Amazon EC2. 

Before you begin installing and deploying CloudWatch, make sure that you evaluate the logging and metric configurations for your systems and applications. Ensure that you define the standard logs and metrics that you need to capture for the OSs that you want to use. System logs and metrics are the foundation and standard for a logging and monitoring solution because they are generated by the OS and are different for Linux and Windows. There are important metrics and log files available across Linux distributions, in addition to those that are specific to a Linux version or distribution. This variance also occurs between different Windows versions.

## Configuring the CloudWatch agent
<a name="configure-cloudwatch-agent-ec2"></a>

CloudWatch captures metrics and logs for Amazon EC2 and on-premises servers by using [CloudWatch agents and agent configuration files](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html) that are specific to each OS. We recommend that you define your organization's standard metric and log capture configuration before you begin installing the CloudWatch agent at scale in your accounts. 

You can combine multiple CloudWatch agent configurations to form a composite CloudWatch agent configuration. One recommended approach is to define and divide configurations for your logs and metrics at the system and application level. The following diagram illustrates how multiple CloudWatch configuration file types for different requirements can be combined to form a composite CloudWatch configuration:

![\[Configurations for different requirements are combined to form a composite CloudWatch configuration.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/images/logging-monitoring-image-1.png)


These logs and metrics can also be further classified and configured for specific environments or requirements. For example, you could define a smaller subset of logs and metrics with lower precision for unregulated development environments, and a larger, more complete set with higher precision for regulated production environments.

## Configuring log capture for EC2 instances
<a name="log-capture-configuration-ec2"></a>

By default, Amazon EC2 doesn't monitor or capture log files. Instead, log files are captured and ingested into CloudWatch Logs by the CloudWatch agent installed on your EC2 instance, or through the AWS API or AWS Command Line Interface (AWS CLI). We recommend using the CloudWatch agent to ingest log files into CloudWatch Logs for Amazon EC2 and on-premises servers. 

You can search and filter logs, extract metrics, and run automation based on pattern matching against log files in CloudWatch. CloudWatch supports plaintext, space-delimited, and JSON-formatted filter and pattern syntax options, with JSON-formatted logs providing the most flexibility. To increase the filtering and analysis options, you should use a formatted log output instead of plain text.

The CloudWatch agent uses a configuration file that defines the logs and metrics to send to CloudWatch. CloudWatch then captures each log file as a [log stream](https://docs.aws.amazon.com//AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html) and groups these log streams into a [log group](https://docs.aws.amazon.com//AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html). This helps you perform operations across logs from your EC2 instances, such as searching for a matching string.

The default log stream name is the same as the EC2 instance ID and the default log group name is the same as the log file path. The log stream's name must be unique within the CloudWatch log group. You can use `instance_id`, `hostname`, `local_hostname`, or `ip_address` for dynamic substitution in the log stream and log group names, which means that you can use the same CloudWatch agent configuration file across multiple EC2 instances. 

The following diagram shows a CloudWatch agent configuration for capturing logs. The log group is defined by the captured log files and contains separate log streams for each EC2 instance because the `{instance_id}` variable is used for the log stream name and EC2 instance IDs are unique.

![\[A CloudWatch agent configuration for capturing logs.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/images/cloudwatch-image-1.png)
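
The following Python sketch builds a logs section similar to the one in the diagram and prints it as the JSON that the agent reads. The log file `/var/log/messages` is only an illustrative assumption; substitute the files that your standard requires.

```python
import json

# Illustrative log file (an assumption); replace with your own standard files.
LOG_FILE = "/var/log/messages"


def build_log_config(file_path: str) -> dict:
    """Return a CloudWatch agent 'logs' section for a single log file."""
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": file_path,
                            # Log group named after the log file path.
                            "log_group_name": file_path,
                            # The agent substitutes the instance ID at run time,
                            # so the same file can be reused on every instance.
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        }
    }


config = build_log_config(LOG_FILE)
print(json.dumps(config, indent=2))
```

Because the log stream name is resolved per instance, every EC2 instance that uses this file writes to its own stream inside the shared log group.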


Log groups define the retention, tags, security, metric filters, and search scope for the log streams that they contain. The default grouping behavior based on the log file name helps you search, create metrics, and alarm on data that is specific to a log file across EC2 instances in an account and Region. You should evaluate whether further log group refinement is required. For example, your account might be shared by multiple business units and have different technical or operations owners. This means that you must further refine the log group name to reflect the separation and ownership. This approach allows you to concentrate your analysis and troubleshooting on the relevant EC2 instance. 

If multiple environments use one account, you can separate the logging for workloads that run in each environment. The following table shows a log group naming convention that includes the business unit, project or application, and environment.


| Item | Naming convention |
| --- | --- |
| Log group name | `/<Business unit>/<Project or application name>/<Environment>/<Log file name>` |
| Log stream name | `<EC2 instance ID>` |

You can also group all log files for an EC2 instance into the same log group. This makes it easier to search and analyze across a set of log files for a single EC2 instance. This is useful if most of your EC2 instances service one application or workload and each EC2 instance serves a specific purpose. The following table shows how your log group and log stream naming could be formatted to support this approach.


| Item | Naming convention |
| --- | --- |
| Log group name | `/<Business unit>/<Project or application name>/<Environment>/<EC2 instance ID>` |
| Log stream name | `<Log file name>` |
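
The two naming conventions can be sketched as one small helper. The function name `log_names` and the sample values (`finance`, `billing`, `prod`, `messages`) are hypothetical; the `{instance_id}` placeholder is the dynamic substitution described earlier in this section.

```python
def log_names(business_unit, application, environment, log_file, per_instance=False):
    """Return (log_group_name, log_stream_name) for either naming convention."""
    prefix = f"/{business_unit}/{application}/{environment}"
    if per_instance:
        # One log group per EC2 instance, one log stream per log file.
        return f"{prefix}/{{instance_id}}", log_file
    # One log group per log file, one log stream per EC2 instance.
    return f"{prefix}/{log_file}", "{instance_id}"


by_file = log_names("finance", "billing", "prod", "messages")
by_instance = log_names("finance", "billing", "prod", "messages", per_instance=True)
print(by_file)
print(by_instance)
```

Keeping the convention in one function makes it easier to generate consistent names across many configuration files.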

## Configuring metrics capture for EC2 instances
<a name="metrics-configuration-ec2"></a>

By default, your EC2 instances are enabled for basic monitoring, and a [standard set of metrics](https://docs.aws.amazon.com//AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html) (for example, CPU, network, or storage-related metrics) is automatically sent to CloudWatch every five minutes. CloudWatch metrics can vary depending on the instance family. For example, [burstable performance instances](https://docs.aws.amazon.com//AWSEC2/latest/UserGuide/burstable-performance-instances.html) have metrics for CPU credits. Amazon EC2 standard metrics are included in your instance price. If you enable [detailed monitoring](https://docs.aws.amazon.com//AWSEC2/latest/UserGuide/using-cloudwatch-new.html) for your EC2 instances, you can receive data in one-minute periods. The period frequency impacts your CloudWatch costs, so make sure that you evaluate whether detailed monitoring is required for all or only some of your EC2 instances. For example, you could enable detailed monitoring for production workloads but use basic monitoring for non-production workloads. 

On-premises servers don't include any default metrics for CloudWatch and must use the CloudWatch agent, AWS CLI, or AWS SDK to capture metrics. This means that you must define the metrics that you want to capture (for example, CPU utilization) in the CloudWatch configuration file. You can create a unique CloudWatch configuration file that includes the standard EC2 instance metrics for your on-premises servers and apply it in addition to your standard CloudWatch configuration.

[Metrics](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/working_with_metrics.html) in CloudWatch are uniquely defined by metric name and zero or more dimensions, and are uniquely grouped in a metric namespace. Metrics provided by an AWS service have a namespace that begins with `AWS` (for example, `AWS/EC2`), and non-AWS metrics are considered custom metrics. Metrics that you configure and capture with the CloudWatch agent are all considered custom metrics. Because the number of created metrics impacts your CloudWatch costs, you should evaluate whether each metric is required for all or only some of your EC2 instances. For example, you could define a complete set of metrics for production workloads but use a smaller subset of these metrics for non-production workloads.

`CWAgent` is the default namespace for metrics published by the CloudWatch agent. Similar to log groups, the metric namespace organizes a set of metrics so that they can be found together in one place. You should modify the namespace to reflect a business unit, project or application, and environment (for example, `/<Business unit>/<Project or application name>/<Environment>`). This approach is useful if multiple unrelated workloads use the same account. You can also correlate your namespace naming convention to your CloudWatch log group naming convention.

Metrics are also identified by their dimensions, which are the properties that observations are recorded against and which help you analyze metrics against a set of conditions. Amazon EC2 includes [separate metrics](https://docs.aws.amazon.com//AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html#ec2-cloudwatch-dimensions) for EC2 instances with the `InstanceId` and `AutoScalingGroupName` dimensions. You also receive metrics with the `ImageId` and `InstanceType` dimensions if you enable detailed monitoring. For example, Amazon EC2 provides a separate CPU utilization metric for the `InstanceId` dimension, in addition to a separate CPU utilization metric for the `InstanceType` dimension. This helps you analyze CPU utilization for each unique EC2 instance, in addition to all EC2 instances of a specific [instance type](https://docs.aws.amazon.com//AWSEC2/latest/UserGuide/instance-types.html). 

Adding more dimensions increases your analysis capability but also increases your overall costs, because each metric and unique dimension value combination results in a new metric. For example, if you create a metric for the memory utilization percentage against the `InstanceId` dimension, then this is a new metric for each EC2 instance. If your organization runs thousands of EC2 instances, this causes thousands of metrics and results in higher costs. To control and predict costs, make sure that you determine the metric's cardinality and which dimensions add the most value. For example, you could define a complete set of dimensions for your production workload metrics but a smaller subset of these dimensions for non-production workloads.
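
To make the cardinality effect concrete, the following sketch counts how many distinct metrics a dimension choice produces. It is only an estimate under stated assumptions: the fleet of 1,000 instances, the instance IDs, and the two instance types are made up, and `count_custom_metrics` is a hypothetical helper rather than a CloudWatch API.

```python
def count_custom_metrics(metric_names, dimension_sets, instances):
    """Count unique (metric name, dimension name/value pairs) combinations."""
    unique = set()
    for instance in instances:
        for dims in dimension_sets:
            # A metric is identified by its name plus its dimension values.
            key = tuple((d, instance[d]) for d in dims)
            for name in metric_names:
                unique.add((name, key))
    return len(unique)


# Hypothetical fleet: 1,000 instances split across two instance types.
fleet = [
    {"InstanceId": f"i-{n:04d}", "InstanceType": "t3.micro" if n % 2 else "m5.large"}
    for n in range(1000)
]

per_instance = count_custom_metrics(["mem_used_percent"], [("InstanceId",)], fleet)
with_type_rollup = count_custom_metrics(
    ["mem_used_percent"], [("InstanceId",), ("InstanceType",)], fleet
)
print(per_instance, with_type_rollup)  # prints "1000 1002"
```

One metric recorded per instance already yields 1,000 metrics, whereas an additional low-cardinality dimension such as the instance type adds only two more. Running this kind of estimate before you roll out a configuration helps you see which dimensions drive cost.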

You can use the `append_dimensions` property to add dimensions to one or all metrics defined in your CloudWatch configuration. You can also dynamically append the `ImageId`, `InstanceId`, `InstanceType`, and `AutoScalingGroupName` to all metrics in your CloudWatch configuration. Alternatively, you can append an arbitrary dimension name and value for specific metrics by using the `append_dimensions` property on that metric. CloudWatch can also aggregate statistics on metric dimensions that you defined with the `aggregation_dimensions` property. 

For example, you could aggregate the memory used against the `InstanceType` dimension to see the average memory used by all EC2 instances for each instance type. If you use `t2.micro` instances running in a Region, you could determine whether workloads that use the `t2.micro` instance type are overutilizing or underutilizing the memory provided. Underutilization might be a sign that workloads use instance types with more memory than they require. In contrast, overutilization might be a sign that workloads use instance types with insufficient memory.

The following diagram shows a sample CloudWatch metrics configuration that uses a custom namespace, added dimensions, and aggregation by `InstanceType`.

![\[Example CloudWatch metrics configuration with CloudWatch agent.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/images/cloudwatch-image-2.png)
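
A configuration along the lines of the diagram can be sketched as follows. The business unit, application, and environment values are hypothetical, and the memory measurement shown is the Linux agent's `mem_used_percent`; verify key names against the agent configuration schema for your OS before you use them.

```python
import json


def build_metrics_config(business_unit, application, environment):
    """Return a CloudWatch agent 'metrics' section with a custom namespace."""
    return {
        "metrics": {
            # Custom namespace instead of the default CWAgent namespace.
            "namespace": f"/{business_unit}/{application}/{environment}",
            # Dimensions that the agent resolves from instance metadata.
            "append_dimensions": {
                "InstanceId": "${aws:InstanceId}",
                "InstanceType": "${aws:InstanceType}",
            },
            # Roll up the collected metrics by instance type.
            "aggregation_dimensions": [["InstanceType"]],
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
            },
        }
    }


metrics_config = build_metrics_config("finance", "billing", "prod")
print(json.dumps(metrics_config, indent=2))
```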


# System-level CloudWatch configuration
<a name="system-level-cloudwatch-configuration"></a>

System-level metrics and logs are a central component of a monitoring and logging solution, and the CloudWatch agent has specific configuration options for Windows and Linux. 

We recommend that you use the [CloudWatch configuration file wizard](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/create-cloudwatch-agent-configuration-file-wizard.html) or configuration file schema to define the CloudWatch agent configuration file for each OS that you plan to support. Additional workload-specific, OS-level logs and metrics can be defined in separate CloudWatch configuration files and appended to the standard configuration. These unique configuration files should be separately stored in an S3 bucket where they can be retrieved by your EC2 instances. An example of an S3 bucket setup for this purpose is described in the [Managing CloudWatch configurations](create-store-cloudwatch-configurations.md#store-cloudwatch-configuration-s3) section of this guide. You can automatically retrieve and apply these configurations using State Manager and Distributor.

## Configuring system-level logs
<a name="system-level-logs"></a>

System-level logs are essential for diagnosing and troubleshooting issues on premises or in the AWS Cloud. Your log capture approach should include any system and security logs generated by the OS. The OS-generated log files might be different depending on the OS version.

The CloudWatch agent supports monitoring Windows event logs. You identify each event log by its name and can choose which Windows event logs you want to monitor (for example, `System`, `Application`, or `Security`).
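
As a rough sketch, the Windows event log selection could be generated as shown below. The choice of event levels and the use of the event log name as the log group name are assumptions for illustration, not a recommended standard.

```python
import json

# Event levels to capture (an illustrative choice).
DEFAULT_LEVELS = ("INFORMATION", "WARNING", "ERROR", "CRITICAL")


def windows_event_entry(event_name, levels=DEFAULT_LEVELS):
    """Return one 'windows_events' collect_list entry for a named event log."""
    return {
        "event_name": event_name,
        "event_levels": list(levels),
        # Assumption: reuse the event log name as the log group name.
        "log_group_name": event_name,
        "log_stream_name": "{instance_id}",
    }


windows_config = {
    "logs": {
        "logs_collected": {
            "windows_events": {
                "collect_list": [
                    windows_event_entry(name)
                    for name in ("System", "Application", "Security")
                ]
            }
        }
    }
}
print(json.dumps(windows_config, indent=2))
```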

The system, application, and security logs for Linux systems are typically stored in the `/var/log` directory. The following table defines the common default log files that you should monitor, but you should check the `/etc/rsyslog.conf` or `/etc/syslog.conf` file to determine the specific setup for your system's log files.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/system-level-cloudwatch-configuration.html)

Your organization might also have other agents or system components that generate logs you want to monitor. You should evaluate and decide which log files are generated by these agents or applications, and include them in your configuration by identifying their file location. For example, you should include the Systems Manager and CloudWatch agent logs in your configuration. The following table provides the location of these agent logs for Windows and Linux. 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/implementing-logging-monitoring-cloudwatch/system-level-cloudwatch-configuration.html)

CloudWatch ignores a log file if the log file is defined in the CloudWatch agent configuration but isn’t found. This is useful when you want to maintain a single log configuration for Linux, instead of separate configurations for each distribution. It is also useful when a log file doesn’t exist until the agent or software application starts running.
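
Because missing files are ignored, a single configuration can list the log files of several distributions. The sketch below assumes the common syslog and authentication log locations for Red Hat-family and Debian-family systems; confirm the actual paths in your `rsyslog` or `syslog` configuration.

```python
import json

# Candidate files across distributions (assumed common defaults).
CANDIDATE_LOG_FILES = [
    "/var/log/messages",  # Red Hat family system log
    "/var/log/secure",    # Red Hat family authentication log
    "/var/log/syslog",    # Debian family system log
    "/var/log/auth.log",  # Debian family authentication log
]


def linux_log_config(paths):
    """Return one 'logs' section that covers every candidate file."""
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": path,
                            "log_group_name": path,
                            "log_stream_name": "{instance_id}",
                        }
                        for path in paths
                    ]
                }
            }
        }
    }


linux_config = linux_log_config(CANDIDATE_LOG_FILES)
print(json.dumps(linux_config, indent=2))
```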

## Configuring system-level metrics
<a name="system-level-metrics"></a>

Memory and disk space utilization aren't included in standard metrics provided by Amazon EC2. To include these metrics, you must install and configure the CloudWatch agent on your EC2 instances. The CloudWatch agent configuration wizard creates a CloudWatch configuration with [predefined metrics](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/create-cloudwatch-agent-configuration-file-wizard.html#cloudwatch-agent-preset-metrics) and you can add or remove metrics as required. Make sure that you review the predefined metric sets to determine the appropriate level that you require.
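
For illustration, a minimal metrics section that adds memory and disk utilization on Linux might look like the following. The 60-second interval and the wildcard for all mounted file systems are assumptions; the configuration wizard's predefined sets remain the better starting point.

```python
import json


def system_metrics_config(interval_seconds=60):
    """Return a 'metrics' section that collects memory and disk utilization."""
    return {
        "metrics": {
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                },
                "disk": {
                    "measurement": ["used_percent"],
                    # "*" asks the agent to report every mounted file system.
                    "resources": ["*"],
                    "metrics_collection_interval": interval_seconds,
                },
            }
        }
    }


system_metrics = system_metrics_config()
print(json.dumps(system_metrics, indent=2))
```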

End users and workload owners should publish additional system metrics based on specific requirements for a server or EC2 instance. These metric definitions should be stored, versioned, and maintained in a separate CloudWatch agent configuration file, and shared in a central location (for example, Amazon S3) for reuse and automation.

Standard Amazon EC2 metrics are not automatically captured in on-premises servers. These metrics must be defined in a CloudWatch agent configuration file used by the on-premises instances. You can create a separate metric configuration file for on-premises instances with metrics such as CPU utilization, and have these metrics appended to the standard metrics configuration file.

# Application-level CloudWatch configuration
<a name="application-level-configuration"></a>

Application logs and metrics are generated by running applications and are application specific. Make sure that you define the logs and metrics required to adequately monitor applications that are regularly used by your organization. For example, your organization might have standardized on Microsoft Internet Information Services (IIS) for web-based applications. You can create a standard log and metric CloudWatch configuration for IIS that can be used across your organization.

Application-specific configuration files can be stored in a centralized location (for example, an S3 bucket), where workload owners or automated retrieval processes can access them and copy them to the CloudWatch configuration directory. The CloudWatch agent automatically combines the CloudWatch configuration files found in the configuration file directory of each EC2 instance or server into a composite CloudWatch configuration. The end result is a CloudWatch configuration that includes your organization's standard system-level configuration, as well as all relevant application-level CloudWatch configurations. 
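
The idea of a composite configuration can be illustrated conceptually in Python. This is not the agent's merge algorithm, which the agent performs itself; `merge_configs` is a hypothetical helper, and the application log path and StatsD settings are made-up examples.

```python
import json
from copy import deepcopy


def merge_configs(*configs):
    """Conceptual merge: concatenate log files and union collected metrics."""
    merged = {
        "logs": {"logs_collected": {"files": {"collect_list": []}}},
        "metrics": {"metrics_collected": {}},
    }
    for cfg in configs:
        files = cfg.get("logs", {}).get("logs_collected", {}).get("files", {})
        merged["logs"]["logs_collected"]["files"]["collect_list"].extend(
            deepcopy(files.get("collect_list", []))
        )
        merged["metrics"]["metrics_collected"].update(
            deepcopy(cfg.get("metrics", {}).get("metrics_collected", {}))
        )
    return merged


# Organization-wide system-level standard (illustrative).
system_cfg = {
    "logs": {"logs_collected": {"files": {"collect_list": [
        {"file_path": "/var/log/messages", "log_stream_name": "{instance_id}"}
    ]}}},
    "metrics": {"metrics_collected": {"mem": {"measurement": ["mem_used_percent"]}}},
}
# Application-level addition owned by a workload team (hypothetical).
app_cfg = {
    "logs": {"logs_collected": {"files": {"collect_list": [
        {"file_path": "/var/log/my-app/app.log", "log_stream_name": "{instance_id}"}
    ]}}},
    "metrics": {"metrics_collected": {"statsd": {"service_address": ":8125"}}},
}

composite = merge_configs(system_cfg, app_cfg)
print(json.dumps(composite, indent=2))
```

Keeping the system-level and application-level files separate means that each can be versioned and owned independently while the instance still ends up with one effective configuration.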

Workload owners should identify and configure log files and metrics for all critical applications and components. 

## Configuring application-level logs
<a name="application-logs-configuration"></a>

Application-level logging varies depending on whether the application is a commercial off-the-shelf (COTS) application or a custom-developed application. COTS applications and their components might provide several options for log configuration and output, such as log detail level, log file format, and log file location. However, most COTS or third-party applications don’t allow you to fundamentally change the logging (for example, updating the application's code to include additional log statements or formats that are not configurable). At a minimum, you should configure logging options for COTS or third-party applications to log warning and error-level information, preferably in JSON format.

You can integrate custom-developed applications with CloudWatch Logs by including the application’s log files in your CloudWatch configuration. Custom applications provide better log quality and control because you can customize the log output format, categorize and separate component output into separate log files, and include any additional required details. Make sure that you review and standardize on logging libraries and the required data and formatting for your organization so that analytics and processing become easier.

You can also write to a CloudWatch log stream with the CloudWatch Logs [`PutLogEvents`](https://docs.aws.amazon.com//AmazonCloudWatchLogs/latest/APIReference/API_PutLogEvents.html) API call or by using the AWS SDK. You can use the API or SDK for custom logging requirements, such as coordinating logging to a single log stream across a distributed set of components and servers. However, the easiest to maintain and most widely applicable solution is to configure your applications to write to log files and then use the CloudWatch agent to read and stream the log files to CloudWatch.
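
If you do call the API directly, the request needs a log group, a log stream, and a batch of timestamped events in chronological order. The sketch below only prepares such a request; the group and stream names are hypothetical, and the actual SDK call is left as a comment because it requires credentials and the `boto3` package.

```python
def build_put_log_events_request(log_group, log_stream, events):
    """Build PutLogEvents parameters from (timestamp_ms, message) pairs."""
    log_events = [{"timestamp": ts, "message": msg} for ts, msg in events]
    # Events in a batch must be ordered chronologically by timestamp.
    log_events.sort(key=lambda event: event["timestamp"])
    return {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "logEvents": log_events,
    }


request = build_put_log_events_request(
    "/finance/billing/prod/orders",  # hypothetical log group
    "order-coordinator",             # hypothetical shared log stream
    [
        (1700000002000, '{"level": "ERROR", "msg": "payment declined"}'),
        (1700000001000, '{"level": "INFO", "msg": "order received"}'),
    ],
)
print(request)

# With the AWS SDK for Python installed and credentials configured:
#   import boto3
#   boto3.client("logs").put_log_events(**request)
```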

You should also consider the kind of metrics that you want to measure from your application log files. You can use metric filters to measure, graph, and alarm on this data in a CloudWatch log group. For example, you can use a metric filter to count failed login attempts by identifying them in your logs. 
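
As a sketch of the failed-login example, the following code prepares the parameters for a metric filter on JSON-formatted logs. It assumes that the application writes an `event` field with the value `login_failed`; the field name, filter name, metric name, log group, and namespace are all hypothetical.

```python
def failed_login_metric_filter(log_group, namespace):
    """Build PutMetricFilter parameters that count failed login events."""
    return {
        "logGroupName": log_group,
        "filterName": "FailedLoginAttempts",
        # JSON filter pattern: match events whose "event" field is login_failed.
        "filterPattern": '{ $.event = "login_failed" }',
        "metricTransformations": [
            {
                "metricName": "FailedLoginCount",
                "metricNamespace": namespace,
                # Publish 1 for each matching log event.
                "metricValue": "1",
                # Report 0 for periods with log data but no matches.
                "defaultValue": 0,
            }
        ],
    }


filter_params = failed_login_metric_filter(
    "/finance/billing/prod/app.log", "/finance/billing/prod"
)
print(filter_params)

# With the AWS SDK for Python installed and credentials configured:
#   import boto3
#   boto3.client("logs").put_metric_filter(**filter_params)
```

You can then create an alarm on the resulting metric, for example to notify a security contact when the count exceeds a threshold.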

You can also create custom metrics for your custom-developed applications by using the [CloudWatch embedded metric format](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format.html) in your application log files.
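
An embedded metric format record is a JSON log line with an `_aws` metadata object that names the metrics and dimensions found in the same line. The following hand-rolled sketch shows the shape of such a record; the namespace, metric, and dimension values are hypothetical, and in practice the AWS-provided embedded metric format libraries are the safer choice.

```python
import json
import time


def emf_record(namespace, metric_name, value, unit, dimensions):
    """Return one embedded metric format record as a dictionary."""
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set that uses every supplied dimension key.
                    "Dimensions": [list(dimensions)],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # Dimension values and the metric value sit at the top level.
        **dimensions,
        metric_name: value,
    }


record = emf_record(
    "/finance/billing/prod",
    "CheckoutLatency",
    112.5,
    "Milliseconds",
    {"Service": "checkout", "Environment": "prod"},
)
print(json.dumps(record))  # write this as a single line in the application log
```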

## Configuring application-level metrics
<a name="application-metrics"></a>

Custom metrics are metrics that aren’t directly provided by AWS services to CloudWatch and they are published in a custom namespace in CloudWatch metrics. All application metrics are considered custom CloudWatch metrics. Application metrics might align to an EC2 instance, application component, API call, or even a business function. You must also consider the importance and cardinality of the dimensions that you choose for your metrics. Dimensions with high cardinality generate a large number of custom metrics and could increase your CloudWatch costs.

CloudWatch helps you capture application-level metrics in multiple ways, including the following:
+ The CloudWatch agent captures process-level metrics for the individual processes that you define for the [procstat plugin](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-procstat-process-metrics.html). 
+ An application publishes a metric to Windows Performance Monitor and this metric is defined in the CloudWatch configuration.
+ Metric filters and patterns are applied against an application’s logs in CloudWatch.
+ An application writes to a CloudWatch log by using the CloudWatch embedded metric format.
+ An application sends a metric to CloudWatch through the API or AWS SDK.
+ An application sends a metric to a [collectd](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-custom-metrics-collectd.html) or [StatsD](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-custom-metrics-statsd.html) daemon with a configured CloudWatch agent.

You can use procstat to monitor and measure critical application processes with the CloudWatch agent. This helps you to raise an alarm and take action (for example, a notification or restart process) if a critical process is no longer running for your application. You can also measure the performance characteristics of your application processes and raise an alarm if a particular process is acting abnormally. 

Procstat monitoring is also useful if you can't update your COTS applications with additional custom metrics. For example, you can create a `my_process` metric that measures the `cpu_time` and includes a custom `application_version` dimension. You can also use multiple CloudWatch agent configuration files for an application if you have different dimensions for different metrics.
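
The `my_process` example could be expressed roughly as follows. This is a sketch under assumptions: the process is matched by executable name, `pid_count` is added so that an alarm can detect when the process stops, and the `append_dimensions` key on the procstat entry follows the per-metric property described earlier in this guide and should be verified against the agent documentation. The version value `1.0` is made up.

```python
import json


def procstat_entry(exe, measurements, extra_dimensions=None):
    """Return one procstat entry that matches processes by executable name."""
    entry = {"exe": exe, "measurement": list(measurements)}
    if extra_dimensions:
        # Assumption: custom dimensions are attached with append_dimensions.
        entry["append_dimensions"] = dict(extra_dimensions)
    return entry


procstat_config = {
    "metrics": {
        "metrics_collected": {
            "procstat": [
                procstat_entry(
                    "my_process",
                    ["cpu_time", "pid_count"],
                    {"application_version": "1.0"},
                )
            ]
        }
    }
}
print(json.dumps(procstat_config, indent=2))
```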

If your application runs on Windows, you should evaluate if it already publishes metrics to Windows Performance Monitor. Many COTS applications integrate with Windows Performance Monitor, which helps you easily monitor application metrics. CloudWatch also integrates with Windows Performance Monitor and you can capture any metrics that are already available in it.

Make sure that you review the logging format and log information provided by your applications to determine which metrics can be extracted with metric filters. You could review historical logs for the application to determine how error messages and abnormal shutdowns are represented. You should also review previously reported issues to determine if a metric could be captured to prevent the issue from recurring. You should also review the application's documentation and ask the application developers to confirm how error messages can be identified.

For custom-developed applications, work with the application's developers to define important metrics that can be implemented by using the CloudWatch embedded metric format, AWS SDK, or AWS API. The recommended approach is to use the embedded metric format. You can use the AWS-provided open-source embedded metric format libraries to help you write your statements in the required format. You also need to update your [application-specific CloudWatch configuration](https://docs.aws.amazon.com//AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Generation_CloudWatch_Agent.html) to include the embedded metric format agent. This causes the agent running on the EC2 instance to act as a local embedded metric format endpoint that sends embedded metric format metrics to CloudWatch.

If your applications already support publishing metrics to collectd or StatsD, you can use them to ingest metrics into CloudWatch. 
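
As an illustration of the StatsD path, the sketch below formats a StatsD datagram and sends it over UDP. It assumes that the CloudWatch agent on the same host has its StatsD listener enabled on port 8125 and that tags in the `#key:value` style are accepted as dimensions; the metric name and tag are hypothetical.

```python
import socket


def statsd_line(name, value, metric_type="c", tags=None):
    """Format one StatsD datagram, for example 'orders.processed:1|c'."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        # Optional tags, which a listener can map to dimensions.
        line += "|#" + ",".join(f"{key}:{val}" for key, val in tags.items())
    return line


def send_statsd(line, host="127.0.0.1", port=8125):
    """Send the datagram over UDP to a local StatsD listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


counter = statsd_line("orders.processed", 1)
tagged = statsd_line("orders.processed", 1, tags={"env": "prod"})
print(counter)
print(tagged)
# To emit the metric when a listener is running: send_statsd(tagged)
```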