# Monitoring tools
<a name="monitoring-tools"></a>

We recommend that you use observability, monitoring, and alerting tools to:
+ Gain insights into the performance of your Amazon RDS environment
+ Detect unexpected and suspicious behavior
+ Plan capacity and make educated decisions about allocating Amazon RDS instances
+ Analyze metrics and logs to predict potential issues proactively
+ Generate alerts when thresholds are breached in order to troubleshoot and resolve problems before your users are affected

You have different options and solutions to choose from, including AWS-provided, cloud-native observability and monitoring tools and services; free, open-source software solutions; and commercial third-party solutions for monitoring Amazon RDS DB instances. Some of these tools are discussed in the sections that follow.

To determine which tool best suits your needs, compare each tool's features and capabilities against your organization's requirements. We also recommend that you evaluate the tools for ease of deployment, configuration and integration, software updates and maintenance, method of deployment (for example, hardware or serverless), licensing, price, and any other factors that are specific to your organization.

**Sections**
+ [Tools included in Amazon RDS](amazon-rds-tools.md)
+ [CloudWatch namespaces](cloudwatch-namespaces.md)
+ [CloudWatch alarms and dashboards](cloudwatch-dashboards.md)
+ [Amazon RDS Performance Insights](performance-insights-tools.md)
+ [Enhanced Monitoring](enhanced-monitoring.md)
+ [Additional AWS services](aws-monitoring-tools.md)
+ [Third-party monitoring tools](third-party-monitoring-tools.md)

# Tools included in Amazon RDS
<a name="amazon-rds-tools"></a>

Amazon Relational Database Service (Amazon RDS) is a managed database service in the AWS Cloud. Because Amazon RDS is a managed service, it frees you from most management tasks, such as database backups, operating system (OS) and database software installations, OS and software patching, high availability setup, hardware lifecycle, and data center operations. AWS also provides a comprehensive set of tools that enable you to build a complete [observability](https://aws.amazon.com/products/management-and-governance/use-cases/monitoring-and-observability/) solution for your Amazon RDS DB instances.

Some of the monitoring tools are included, preconfigured, and automatically enabled in the Amazon RDS service. Two automated tools are available to you as soon as you start your new Amazon RDS instance:
+ **Amazon RDS instance status** provides details about the current health of your DB instance. For example, status codes include *Available*, *Stopped*, *Creating*, *Backing-up*, and *Failed*. You can use the Amazon RDS console, the AWS Command Line Interface (AWS CLI), or the Amazon RDS API to see instance status. For more information, see [Viewing Amazon RDS DB instance status](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/accessing-monitoring.html#Overview.DBInstance.Status) in the Amazon RDS documentation.
+ **Amazon RDS recommendations** provide automated recommendations for DB instances, read replicas, and DB parameter groups. These recommendations are provided by analyzing DB instance usage, performance data, and configuration, and are delivered as guidance. For example, the *Engine version outdated* recommendation suggests that your DB instances aren't running the latest version of the database software and that you should upgrade your DB instance to benefit from the latest security fixes and other improvements. For more information, see [Viewing Amazon RDS recommendations](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/monitoring-recommendations.html) in the Amazon RDS documentation.
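
As an illustration of how instance status can be checked programmatically, the following minimal sketch filters a response shaped like the output of the Amazon RDS `DescribeDBInstances` API. The set of statuses that need attention, the function name, and the sample instance names are assumptions made for this example, not part of Amazon RDS.

```python
# Statuses that this sketch treats as needing attention (an assumption;
# adjust the set to match your own operational policy).
ATTENTION_STATUSES = {"failed", "stopped"}


def instances_needing_attention(response):
    """Return (identifier, status) pairs for flagged DB instances.

    `response` is expected to have the shape of a DescribeDBInstances
    result: {"DBInstances": [{"DBInstanceIdentifier": ..., "DBInstanceStatus": ...}]}
    """
    flagged = []
    for instance in response.get("DBInstances", []):
        status = instance.get("DBInstanceStatus", "unknown")
        if status in ATTENTION_STATUSES:
            flagged.append((instance["DBInstanceIdentifier"], status))
    return flagged


# Hypothetical sample data standing in for a real API response.
sample_response = {
    "DBInstances": [
        {"DBInstanceIdentifier": "orders-db", "DBInstanceStatus": "available"},
        {"DBInstanceIdentifier": "reports-db", "DBInstanceStatus": "stopped"},
    ]
}

flagged_instances = instances_needing_attention(sample_response)
print(flagged_instances)
```

With the AWS SDK for Python (Boto3) and valid credentials, the same function could be applied to the result of `boto3.client("rds").describe_db_instances()`. Large accounts would also need to handle pagination.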

# CloudWatch namespaces
<a name="cloudwatch-namespaces"></a>

Amazon RDS integrates with [Amazon CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html), which is a monitoring and alerting service for cloud resources and applications that run on AWS. Amazon RDS automatically collects metrics, log files, traces, and events about the operation, utilization, performance, and health of DB instances, and sends them to CloudWatch for long-term storage, analysis, and alerting.

Amazon RDS for MySQL and Amazon RDS for MariaDB automatically publish a default set of metrics to CloudWatch in one-minute intervals without additional charge. Those metrics are collected into two *namespaces*, which are containers for metrics:
+ The [AWS/RDS namespace](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-metrics.html#rds-cw-metrics-instance) includes DB instance-level metrics. Examples include `BinLogDiskUsage` (the amount of disk space occupied by binary logs), `CPUUtilization` (the percentage of CPU utilization), `DatabaseConnections` (the number of client network connections to the DB instance), and many more.
+ The [AWS/Usage namespace](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-metrics.html#rds-metrics-usage) includes account-level usage metrics, which are used to determine if you are operating within your [Amazon RDS service quotas](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Limits.html#RDS_Limits.Limits). Examples include `DBInstances` (the number of DB instances in your AWS account or Region), `DBSubnetGroups` (the number of DB subnet groups in your AWS account or Region), and `ManualSnapshots` (the number of manually created database snapshots in your AWS account or Region).
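
To make the namespace concept concrete, the following sketch builds the request parameters for reading one `AWS/RDS` metric through the CloudWatch `GetMetricStatistics` API. The helper name, the default time window, and the instance name are assumptions for illustration. The sketch only constructs the parameters and does not call AWS.

```python
from datetime import datetime, timedelta, timezone


def rds_metric_request(instance_id, metric_name, hours=1, period=60, now=None):
    """Build keyword arguments for CloudWatch GetMetricStatistics.

    The namespace selects the metric container (AWS/RDS), and the
    DBInstanceIdentifier dimension selects one DB instance within it.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds; 60 matches the one-minute publishing interval
        "Statistics": ["Average", "Maximum"],
    }


# Hypothetical instance name and a fixed timestamp for a reproducible example.
fixed_now = datetime(2022, 7, 30, 12, 0, tzinfo=timezone.utc)
request = rds_metric_request("orders-db", "CPUUtilization", now=fixed_now)
print(request["Namespace"], request["MetricName"], request["StartTime"])
```

With Boto3 installed, these parameters could be passed as `boto3.client("cloudwatch").get_metric_statistics(**request)`.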

CloudWatch retains metric data as follows:
+ 3 hours: High-resolution custom metrics with a period of less than 60 seconds are retained for 3 hours. After 3 hours, the data points are aggregated into 1-minute period metrics and kept for 15 days.
+ 15 days: Data points with a period of 60 seconds (1 minute) are retained for 15 days. After 15 days, the data points are aggregated into 5-minute period metrics and kept for 63 days.
+ 63 days: Data points with a period of 300 seconds (5 minutes) are retained for 63 days. After 63 days, the data points are aggregated into 1-hour period metrics and kept for 15 months.
+ 15 months: Data points with a period of 3,600 seconds (1 hour) are available for 15 months (455 days).
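
The retention tiers above can be expressed as a small lookup, which might help when you decide what period to request for older data. This is a minimal sketch of the rules as listed. The function name is hypothetical, and the 1-second value for high-resolution metrics assumes the finest granularity that CloudWatch supports for custom metrics.

```python
from datetime import timedelta


def finest_period_seconds(age, high_resolution=False):
    """Return the finest period (in seconds) still retained for data of a given age.

    Returns None when the data is older than the 15-month retention limit.
    """
    if age < timedelta(0):
        raise ValueError("age must not be negative")
    if high_resolution and age <= timedelta(hours=3):
        return 1  # assumption: 1-second high-resolution custom metric
    if age <= timedelta(days=15):
        return 60
    if age <= timedelta(days=63):
        return 300
    if age <= timedelta(days=455):
        return 3600
    return None


for days in (1, 30, 100, 500):
    print(days, "days ->", finest_period_seconds(timedelta(days=days)))
```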

For more information, see [Metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#Metric) in the CloudWatch documentation.

# CloudWatch alarms and dashboards
<a name="cloudwatch-dashboards"></a>

You can use [Amazon CloudWatch alarms](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html) to watch a specific Amazon RDS metric over a period of time. For example, you can monitor `FreeStorageSpace`, and then perform one or more actions if the value of the metric breaches the threshold that you set. If you set the threshold to 250 MB and the free storage space is 200 MB (less than the threshold), the alarm is activated and can trigger an action to automatically provision additional storage for the Amazon RDS DB instance. The alarm can also send an SMS notification to the DBA by using Amazon Simple Notification Service (Amazon SNS). The following diagram illustrates this process.

![\[Using CloudWatch alarms to monitor Amazon RDS metrics\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/amazon-rds-monitoring-alerting/images/cloudwatch-alarms.png)
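
The `FreeStorageSpace` alarm described above could be defined along the following lines. This is a sketch under stated assumptions: the helper name, alarm name, evaluation settings, instance name, and SNS topic ARN are hypothetical, and 250 MB is interpreted as 250 × 1024 × 1024 bytes because `FreeStorageSpace` is reported in bytes. The sketch only builds the parameters for the CloudWatch `PutMetricAlarm` API.

```python
def free_storage_alarm(instance_id, threshold_mb, sns_topic_arn):
    """Build keyword arguments for CloudWatch PutMetricAlarm."""
    return {
        "AlarmName": f"{instance_id}-low-free-storage",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Minimum",
        "Period": 300,            # assumption: evaluate 5-minute windows
        "EvaluationPeriods": 1,   # assumption: alarm on the first breach
        "Threshold": threshold_mb * 1024 * 1024,  # metric unit is bytes
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],  # for example, an SNS topic that notifies the DBA
    }


# Hypothetical instance name and topic ARN.
alarm = free_storage_alarm(
    "orders-db", 250, "arn:aws:sns:us-east-1:111122223333:dba-alerts"
)
print(alarm["AlarmName"], alarm["Threshold"])
```

With Boto3, the parameters could be applied by calling `boto3.client("cloudwatch").put_metric_alarm(**alarm)`. Automatically provisioning storage would require an additional action target, which is outside the scope of this sketch.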


CloudWatch also provides [dashboards](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Dashboards.html), which you can use to create, customize, interact with, and save customized views (graphs) of the metrics. You can also use [CloudWatch Logs Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html) to create a dashboard for monitoring the slow query log and error log, and to receive alerts if a specific pattern has been detected in those logs. The following screen shows an example CloudWatch dashboard.

![\[Using CloudWatch dashboards to monitor metrics\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/amazon-rds-monitoring-alerting/images/cloudwatch-dashboard.png)
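
Dashboards can also be defined as JSON documents rather than being built in the console. The following sketch generates a dashboard body that contains one graph widget for each metric, arranged two per row on the 24-column dashboard grid. The function name, layout, instance name, and Region are assumptions for illustration.

```python
import json


def rds_dashboard_body(instance_id, region, metric_names):
    """Return a CloudWatch dashboard body (JSON string) with one widget per metric."""
    widgets = []
    for index, metric_name in enumerate(metric_names):
        widgets.append({
            "type": "metric",
            "x": (index % 2) * 12,   # two widgets per row on a 24-column grid
            "y": (index // 2) * 6,
            "width": 12,
            "height": 6,
            "properties": {
                "metrics": [
                    ["AWS/RDS", metric_name, "DBInstanceIdentifier", instance_id]
                ],
                "period": 300,
                "stat": "Average",
                "region": region,
                "title": f"{metric_name} ({instance_id})",
            },
        })
    return json.dumps({"widgets": widgets})


# Hypothetical instance name and Region.
body = rds_dashboard_body(
    "orders-db", "us-east-1",
    ["CPUUtilization", "DatabaseConnections", "FreeStorageSpace"],
)
print(body)
```

With Boto3, the body could be published by calling `boto3.client("cloudwatch").put_dashboard(DashboardName="rds-overview", DashboardBody=body)`, where the dashboard name is again a placeholder.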


# Amazon RDS Performance Insights
<a name="performance-insights-tools"></a>

[Amazon RDS Performance Insights](https://aws.amazon.com/rds/performance-insights/) is a database performance tuning and monitoring tool that expands Amazon RDS monitoring features. It helps you analyze the performance of your database by visualizing the DB instance load and filtering the load by waits, SQL statements, hosts, or users. The tool combines multiple metrics into a single interactive graph that helps you identify the type of bottleneck your DB instance might have, such as lock waits, high CPU consumption, or I/O latency, and determine which SQL statements are creating the bottleneck. The following screen shows an example visualization.

![\[Example graph from Amazon RDS Performance Insights\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/amazon-rds-monitoring-alerting/images/performance-insights-example.png)
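
The same load breakdown is available outside the console through the Performance Insights API. The following sketch builds the parameters for a `GetResourceMetrics` request that slices the average DB load by waits, SQL statements, hosts, or users. The helper name, the top-five limit, and the resource identifier are assumptions for illustration. The sketch does not call AWS.

```python
from datetime import datetime, timedelta, timezone

# Maps the slicing options mentioned in the text to Performance Insights
# dimension groups.
DIMENSION_GROUPS = {
    "waits": "db.wait_event",
    "sql": "db.sql",
    "hosts": "db.host",
    "users": "db.user",
}


def db_load_query(resource_id, slice_by, minutes=60, now=None):
    """Build keyword arguments for the Performance Insights GetResourceMetrics API."""
    if slice_by not in DIMENSION_GROUPS:
        raise ValueError(f"slice_by must be one of {sorted(DIMENSION_GROUPS)}")
    end = now or datetime.now(timezone.utc)
    return {
        "ServiceType": "RDS",
        "Identifier": resource_id,  # the DbiResourceId, not the instance name
        "MetricQueries": [{
            "Metric": "db.load.avg",
            "GroupBy": {"Group": DIMENSION_GROUPS[slice_by], "Limit": 5},
        }],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "PeriodInSeconds": 60,
    }


# Hypothetical resource identifier and a fixed timestamp.
fixed_now = datetime(2022, 7, 30, 12, 0, tzinfo=timezone.utc)
query = db_load_query("db-EXAMPLERESOURCEID", "waits", now=fixed_now)
print(query["MetricQueries"])
```

With Boto3, the request could be sent by calling `boto3.client("pi").get_resource_metrics(**query)`.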


To collect metrics for the Amazon RDS DB instances in your account, you have to [enable Performance Insights](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PerfInsights.Enabling.html) when you create or modify a DB instance. The free tier includes seven days of performance data history and one million API requests per month. Optionally, you can purchase longer retention periods. For complete pricing information, see [Performance Insights Pricing](https://aws.amazon.com/rds/performance-insights/pricing/).

For information about how you can use Performance Insights to monitor your DB instances, see the [DB instance monitoring](db-instance-monitoring.md) section later in this guide.

Performance Insights [automatically publishes metrics to CloudWatch](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PerfInsights.Cloudwatch.html). In addition to using the Performance Insights tool, you can take advantage of the additional features that CloudWatch provides. You can examine the Performance Insights metrics by using the CloudWatch console, the AWS CLI, or the CloudWatch API. You can also add CloudWatch alarms, as with any other metrics. For example, you might want to trigger an SMS notification to DBAs or take a corrective action if the `DBLoad` metric breaches the threshold value you set. You can also add the Performance Insights metrics to your existing CloudWatch dashboards.

# Enhanced Monitoring
<a name="enhanced-monitoring"></a>

[Enhanced Monitoring](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.OS.overview.html) is a tool that captures metrics in real time for the operating system (OS) that your Amazon RDS DB instance runs on. These metrics provide up to one-second granularity for CPU, memory, Amazon RDS and OS processes, file system, and disk I/O data, among others. You can access and analyze these metrics in the [Amazon RDS console](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.OS.Viewing.html). As with Performance Insights, Enhanced Monitoring metrics are delivered from Amazon RDS to CloudWatch, where you can benefit from additional features such as the long-term preservation of metrics for analysis, creating metric filters, displaying graphs on the CloudWatch dashboard, and setting up alarms.

By default, Enhanced Monitoring is disabled when you create a new Amazon RDS DB instance. You can [enable](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.OS.Enabling.html) the feature when you create or modify a DB instance.

Pricing is based on the amount of data transferred from Amazon RDS to CloudWatch Logs and on storage rates. Depending on the granularity and the number of DB instances where Enhanced Monitoring is enabled, some portion of monitoring data can be included within the CloudWatch Logs free tier. For complete pricing details, see [Amazon CloudWatch Pricing](https://aws.amazon.com/cloudwatch/pricing/). For more information about the tool, see the [Amazon RDS documentation](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.OS.html) and the [Enhanced Monitoring](https://aws.amazon.com/rds/faqs/#Enhanced_Monitoring) FAQ.
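
As a sketch of how Enhanced Monitoring might be enabled on an existing DB instance, the following helper builds the parameters for the Amazon RDS `ModifyDBInstance` API. The helper name, the choice to apply the change immediately, the instance name, and the role ARN are assumptions for illustration. The list of intervals reflects the values that the API accepts, where `0` turns the feature off.

```python
# Monitoring intervals (in seconds) accepted by the API; 0 disables the feature.
VALID_INTERVALS = (0, 1, 5, 10, 15, 30, 60)


def enhanced_monitoring_settings(instance_id, interval_seconds, role_arn=None):
    """Build keyword arguments for the Amazon RDS ModifyDBInstance API."""
    if interval_seconds not in VALID_INTERVALS:
        raise ValueError(f"interval must be one of {VALID_INTERVALS}")
    settings = {
        "DBInstanceIdentifier": instance_id,
        "MonitoringInterval": interval_seconds,
        "ApplyImmediately": True,  # assumption: do not wait for the maintenance window
    }
    if interval_seconds > 0:
        if not role_arn:
            raise ValueError("a monitoring role ARN is required when enabling")
        # IAM role that allows Amazon RDS to send metrics to CloudWatch Logs.
        settings["MonitoringRoleArn"] = role_arn
    return settings


# Hypothetical instance name and role ARN.
settings = enhanced_monitoring_settings(
    "orders-db", 1, "arn:aws:iam::111122223333:role/rds-monitoring-role"
)
print(settings)
```

With Boto3, the change could be applied by calling `boto3.client("rds").modify_db_instance(**settings)`. Shorter intervals produce more data, which affects the CloudWatch Logs charges described above.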

# Additional AWS services
<a name="aws-monitoring-tools"></a>

AWS provides several supporting services, which also integrate with Amazon RDS and CloudWatch, to further enhance the observability of your databases. These include Amazon EventBridge, Amazon CloudWatch Logs, and AWS CloudTrail.
+ [Amazon EventBridge](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-what-is.html) is a serverless event bus that can receive, filter, transform, route, and deliver events from your applications and AWS resources, including your Amazon RDS DB instances. An *Amazon RDS event* indicates a change in the Amazon RDS environment. For example, when a DB instance changes its status from *Available* to *Stopped*, Amazon RDS generates the event `RDS-EVENT-0087 / The DB instance has been stopped`. Amazon RDS delivers events to CloudWatch Events and EventBridge in near real time. Using EventBridge and CloudWatch Events, you can define rules to send alerts on specific Amazon RDS events of interest and automate actions to be taken when an event matches the rule. A variety of targets are available in response to an event, such as an AWS Lambda function that can perform a corrective action, or an Amazon SNS topic that can send an email or SMS to notify DBAs or DevOps engineers about the event.
+ [Amazon CloudWatch Logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html) is a service that centralizes the storage of log files from all your applications, systems, and AWS services, including Amazon RDS for MySQL and MariaDB DB instances and AWS CloudTrail. If you [enable](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.MariaDB.html#USER_LogAccess.MariaDB.PublishtoCloudWatchLogs) the feature for your DB instances, Amazon RDS automatically publishes the following logs to CloudWatch Logs:
  + Error log
  + Slow query log
  + General log
  + Audit log

  You can use CloudWatch Logs Insights to query and analyze the log data. The feature includes a purpose-built query language that helps you search for log events that match patterns that you define. For example, you can track table corruption in your MySQL DB instance by monitoring the error log file for the following pattern: `"ERROR 1034 (HY000): Incorrect key file for table '*'; try to repair it OR Table * is marked as crashed"`. Filtered log data can be converted into CloudWatch metrics. You can then use the metrics to create dashboards with graphs or tabular data, or set an alarm if the defined threshold value is breached. This is particularly useful when using the audit log, because you can automatically monitor, send alerts, and take corrective actions if any unexpected or suspicious behavior is detected. You can access and manage database logs by using the AWS Management Console, the AWS CLI, the Amazon RDS API, or the AWS SDK for CloudWatch Logs.
+ [AWS CloudTrail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html) logs and continuously monitors user and API activity in your AWS account. It helps you with auditing, security monitoring, and operational troubleshooting of your Amazon RDS for MySQL or MariaDB DB instances. CloudTrail is integrated with Amazon RDS. All actions can be logged, and CloudTrail provides a record of actions taken by a user, role, or AWS service in Amazon RDS. For example, when a user creates a new Amazon RDS DB instance, an event is detected, and the log includes information about the requested action (`"eventName": "CreateDBInstance"`), the date and time of the action (`"eventTime": "2022-07-30T22:14:06Z"`), request parameters (`"requestParameters": {"dBInstanceIdentifier": "test-instance", "engine": "mysql", "dBInstanceClass": "db.m6g.large"}`), and so on. Events that are logged by CloudTrail include both calls from the Amazon RDS console and calls from code that uses the Amazon RDS API.
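
To illustrate the EventBridge rules mentioned in the list above, the following sketch builds an event pattern that selects specific Amazon RDS DB instance events, such as `RDS-EVENT-0087`. It also includes a deliberately simplified local matcher so that the pattern can be tried against a sample event without calling AWS. The function names and the sample event are assumptions for this example, and the matcher covers only exact-value matching, not the full EventBridge pattern syntax.

```python
import json


def rds_event_pattern(event_ids):
    """Build an EventBridge event pattern for selected RDS DB instance events."""
    return {
        "source": ["aws.rds"],
        "detail-type": ["RDS DB Instance Event"],
        "detail": {"EventID": list(event_ids)},
    }


def matches(pattern, event):
    """Simplified local check: each pattern key must match one of the listed values."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


pattern = rds_event_pattern(["RDS-EVENT-0087"])

# Hypothetical, abbreviated sample event for a stopped DB instance.
sample_event = {
    "source": "aws.rds",
    "detail-type": "RDS DB Instance Event",
    "detail": {
        "EventID": "RDS-EVENT-0087",
        "SourceIdentifier": "test-instance",
        "Message": "The DB instance has been stopped",
    },
}

print(json.dumps(pattern))
print(matches(pattern, sample_event))
```

With Boto3, the pattern could be registered by calling `boto3.client("events").put_rule(Name="rds-instance-stopped", EventPattern=json.dumps(pattern))`, where the rule name is a placeholder. A target such as an AWS Lambda function or an Amazon SNS topic would then be attached to the rule.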

# Third-party monitoring tools
<a name="third-party-monitoring-tools"></a>

In some scenarios, in addition to the full suite of cloud-native observability and monitoring tools that AWS provides for Amazon RDS, you might want to use monitoring tools from other software vendors. Such scenarios include hybrid deployments, where you might have a number of databases running in your on-premises data center and another set of databases running in the AWS Cloud. If you have already established your corporate observability solution, you might want to continue using your existing tools and extend them to your AWS Cloud deployments. The challenge in setting up a third-party monitoring solution often lies in the safeguards imposed by Amazon RDS as a cloud-managed service. For example, you cannot install agent software on the host operating system that runs the DB instance, because access to the database host machine is denied. However, you can integrate many third-party monitoring solutions with Amazon RDS by building on top of CloudWatch and other AWS Cloud services. For example, Amazon RDS metrics, logs, events, and traces can be exported and then imported into the third-party monitoring tool for further analysis, visualization, and alerting. Some of these third-party solutions include Prometheus, Grafana, and Percona.

## Prometheus and Grafana
<a name="prometheus-grafana"></a>

[Prometheus](https://prometheus.io/) is an [open-source](https://github.com/prometheus/prometheus) monitoring solution that collects metrics from configured targets at given intervals. It is a general-purpose monitoring solution that can monitor any application or service. When you monitor Amazon RDS DB instances, CloudWatch collects the metrics from Amazon RDS. The metrics are then exported to the Prometheus server by using an open-source exporter such as YACE exporter or CloudWatch Exporter.
+ [YACE exporter](https://promcat.io/apps/aws-rds) optimizes data export tasks by retrieving several metrics in a single request to the CloudWatch API. After the metrics are stored on the Prometheus server, the server evaluates rule expressions and can generate alerts when specified conditions are observed.
+ [CloudWatch Exporter](https://github.com/prometheus/cloudwatch_exporter) is officially maintained by Prometheus. It retrieves CloudWatch metrics through the CloudWatch API and stores them on the Prometheus server in a format that's compatible with Prometheus, by using REST API requests to the HTTP endpoint.

When you choose an exporter, design your deployment model, and configure exporter instances, consider [CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_limits.html) and [CloudWatch Logs](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html) service and API quotas, because the export of CloudWatch metrics to a Prometheus server is implemented on top of the CloudWatch API. For example, deploying multiple instances of CloudWatch Exporter in a single AWS account and Region to monitor hundreds of Amazon RDS DB instances could result in throttling errors (**ThrottlingException**) and code 400 errors. To overcome such limitations, consider using YACE exporter, which is optimized to collect up to 500 different metrics in a single request. Additionally, if you deploy a large number of Amazon RDS DB instances, consider using [multiple AWS accounts](https://docs.aws.amazon.com/whitepapers/latest/organizing-your-aws-environment/benefits-of-using-multiple-aws-accounts.html#distribute-aws-service-quotas-and-api-request-rate-limits) instead of centralizing the workload into a single AWS account, and limit the number of exporter instances in each AWS account.
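
The effect of batching on API call volume can be sketched as follows. The example builds one metric query per DB instance and metric in the shape used by the CloudWatch `GetMetricData` API, and then groups the queries into batches of up to 500 per request. This is not the YACE implementation. The function names, instance names, and metric selection are assumptions for illustration.

```python
def build_queries(instance_ids, metric_names):
    """Build one GetMetricData-style query per (instance, metric) pair."""
    queries = []
    for instance_id in instance_ids:
        for metric_name in metric_names:
            queries.append({
                "Id": f"m{len(queries)}",  # must start with a lowercase letter
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/RDS",
                        "MetricName": metric_name,
                        "Dimensions": [
                            {"Name": "DBInstanceIdentifier", "Value": instance_id}
                        ],
                    },
                    "Period": 60,
                    "Stat": "Average",
                },
                "ReturnData": True,
            })
    return queries


def batch_metric_queries(queries, batch_size=500):
    """Split queries into lists of at most `batch_size` items, one list per request."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [queries[i:i + batch_size] for i in range(0, len(queries), batch_size)]


# Hypothetical fleet: 200 DB instances with 6 metrics each.
instance_ids = [f"db-{n:03d}" for n in range(200)]
metric_names = [
    "CPUUtilization", "DatabaseConnections", "FreeStorageSpace",
    "FreeableMemory", "ReadIOPS", "WriteIOPS",
]
queries = build_queries(instance_ids, metric_names)
batches = batch_metric_queries(queries)
print(len(queries), "queries in", len(batches), "requests")
```

In this hypothetical fleet, 1,200 individual metric reads collapse into three requests per collection cycle, which shows why batching reduces the risk of throttling.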

Alerts are generated by the Prometheus server and handled by [Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/). This tool takes care of deduplicating, grouping, and routing alerts to the correct receiver such as email, SMS, or Slack, or initiating an automated response action. Another [open-source](https://github.com/grafana/grafana) tool called [Grafana](https://grafana.com/) displays visualizations for these metrics. Grafana provides rich visualization widgets, such as advanced graphs, dynamic dashboards, and analytics features such as ad-hoc queries and dynamic drilldown. It can also search and analyze logs, and includes alerting features to continuously evaluate metrics and logs, and send notifications when the data matches alert rules.

![\[Using Prometheus and Grafana with Amazon RDS and CloudWatch\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/amazon-rds-monitoring-alerting/images/third-party-tools.png)


## Percona
<a name="percona"></a>

[Percona Monitoring and Management (PMM)](https://docs.percona.com/percona-monitoring-and-management/setting-up/client/aws.html) is a free, [open-source](https://github.com/percona/pmm) database monitoring, management, and observability solution for MySQL and MariaDB. PMM collects thousands of performance metrics from DB instances and their hosts. It provides a web UI to visualize data in dashboards and additional features such as automatic advisors for database health assessments.

You can use PMM to monitor Amazon RDS. However, the PMM client (agent) isn't installed on the underlying hosts of the Amazon RDS DB instances, because it doesn't have access to the hosts. Instead, the tool connects to the Amazon RDS DB instances, queries server statistics, `INFORMATION_SCHEMA`, sys schema, and Performance Schema, and uses the CloudWatch API to acquire metrics, logs, events, and traces. PMM requires an AWS Identity and Access Management (IAM) user access key (or an IAM role) and automatically discovers the Amazon RDS DB instances that are available for monitoring. The PMM tool is specialized for database monitoring and collects more database-specific metrics than Prometheus.

To use the [PMM Query Analytics dashboard](https://docs.percona.com/percona-monitoring-and-management/get-started/query-analytics.html), you must configure the Performance Schema as the query source, because the Query Analytics agent isn't installed for Amazon RDS and can't read the slow query log. Instead, it queries the `performance_schema` from the MySQL and MariaDB DB instances directly to obtain metrics. One of the prominent features of PMM is its [ability to alert](https://docs.percona.com/percona-monitoring-and-management/get-started/alerting.html) and advise DBAs on issues that the tool identifies in their databases. PMM offers sets of checks that can detect common security threats, performance degradation, data loss, and data corruption.

In addition to these tools, there are several commercial observability and monitoring solutions available on the market that can integrate with Amazon RDS. Examples include [Datadog Database Monitoring](https://www.datadoghq.com/dg/monitor/rds-benefits/), [Dynatrace Amazon RDS monitoring](https://www.dynatrace.com/technologies/aws-monitoring/amazon-rds-monitoring/), and [AppDynamics Database Monitoring](https://www.appdynamics.com/supported-technologies/database/amazon-rds-monitoring).