

# Troubleshooting Amazon Aurora MySQL database performance
<a name="aurora-mysql-troubleshooting"></a>

This topic focuses on some common Aurora MySQL DB performance issues, and how to troubleshoot or collect information to remediate these issues quickly. We divide database performance into two categories:
+ Server performance – The entire database server runs slower.
+ Query performance – One or more queries take longer to run.

## AWS monitoring options
<a name="aurora-mysql-troubleshooting.monitoring"></a>

We recommend that you use the following AWS monitoring options to help with troubleshooting:
+ Amazon CloudWatch – Amazon CloudWatch monitors your AWS resources and the applications you run on AWS in real time. You can use CloudWatch to collect and track metrics, which are variables you can measure for your resources and applications. For more information, see [What is Amazon CloudWatch?](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html).

  You can view all of the system metrics and process information for your DB instances on the AWS Management Console. You can configure your Aurora MySQL DB cluster to publish general, slow, audit, and error log data to a log group in Amazon CloudWatch Logs. This allows you to view trends, maintain logs if a host is impacted, and create a baseline for "normal" performance to easily identify anomalies or changes. For more information, see [Publishing Amazon Aurora MySQL logs to Amazon CloudWatch Logs](AuroraMySQL.Integrating.CloudWatch.md).
+ Enhanced Monitoring – To enable additional Amazon CloudWatch metrics for an Aurora MySQL database, turn on Enhanced Monitoring. When you create or modify an Aurora DB cluster, select **Enable Enhanced Monitoring**. This allows Aurora to publish performance metrics to CloudWatch. Some of the key metrics available include CPU usage, database connections, storage usage, and query latency. These can help identify performance bottlenecks.

  The amount of information transferred for a DB instance is directly proportional to the defined granularity for Enhanced Monitoring. A smaller monitoring interval results in more frequent reporting of OS metrics and increases your monitoring cost. To manage costs, set different granularities for different instances in your AWS accounts. The default granularity at creation of an instance is 60 seconds. For more information, see [Cost of Enhanced Monitoring](USER_Monitoring.OS.md#USER_Monitoring.OS.cost).
+ Performance Insights – You can view all of the database call metrics. This includes DB locks, waits, and the number of rows processed, all of which you can use for troubleshooting. When you create or modify an Aurora DB cluster, select **Turn on Performance Insights**. By default, Performance Insights has a 7-day data retention period, but can be customized to analyze longer-term performance trends. For longer than 7-day retention, you need to upgrade to the paid tier. For more information, see [Performance Insights pricing](https://aws.amazon.com/rds/performance-insights/pricing/). You can set the data retention period for each Aurora DB instance separately. For more information, see [Monitoring DB load with Performance Insights on Amazon Aurora](USER_PerfInsights.md).

## Most common reasons for Aurora MySQL database performance issues
<a name="aurora-mysql-troubleshooting-common"></a>

You can use the following steps to troubleshoot performance issues in your Aurora MySQL database. We list these steps in the logical order of investigation, but they're not intended to be linear. One discovery could jump across steps, which allow for a series of investigative paths.

1. [Workload](aurora-mysql-troubleshooting-workload.md) – Understand your database workload.

1. [Logging](aurora-mysql-troubleshooting-logging.md) – Review all of the database logs.

1. [Database connections ](mysql-troubleshooting-dbconn.md) – Make sure that the connections between your applications and your database are reliable.

1. [Query performance](aurora-mysql-troubleshooting-query.md) – Examine your query execution plans to see if they've changed. Code changes can cause plans to change.

# Troubleshooting workload issues for Aurora MySQL databases
<a name="aurora-mysql-troubleshooting-workload"></a>

Database workload can be viewed as reads and writes. With an understanding of "normal" database workload, you can tune queries and the database server to meet demand as it changes. There are a number of different reasons why performance can change, so the first step is to understand what has changed.
+ Has there been a major or minor version upgrade?

  A major version upgrade includes changes to the engine code, particularly in the optimizer, that can change the query execution plan. When upgrading database versions, especially major versions, it's very important that you analyze the database workload and tune accordingly. Tuning can involve optimizing and rewriting queries, or adding and updating parameter settings, depending on the results of testing. Understanding what is causing the impact will allow you to start focusing on that specific area.

  For more information, see [What is new in MySQL 8.0](https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html) and [Server and status variables and options added, deprecated, or removed in MySQL 8.0](https://dev.mysql.com/doc/refman/8.0/en/added-deprecated-removed.html) in the MySQL documentation, and [Comparing Aurora MySQL version 2 and Aurora MySQL version 3](AuroraMySQL.Compare-v2-v3.md).
+ Has there been an increase in data being processed (row counts)?
+ Are there more queries running concurrently?
+ Are there schema or database changes?
+ Have there been code defects or fixes?

**Contents**
+ [

## Instance host metrics
](#ams-workload-instance)
  + [

### CPU usage
](#ams-workload-cpu)
  + [

### Memory usage
](#ams-workload-instance-memory)
  + [

### Network throughput
](#ams-workload-network)
+ [

## Database metrics
](#ams-workload-db)
+ [

# Troubleshooting memory usage issues for Aurora MySQL databases
](ams-workload-memory.md)
  + [

## Example 1: Continuous high memory usage
](ams-workload-memory.md#ams-workload-memory.example1)
  + [

## Example 2: Transient memory spikes
](ams-workload-memory.md#ams-workload-memory.example2)
  + [

## Example 3: Freeable memory drops continuously and isn't reclaimed
](ams-workload-memory.md#ams-workload-memory.example3)
+ [

# Troubleshooting out-of-memory issues for Aurora MySQL databases
](AuroraMySQLOOM.md)

## Instance host metrics
<a name="ams-workload-instance"></a>

Monitor instance host metrics such as CPU, memory, and network activity to help understand whether there has been a workload change. There are two main concepts for understanding workload changes:
+ Utilization – The usage of a device, such as CPU or disk. It can be time-based or capacity-based.
  + Time-based – The amount of time that a resource is busy over a particular observation period.
  + Capacity-based – The amount of throughput that a system or component can deliver, as a percentage of its capacity.
+ Saturation – The degree to which more work is required of a resource than it can process. When capacity-based usage reaches 100%, the extra work can't be processed and must be queued.

### CPU usage
<a name="ams-workload-cpu"></a>

You can use the following tools to identify CPU usage and saturation:
+ CloudWatch provides the `CPUUtilization` metric. If this reaches 100%, then the instance is saturated. However, CloudWatch metrics are averaged over 1 minute, and lack granularity.

  For more information on CloudWatch metrics, see [Instance-level metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md#Aurora.AuroraMySQL.Monitoring.Metrics.instances).
+ Enhanced Monitoring provides metrics returned by the operating system `top` command. It shows load averages and the following CPU states, with 1-second granularity:
  + `Idle (%)` = Idle time
  + `IRQ (%)` = Software interrupts
  + `Nice (%)` = Nice time for processes with a [niced](https://en.wikipedia.org/wiki/Nice_(Unix)) priority.
  + `Steal (%)` = Time spent serving other tenants (virtualization related)
  + `System (%)` = System time
  + `User (%)` = User time
  + `Wait (%)` = I/O wait

  For more information on Enhanced Monitoring metrics, see [OS metrics for Aurora](USER_Monitoring-Available-OS-Metrics.md#USER_Monitoring-Available-OS-Metrics-RDS).

### Memory usage
<a name="ams-workload-instance-memory"></a>

If the system is under memory pressure, and resource consumption is reaching saturation, you should be observing a high degree of page scanning, paging, swapping, and out-of-memory errors.

You can use the following tools to identify memory usage and saturation:

CloudWatch provides the `FreeableMemory` metric, that shows how much memory can be reclaimed by flushing some of the OS caches and the current free memory.

For more information on CloudWatch metrics, see [Instance-level metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md#Aurora.AuroraMySQL.Monitoring.Metrics.instances).

Enhanced Monitoring provides the following metrics that can help you identify memory usage issues:
+ `Buffers (KB)` – The amount of memory used for buffering I/O requests before writing to the storage device, in kilobytes.
+ `Cached (KB)` – The amount of memory used for caching file system–based I/O.
+ `Free (KB)` – The amount of unassigned memory, in kilobytes.
+ `Swap` – Cached, Free, and Total.

For example, if you see that your DB instance uses `Swap` memory, then the total amount of memory for your workload is larger than your instance currently has available. We recommend increasing the size of your DB instance or tuning your workload to use less memory.

For more information on Enhanced Monitoring metrics, see [OS metrics for Aurora](USER_Monitoring-Available-OS-Metrics.md#USER_Monitoring-Available-OS-Metrics-RDS).

For more detailed information on using the Performance Schema and `sys` schema to determine which connections and components are using memory, see [Troubleshooting memory usage issues for Aurora MySQL databases](ams-workload-memory.md).

### Network throughput
<a name="ams-workload-network"></a>

CloudWatch provides the following metrics for total network throughput, all averaged over 1 minute:
+ `NetworkReceiveThroughput` – The amount of network throughput received from clients by each instance in the Aurora DB cluster.
+ `NetworkTransmitThroughput` – The amount of network throughput sent to clients by each instance in the Aurora DB cluster.
+ `NetworkThroughput` – The amount of network throughput both received from and transmitted to clients by each instance in the Aurora DB cluster.
+ `StorageNetworkReceiveThroughput` – The amount of network throughput received from the Aurora storage subsystem by each instance in the DB cluster.
+ `StorageNetworkTransmitThroughput` – The amount of network throughput sent to the Aurora storage subsystem by each instance in the Aurora DB cluster.
+ `StorageNetworkThroughput` – The amount of network throughput received from and sent to the Aurora storage subsystem by each instance in the Aurora DB cluster.

For more information on CloudWatch metrics, see [Instance-level metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md#Aurora.AuroraMySQL.Monitoring.Metrics.instances).

Enhanced Monitoring provides the `network` received (**RX**) and transmitted (**TX**) graphs, with up to 1-second granularity.

For more information on Enhanced Monitoring metrics, see [OS metrics for Aurora](USER_Monitoring-Available-OS-Metrics.md#USER_Monitoring-Available-OS-Metrics-RDS).

## Database metrics
<a name="ams-workload-db"></a>

Examine the following CloudWatch metrics for workload changes:
+ `BlockedTransactions` – The average number of transactions in the database that are blocked per second.
+ `BufferCacheHitRatio` – The percentage of requests that are served by the buffer cache.
+ `CommitThroughput` – The average number of commit operations per second.
+ `DatabaseConnections` – The number of client network connections to the database instance.
+ `Deadlocks` – The average number of deadlocks in the database per second.
+ `DMLThroughput` – The average number of inserts, updates, and deletes per second.
+ `ResultSetCacheHitRatio` – The percentage of requests that are served by the query cache.
+ `RollbackSegmentHistoryListLength` – The undo logs that record committed transactions with delete-marked records.
+ `RowLockTime` – The total time spent acquiring row locks for InnoDB tables.
+ `SelectThroughput` – The average number of select queries per second.

For more information on CloudWatch metrics, see [Instance-level metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md#Aurora.AuroraMySQL.Monitoring.Metrics.instances).

Consider the following questions when examining the workload:

1. Were there recent changes in DB instance class, for example reducing the instance size from 8xlarge to 4xlarge, or changing from db.r5 to db.r6? 

1. Can you create a clone and reproduce the issue, or is it happening only on that one instance?

1. Is there server resource exhaustion, high CPU or memory exhaustion? If yes, this could mean that additional hardware is required.

1. Are one or more queries taking longer?

1. Are the changes caused by an upgrade, especially a major version upgrade? If yes, then compare the pre- and post-upgrade metrics.

1. Are there changes in the number of reader DB instances?

1. Have you enabled general, audit, or binary logging? For more information, see [Logging for Aurora MySQL databases](aurora-mysql-troubleshooting-logging.md).

1. Did you enable, disable, or change your use of binary log (binlog) replication?

1. Are there any long-running transactions holding large numbers of row locks? Examine the InnoDB history list length (HLL) for indications of long-running transactions.

   For more information, see [The InnoDB history list length increased significantly](proactive-insights.history-list.md) and the blog post [Why is my SELECT query running slowly on my Amazon Aurora MySQL DB cluster?](https://repost.aws/knowledge-center/aurora-mysql-slow-select-query).

   1. If a large HLL is caused by a write transaction, it means that `UNDO` logs are accumulating (not being cleaned regularly). In a large write transaction, this accumulation can grow quickly. In MySQL, `UNDO` is stored in the [SYSTEM tablespace](https://dev.mysql.com/doc/refman/5.7/en/innodb-system-tablespace.html). The `SYSTEM` tablespace is not shrinkable. The `UNDO` log might cause the `SYSTEM` tablespace to grow to several GB, or even TB. After the purge, release the allocated space by taking a logical backup (dump) of the data, then import the dump to a new DB instance.

   1. If a large HLL is caused by a read transaction (long-running query), it can mean that the query is using a large amount of temporary space. Release the temporary space by rebooting. Examine Performance Insights DB metrics for any changes in the `Temp` section, such as `created_tmp_tables`. For more information, see [Monitoring DB load with Performance Insights on Amazon Aurora](USER_PerfInsights.md).

1. Can you split long-running transactions into smaller ones that modify fewer rows?

1. Are there any changes in blocked transactions or increases in deadlocks? Examine Performance Insights DB metrics for any changes in status variables in the `Locks` section, such as `innodb_row_lock_time`, ` innodb_row_lock_waits`, and ` innodb_dead_locks`. Use 1-minute or 5-minute intervals.

1. Are there increased wait events? Examine Performance Insights wait events and wait types using 1-minute or 5-minute intervals. Analyze the top wait events and see whether they are correlated to workload changes or database contention. For example, `buf_pool mutex` indicates buffer pool contention. For more information, see [Tuning Aurora MySQL with wait events](AuroraMySQL.Managing.Tuning.wait-events.md).

# Troubleshooting memory usage issues for Aurora MySQL databases
<a name="ams-workload-memory"></a>

While CloudWatch, Enhanced Monitoring, and Performance Insights provide a good overview of memory usage at the operating system level, such as how much memory the database process is using, they don't allow you to break down what connections or components within the engine might be causing this memory usage.

To troubleshoot this, you can use the Performance Schema and `sys` schema. In Aurora MySQL version 3, memory instrumentation is enabled by default when the Performance Schema is enabled. In Aurora MySQL version 2, only memory instrumentation for Performance Schema memory usage is enabled by default. For information on tables available in the Performance Schema to track memory usage and enabling Performance Schema memory instrumentation, see [Memory summary tables](https://dev.mysql.com/doc/refman/8.3/en/performance-schema-memory-summary-tables.html) in the MySQL documentation. For more information on using the Performance Schema with Performance Insights, see [Overview of the Performance Schema for Performance Insights on Aurora MySQL](USER_PerfInsights.EnableMySQL.md).

While detailed information is available in the Performance Schema to track current memory usage, the MySQL [sys schema](https://dev.mysql.com/doc/refman/8.0/en/sys-schema.html) has views on top of Performance Schema tables that you can use to quickly pinpoint where memory is being used.

In the `sys` schema, the following views are available to track memory usage by connection, component, and query.


| View | Description | 
| --- | --- | 
|  [memory\$1by\$1host\$1by\$1current\$1bytes](https://dev.mysql.com/doc/refman/8.0/en/sys-memory-by-host-by-current-bytes.html)  |  Provides information on engine memory usage by host. This can be useful for identifying which application servers or client hosts are consuming memory.  | 
|  [memory\$1by\$1thread\$1by\$1current\$1bytes](https://dev.mysql.com/doc/refman/8.0/en/sys-memory-by-thread-by-current-bytes.html)  |  Provides information on engine memory usage by thread ID. The thread ID in MySQL can be a client connection or a background thread. You can map thread IDs to MySQL connection IDs by using the [sys.processlist](https://dev.mysql.com/doc/refman/8.0/en/sys-processlist.html) view or [performance\$1schema.threads](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-threads-table.html) table.  | 
|  [memory\$1by\$1user\$1by\$1current\$1bytes](https://dev.mysql.com/doc/refman/8.0/en/sys-memory-by-user-by-current-bytes.html)  |  Provides information on engine memory usage by user. This can be useful for identifying which user accounts or clients are consuming memory.  | 
|  [memory\$1global\$1by\$1current\$1bytes](https://dev.mysql.com/doc/refman/8.0/en/sys-memory-global-by-current-bytes.html)  |  Provides information on engine memory usage by engine component. This can be useful for identifying memory usage globally by engine buffers or components. For example, you might see the `memory/innodb/buf_buf_pool` event for the InnoDB buffer pool, or the `memory/sql/Prepared_statement::main_mem_root` event for prepared statements.  | 
|  [memory\$1global\$1total](https://dev.mysql.com/doc/refman/8.0/en/sys-memory-global-total.html)  |  Provides an overview of total tracked memory usage in the database engine.  | 

In Aurora MySQL version 3.05 and higher, you can also track maximum memory usage by statement digest in the [Performance Schema statement summary tables](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-statement-summary-tables.html). The statement summary tables contain normalized statement digests and aggregated statistics on their execution. The `MAX_TOTAL_MEMORY` column can help you identify maximum memory used by query digest since the statistics were last reset, or since the database instance was restarted. This can be useful in identifying specific queries that might be consuming a lot of memory.

**Note**  
The Performance Schema and `sys` schema show you the current memory usage on the server, and the high-water marks for memory consumed per connection and engine component. Because the Performance Schema is maintained in memory, information is reset when the DB instance restarts. To maintain a history over time, we recommend that you configure retrieval and storage of this data outside of the Performance Schema.

**Topics**
+ [

## Example 1: Continuous high memory usage
](#ams-workload-memory.example1)
+ [

## Example 2: Transient memory spikes
](#ams-workload-memory.example2)
+ [

## Example 3: Freeable memory drops continuously and isn't reclaimed
](#ams-workload-memory.example3)

## Example 1: Continuous high memory usage
<a name="ams-workload-memory.example1"></a>

Looking globally at `FreeableMemory` in CloudWatch, we can see that memory usage greatly increased at 2024-03-26 02:59 UTC.

![\[FreeableMemory graph showing high memory usage.\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/ams-freeable-memory.png)


This doesn't tell us the whole picture. To determine which component is using the most memory, you can log into the database and look at `sys.memory_global_by_current_bytes`. This table contains a list of memory events that MySQL tracks, along with information on memory allocation per event. Each memory tracking event starts with `memory/%`, followed by other information on which engine component/feature the event is associated with.

For example, `memory/performance_schema/%` is for memory events related to the Performance Schema, `memory/innodb/%` is for InnoDB, and so on. For more information on event naming conventions, see [Performance Schema instrument naming conventions](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-instrument-naming.html) in the MySQL documentation.

From the following query, we can find the likely culprit based on `current_alloc`, but we can also see many `memory/performance_schema/%` events.

```
mysql> SELECT * FROM sys.memory_global_by_current_bytes LIMIT 10;

+-----------------------------------------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| event_name                                                                  | current_count | current_alloc | current_avg_alloc | high_count | high_alloc | high_avg_alloc |
+-----------------------------------------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| memory/sql/Prepared_statement::main_mem_root                                |        512817 | 4.91 GiB      | 10.04 KiB         |     512823 | 4.91 GiB   | 10.04 KiB      |
| memory/performance_schema/prepared_statements_instances                     |           252 | 488.25 MiB    | 1.94 MiB          |        252 | 488.25 MiB | 1.94 MiB       |
| memory/innodb/hash0hash                                                     |             4 | 79.07 MiB     | 19.77 MiB         |          4 | 79.07 MiB  | 19.77 MiB      |
| memory/performance_schema/events_errors_summary_by_thread_by_error          |          1028 | 52.27 MiB     | 52.06 KiB         |       1028 | 52.27 MiB  | 52.06 KiB      |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name |             4 | 47.25 MiB     | 11.81 MiB         |          4 | 47.25 MiB  | 11.81 MiB      |
| memory/performance_schema/events_statements_summary_by_digest               |             1 | 40.28 MiB     | 40.28 MiB         |          1 | 40.28 MiB  | 40.28 MiB      |
| memory/performance_schema/memory_summary_by_thread_by_event_name            |             4 | 31.64 MiB     | 7.91 MiB          |          4 | 31.64 MiB  | 7.91 MiB       |
| memory/innodb/memory                                                        |         15227 | 27.44 MiB     | 1.85 KiB          |      20619 | 33.33 MiB  | 1.66 KiB       |
| memory/sql/String::value                                                    |         74411 | 21.85 MiB     |  307 bytes        |      76867 | 25.54 MiB  |  348 bytes     |
| memory/sql/TABLE                                                            |          8381 | 21.03 MiB     | 2.57 KiB          |       8381 | 21.03 MiB  | 2.57 KiB       |
+-----------------------------------------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
10 rows in set (0.02 sec)
```

We mentioned previously that the Performance Schema is stored in memory, which means that it's also tracked in the `performance_schema` memory instrumentation.

**Note**  
If you find that the Performance Schema is using a lot of memory, and want to limit its memory usage, you can tune database parameters based on your requirements. For more information, see [The Performance Schema memory-allocation model](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-memory-model.html) in the MySQL documentation.

For readability, you can rerun the same query but exclude Performance Schema events. The output shows the following:
+ The main memory consumer is `memory/sql/Prepared_statement::main_mem_root`.
+ The `current_alloc` column tells us that MySQL has 4.91 GiB currently allocated to this event.
+ The `high_alloc column` tells us that 4.91 GiB is the high-water mark of `current_alloc` since the stats were last reset or since the server restarted. This means that `memory/sql/Prepared_statement::main_mem_root` is at its highest value.

```
mysql> SELECT * FROM sys.memory_global_by_current_bytes WHERE event_name NOT LIKE 'memory/performance_schema/%' LIMIT 10;

+-----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| event_name                                    | current_count | current_alloc | current_avg_alloc | high_count | high_alloc | high_avg_alloc |
+-----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| memory/sql/Prepared_statement::main_mem_root  |        512817 | 4.91 GiB      | 10.04 KiB         |     512823 | 4.91 GiB   | 10.04 KiB      |
| memory/innodb/hash0hash                       |             4 | 79.07 MiB     | 19.77 MiB         |          4 | 79.07 MiB  | 19.77 MiB      |
| memory/innodb/memory                          |         17096 | 31.68 MiB     | 1.90 KiB          |      22498 | 37.60 MiB  | 1.71 KiB       |
| memory/sql/String::value                      |        122277 | 27.94 MiB     |  239 bytes        |     124699 | 29.47 MiB  |  247 bytes     |
| memory/sql/TABLE                              |          9927 | 24.67 MiB     | 2.55 KiB          |       9929 | 24.68 MiB  | 2.55 KiB       |
| memory/innodb/lock0lock                       |          8888 | 19.71 MiB     | 2.27 KiB          |       8888 | 19.71 MiB  | 2.27 KiB       |
| memory/sql/Prepared_statement::infrastructure |        257623 | 16.24 MiB     |   66 bytes        |     257631 | 16.24 MiB  |   66 bytes     |
| memory/mysys/KEY_CACHE                        |             3 | 16.00 MiB     | 5.33 MiB          |          3 | 16.00 MiB  | 5.33 MiB       |
| memory/innodb/sync0arr                        |             3 | 7.03 MiB      | 2.34 MiB          |          3 | 7.03 MiB   | 2.34 MiB       |
| memory/sql/THD::main_mem_root                 |           815 | 6.56 MiB      | 8.24 KiB          |        849 | 7.19 MiB   | 8.67 KiB       |
+-----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
10 rows in set (0.06 sec)
```

From the name of the event, we can tell that this memory is being used for prepared statements. If you want to see which connections are using this memory, you can check [memory\$1by\$1thread\$1by\$1current\$1bytes](https://dev.mysql.com/doc/refman/8.0/en/sys-memory-by-thread-by-current-bytes.html).

In the following example, each connection has approximately 7 MiB allocated, with a high-water mark of approximately 6.29 MiB (`current_max_alloc`). This makes sense, because the example is using `sysbench` with 80 tables and 800 connections with prepared statements. If you want to reduce memory usage in this scenario, you can optimize your application's usage of prepared statements to reduce memory consumption.

```
mysql> SELECT * FROM sys.memory_by_thread_by_current_bytes;

+-----------+-------------------------------------------+--------------------+-------------------+-------------------+-------------------+-----------------+
| thread_id | user                                      | current_count_used | current_allocated | current_avg_alloc | current_max_alloc | total_allocated |
+-----------+-------------------------------------------+--------------------+-------------------+-------------------+-------------------+-----------------+
|        46 | rdsadmin@localhost                        |                405 | 8.47 MiB          | 21.42 KiB         | 8.00 MiB          | 155.86 MiB      |
|        61 | reinvent@10.0.4.4                         |               1749 | 6.72 MiB          | 3.93 KiB          | 6.29 MiB          | 14.24 MiB       |
|       101 | reinvent@10.0.4.4                         |               1845 | 6.71 MiB          | 3.72 KiB          | 6.29 MiB          | 14.50 MiB       |
|        55 | reinvent@10.0.4.4                         |               1674 | 6.68 MiB          | 4.09 KiB          | 6.29 MiB          | 14.13 MiB       |
|        57 | reinvent@10.0.4.4                         |               1416 | 6.66 MiB          | 4.82 KiB          | 6.29 MiB          | 13.52 MiB       |
|       112 | reinvent@10.0.4.4                         |               1759 | 6.66 MiB          | 3.88 KiB          | 6.29 MiB          | 14.17 MiB       |
|        66 | reinvent@10.0.4.4                         |               1428 | 6.64 MiB          | 4.76 KiB          | 6.29 MiB          | 13.47 MiB       |
|        75 | reinvent@10.0.4.4                         |               1389 | 6.62 MiB          | 4.88 KiB          | 6.29 MiB          | 13.40 MiB       |
|       116 | reinvent@10.0.4.4                         |               1333 | 6.61 MiB          | 5.08 KiB          | 6.29 MiB          | 13.21 MiB       |
|        90 | reinvent@10.0.4.4                         |               1448 | 6.59 MiB          | 4.66 KiB          | 6.29 MiB          | 13.58 MiB       |
|        98 | reinvent@10.0.4.4                         |               1440 | 6.57 MiB          | 4.67 KiB          | 6.29 MiB          | 13.52 MiB       |
|        94 | reinvent@10.0.4.4                         |               1433 | 6.57 MiB          | 4.69 KiB          | 6.29 MiB          | 13.49 MiB       |
|        62 | reinvent@10.0.4.4                         |               1323 | 6.55 MiB          | 5.07 KiB          | 6.29 MiB          | 13.48 MiB       |
|        87 | reinvent@10.0.4.4                         |               1323 | 6.55 MiB          | 5.07 KiB          | 6.29 MiB          | 13.25 MiB       |
|        99 | reinvent@10.0.4.4                         |               1346 | 6.54 MiB          | 4.98 KiB          | 6.29 MiB          | 13.24 MiB       |
|       105 | reinvent@10.0.4.4                         |               1347 | 6.54 MiB          | 4.97 KiB          | 6.29 MiB          | 13.34 MiB       |
|        73 | reinvent@10.0.4.4                         |               1335 | 6.54 MiB          | 5.02 KiB          | 6.29 MiB          | 13.23 MiB       |
|        54 | reinvent@10.0.4.4                         |               1510 | 6.53 MiB          | 4.43 KiB          | 6.29 MiB          | 13.49 MiB       |
.                                                                                                                                                          .
.                                                                                                                                                          .
.                                                                                                                                                          .
|       812 | reinvent@10.0.4.4                         |               1259 | 6.38 MiB          | 5.19 KiB          | 6.29 MiB          | 13.05 MiB       |
|       214 | reinvent@10.0.4.4                         |               1279 | 6.38 MiB          | 5.10 KiB          | 6.29 MiB          | 12.90 MiB       |
|       325 | reinvent@10.0.4.4                         |               1254 | 6.38 MiB          | 5.21 KiB          | 6.29 MiB          | 12.99 MiB       |
|       705 | reinvent@10.0.4.4                         |               1273 | 6.37 MiB          | 5.13 KiB          | 6.29 MiB          | 13.03 MiB       |
|       530 | reinvent@10.0.4.4                         |               1268 | 6.37 MiB          | 5.15 KiB          | 6.29 MiB          | 12.92 MiB       |
|       307 | reinvent@10.0.4.4                         |               1263 | 6.37 MiB          | 5.17 KiB          | 6.29 MiB          | 12.87 MiB       |
|       738 | reinvent@10.0.4.4                         |               1260 | 6.37 MiB          | 5.18 KiB          | 6.29 MiB          | 13.00 MiB       |
|       819 | reinvent@10.0.4.4                         |               1252 | 6.37 MiB          | 5.21 KiB          | 6.29 MiB          | 13.01 MiB       |
|        31 | innodb/srv_purge_thread                   |              17810 | 3.14 MiB          |  184 bytes        | 2.40 MiB          | 205.69 MiB      |
|        38 | rdsadmin@localhost                        |                599 | 1.76 MiB          | 3.01 KiB          | 1.00 MiB          | 25.58 MiB       |
|         1 | sql/main                                  |               3756 | 1.32 MiB          |  367 bytes        | 355.78 KiB        | 6.19 MiB        |
|       854 | rdsadmin@localhost                        |                 46 | 1.08 MiB          | 23.98 KiB         | 1.00 MiB          | 5.10 MiB        |
|        30 | innodb/clone_gtid_thread                  |               1596 | 573.14 KiB        |  367 bytes        | 254.91 KiB        | 970.69 KiB      |
|        40 | rdsadmin@localhost                        |                235 | 245.19 KiB        | 1.04 KiB          | 128.88 KiB        | 808.64 KiB      |
|       853 | rdsadmin@localhost                        |                 96 | 94.63 KiB         | 1009 bytes        | 29.73 KiB         | 422.45 KiB      |
|        36 | rdsadmin@localhost                        |                 33 | 36.29 KiB         | 1.10 KiB          | 16.08 KiB         | 74.15 MiB       |
|        33 | sql/event_scheduler                       |                  3 | 16.27 KiB         | 5.42 KiB          | 16.04 KiB         | 16.27 KiB       |
|        35 | sql/compress_gtid_table                   |                  8 | 14.20 KiB         | 1.77 KiB          | 8.05 KiB          | 18.62 KiB       |
|        25 | innodb/fts_optimize_thread                |                 12 | 1.86 KiB          |  158 bytes        |  648 bytes        | 1.98 KiB        |
|        23 | innodb/srv_master_thread                  |                 11 | 1.23 KiB          |  114 bytes        |  361 bytes        | 24.40 KiB       |
|        24 | innodb/dict_stats_thread                  |                 11 | 1.23 KiB          |  114 bytes        |  361 bytes        | 1.35 KiB        |
|         5 | innodb/io_read_thread                     |                  1 |  144 bytes        |  144 bytes        |  144 bytes        |  144 bytes      |
|         6 | innodb/io_read_thread                     |                  1 |  144 bytes        |  144 bytes        |  144 bytes        |  144 bytes      |
|         2 | sql/aws_oscar_log_level_monitor           |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|         4 | innodb/io_ibuf_thread                     |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|         7 | innodb/io_write_thread                    |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|         8 | innodb/io_write_thread                    |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|         9 | innodb/io_write_thread                    |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        10 | innodb/io_write_thread                    |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        11 | innodb/srv_lra_thread                     |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        12 | innodb/srv_akp_thread                     |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        18 | innodb/srv_lock_timeout_thread            |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |  248 bytes      |
|        19 | innodb/srv_error_monitor_thread           |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        20 | innodb/srv_monitor_thread                 |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        21 | innodb/buf_resize_thread                  |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        22 | innodb/btr_search_sys_toggle_thread       |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        32 | innodb/dict_persist_metadata_table_thread |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
|        34 | sql/signal_handler                        |                  0 |    0 bytes        |    0 bytes        |    0 bytes        |    0 bytes      |
+-----------+-------------------------------------------+--------------------+-------------------+-------------------+-------------------+-----------------+
831 rows in set (2.48 sec)
```

As mentioned earlier, the thread ID (`thd_id`) value here can refer to server background threads or database connections. If you want to map thread ID values to database connection IDs, you can use the `performance_schema.threads` table or the `sys.processlist` view, where `conn_id` is the connection ID.

```
mysql> SELECT thd_id,conn_id,user,db,command,state,time,last_wait FROM sys.processlist WHERE user='reinvent@10.0.4.4';

+--------+---------+-------------------+----------+---------+----------------+------+-------------------------------------------------+
| thd_id | conn_id | user              | db       | command | state          | time | last_wait                                       |
+--------+---------+-------------------+----------+---------+----------------+------+-------------------------------------------------+
|    590 |     562 | reinvent@10.0.4.4 | sysbench | Execute | closing tables |    0 | wait/io/redo_log_flush                          |
|    578 |     550 | reinvent@10.0.4.4 | sysbench | Sleep   | NULL           |    0 | idle                                            |
|    579 |     551 | reinvent@10.0.4.4 | sysbench | Execute | closing tables |    0 | wait/io/redo_log_flush                          |
|    580 |     552 | reinvent@10.0.4.4 | sysbench | Execute | updating       |    0 | wait/io/table/sql/handler                       |
|    581 |     553 | reinvent@10.0.4.4 | sysbench | Execute | updating       |    0 | wait/io/table/sql/handler                       |
|    582 |     554 | reinvent@10.0.4.4 | sysbench | Sleep   | NULL           |    0 | idle                                            |
|    583 |     555 | reinvent@10.0.4.4 | sysbench | Sleep   | NULL           |    0 | idle                                            |
|    584 |     556 | reinvent@10.0.4.4 | sysbench | Execute | updating       |    0 | wait/io/table/sql/handler                       |
|    585 |     557 | reinvent@10.0.4.4 | sysbench | Execute | closing tables |    0 | wait/io/redo_log_flush                          |
|    586 |     558 | reinvent@10.0.4.4 | sysbench | Execute | updating       |    0 | wait/io/table/sql/handler                       |
|    587 |     559 | reinvent@10.0.4.4 | sysbench | Execute | closing tables |    0 | wait/io/redo_log_flush                          |
.                                                                                                                                     .
.                                                                                                                                     .
.                                                                                                                                     .
|    323 |     295 | reinvent@10.0.4.4 | sysbench | Sleep   | NULL           |    0 | idle                                            |
|    324 |     296 | reinvent@10.0.4.4 | sysbench | Execute | updating       |    0 | wait/io/table/sql/handler                       |
|    325 |     297 | reinvent@10.0.4.4 | sysbench | Execute | closing tables |    0 | wait/io/redo_log_flush                          |
|    326 |     298 | reinvent@10.0.4.4 | sysbench | Execute | updating       |    0 | wait/io/table/sql/handler                       |
|    438 |     410 | reinvent@10.0.4.4 | sysbench | Execute | System lock    |    0 | wait/lock/table/sql/handler                     |
|    280 |     252 | reinvent@10.0.4.4 | sysbench | Sleep   | starting       |    0 | wait/io/socket/sql/client_connection            |
|     98 |      70 | reinvent@10.0.4.4 | sysbench | Query   | freeing items  |    0 | NULL                                            |
+--------+---------+-------------------+----------+---------+----------------+------+-------------------------------------------------+
804 rows in set (5.51 sec)
```

Now we stop the `sysbench` workload, which closes the connections and released the memory. Checking the events again, we can confirm that memory is released, but `high_alloc` still tells us what the high-water mark is. The `high_alloc` column can be very useful in identifying short spikes in memory usage, where you might not be able to immediately identify usage from `current_alloc`, which shows only currently allocated memory.

```
mysql> SELECT * FROM sys.memory_global_by_current_bytes WHERE event_name='memory/sql/Prepared_statement::main_mem_root' LIMIT 10;

+----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| event_name                                   | current_count | current_alloc | current_avg_alloc | high_count | high_alloc | high_avg_alloc |
+----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| memory/sql/Prepared_statement::main_mem_root |            17 | 253.80 KiB    | 14.93 KiB         |     512823 | 4.91 GiB   | 10.04 KiB      |
+----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
1 row in set (0.00 sec)
```

If you want to reset `high_alloc`, you can truncate the `performance_schema` memory summary tables, but this resets all memory instrumentation. For more information, see [Performance Schema general table characteristics](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-table-characteristics.html) in the MySQL documentation.

In the following example, we can see that `high_alloc` is reset after truncation.

```
mysql> TRUNCATE `performance_schema`.`memory_summary_global_by_event_name`;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM sys.memory_global_by_current_bytes WHERE event_name='memory/sql/Prepared_statement::main_mem_root' LIMIT 10;

+----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| event_name                                   | current_count | current_alloc | current_avg_alloc | high_count | high_alloc | high_avg_alloc |
+----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| memory/sql/Prepared_statement::main_mem_root |            17 | 253.80 KiB    | 14.93 KiB         |         17 | 253.80 KiB | 14.93 KiB      |
+----------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
1 row in set (0.00 sec)
```

## Example 2: Transient memory spikes
<a name="ams-workload-memory.example2"></a>

Another common occurrence is short spikes in memory usage on a database server. These can be periodic drops in freeable memory that are difficult to troubleshoot using `current_alloc` in `sys.memory_global_by_current_bytes`, because the memory has already been freed.

**Note**  
If Performance Schema statistics have been reset, or the database instance has been restarted, this information won't be available in `sys` or p`erformance_schema`. To retain this information, we recommend that you configure external metrics collection.

The following graph of the `os.memory.free` metric in Enhanced Monitoring shows brief 7-second spikes in memory usage. Enhanced Monitoring allows you to monitor at intervals as short as 1 second, which is perfect for catching transient spikes like these.

![\[Graph showing transient memory usage spikes over time with periodic pattern indicating potential memory management issues.\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/ams-free-memory-spikes.png)


To help diagnose the cause of the memory usage here, we can use a combination of `high_alloc` in the `sys` memory summary views and [Performance Schema statement summary tables](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-statement-summary-tables.html) to try to identify offending sessions and connections.

As expected, because memory usage isn't currently high, we can't see any major offenders in the `sys` schema view under `current_alloc`.

```
mysql> SELECT * FROM sys.memory_global_by_current_bytes LIMIT 10;

+-----------------------------------------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| event_name                                                                  | current_count | current_alloc | current_avg_alloc | high_count | high_alloc | high_avg_alloc |
+-----------------------------------------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
| memory/innodb/hash0hash                                                     |             4 | 79.07 MiB     | 19.77 MiB         |          4 | 79.07 MiB  | 19.77 MiB      |
| memory/innodb/os0event                                                      |        439372 | 60.34 MiB     |  144 bytes        |     439372 | 60.34 MiB  |  144 bytes     |
| memory/performance_schema/events_statements_summary_by_digest               |             1 | 40.28 MiB     | 40.28 MiB         |          1 | 40.28 MiB  | 40.28 MiB      |
| memory/mysys/KEY_CACHE                                                      |             3 | 16.00 MiB     | 5.33 MiB          |          3 | 16.00 MiB  | 5.33 MiB       |
| memory/performance_schema/events_statements_history_long                    |             1 | 14.34 MiB     | 14.34 MiB         |          1 | 14.34 MiB  | 14.34 MiB      |
| memory/performance_schema/events_errors_summary_by_thread_by_error          |           257 | 13.07 MiB     | 52.06 KiB         |        257 | 13.07 MiB  | 52.06 KiB      |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name |             1 | 11.81 MiB     | 11.81 MiB         |          1 | 11.81 MiB  | 11.81 MiB      |
| memory/performance_schema/events_statements_summary_by_digest.digest_text   |             1 | 9.77 MiB      | 9.77 MiB          |          1 | 9.77 MiB   | 9.77 MiB       |
| memory/performance_schema/events_statements_history_long.digest_text        |             1 | 9.77 MiB      | 9.77 MiB          |          1 | 9.77 MiB   | 9.77 MiB       |
| memory/performance_schema/events_statements_history_long.sql_text           |             1 | 9.77 MiB      | 9.77 MiB          |          1 | 9.77 MiB   | 9.77 MiB       |
+-----------------------------------------------------------------------------+---------------+---------------+-------------------+------------+------------+----------------+
10 rows in set (0.01 sec)
```

Expanding the view to order by `high_alloc`, we can now see that the `memory/temptable/physical_ram` component is a very good candidate here. At its highest, it consumed 515.00 MiB.

As its name suggests, `memory/temptable/physical_ram` instruments memory usage for the `TEMP` storage engine in MySQL, which was introduced in MySQL 8.0. For more information on how MySQL uses temporary tables, see [Internal temporary table use in MySQL](https://dev.mysql.com/doc/refman/8.0/en/internal-temporary-tables.html) in the MySQL documentation.

**Note**  
We're using the `sys.x$memory_global_by_current_bytes` view in this example.

```
mysql> SELECT event_name, format_bytes(current_alloc) AS "currently allocated", sys.format_bytes(high_alloc) AS "high-water mark"  
FROM sys.x$memory_global_by_current_bytes ORDER BY high_alloc DESC LIMIT 10;

+-----------------------------------------------------------------------------+---------------------+-----------------+
| event_name                                                                  | currently allocated | high-water mark |
+-----------------------------------------------------------------------------+---------------------+-----------------+
| memory/temptable/physical_ram                                               | 4.00 MiB            | 515.00 MiB      |
| memory/innodb/hash0hash                                                     | 79.07 MiB           | 79.07 MiB       |
| memory/innodb/os0event                                                      | 63.95 MiB           | 63.95 MiB       |
| memory/performance_schema/events_statements_summary_by_digest               | 40.28 MiB           | 40.28 MiB       |
| memory/mysys/KEY_CACHE                                                      | 16.00 MiB           | 16.00 MiB       |
| memory/performance_schema/events_statements_history_long                    | 14.34 MiB           | 14.34 MiB       |
| memory/performance_schema/events_errors_summary_by_thread_by_error          | 13.07 MiB           | 13.07 MiB       |
| memory/performance_schema/events_statements_summary_by_thread_by_event_name | 11.81 MiB           | 11.81 MiB       |
| memory/performance_schema/events_statements_summary_by_digest.digest_text   | 9.77 MiB            | 9.77 MiB        |
| memory/performance_schema/events_statements_history_long.sql_text           | 9.77 MiB            | 9.77 MiB        |
+-----------------------------------------------------------------------------+---------------------+-----------------+
10 rows in set (0.00 sec)
```

In [Example 1: Continuous high memory usage](#ams-workload-memory.example1), we checked the current memory usage for each connection to determine which connection is responsible for using the memory in question. In this example, the memory is already freed, so checking the memory usage for current connections isn't useful.

To dig deeper and find the offending statements, users, and hosts, we use the Performance Schema. The Performance Schema contains multiple statement summary tables that are sliced by different dimensions such as event name, statement digest, host, thread, and user. Each view will allow you dig deeper into where certain statements are being run and what they are doing. This section is focused on `MAX_TOTAL_MEMORY`, but you can find more information on all of the columns available in the [Performance Schema statement summary tables](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-statement-summary-tables.html) documentation.

```
mysql> SHOW TABLES IN performance_schema LIKE 'events_statements_summary_%';

+------------------------------------------------------------+
| Tables_in_performance_schema (events_statements_summary_%) |
+------------------------------------------------------------+
| events_statements_summary_by_account_by_event_name         |
| events_statements_summary_by_digest                        |
| events_statements_summary_by_host_by_event_name            |
| events_statements_summary_by_program                       |
| events_statements_summary_by_thread_by_event_name          |
| events_statements_summary_by_user_by_event_name            |
| events_statements_summary_global_by_event_name             |
+------------------------------------------------------------+
7 rows in set (0.00 sec)
```

First we check `events_statements_summary_by_digest` to see `MAX_TOTAL_MEMORY`.

From this we can see the following:
+ The query with digest `20676ce4a690592ff05debcffcbc26faeb76f22005e7628364d7a498769d0c4a` seems to be a good candidate for this memory usage. The `MAX_TOTAL_MEMORY` is 537450710, which matches the high-water mark we saw for the `memory/temptable/physical_ram` event in `sys.x$memory_global_by_current_bytes`.
+ It has been run four times (`COUNT_STAR`), first at 2024-03-26 04:08:34.943256, and last at 2024-03-26 04:43:06.998310.

```
mysql> SELECT SCHEMA_NAME,DIGEST,COUNT_STAR,MAX_TOTAL_MEMORY,FIRST_SEEN,LAST_SEEN
FROM performance_schema.events_statements_summary_by_digest ORDER BY MAX_TOTAL_MEMORY DESC LIMIT 5;

+-------------+------------------------------------------------------------------+------------+------------------+----------------------------+----------------------------+
| SCHEMA_NAME | DIGEST                                                           | COUNT_STAR | MAX_TOTAL_MEMORY | FIRST_SEEN                 | LAST_SEEN                  |
+-------------+------------------------------------------------------------------+------------+------------------+----------------------------+----------------------------+
| sysbench    | 20676ce4a690592ff05debcffcbc26faeb76f22005e7628364d7a498769d0c4a |          4 |        537450710 | 2024-03-26 04:08:34.943256 | 2024-03-26 04:43:06.998310 |
| NULL        | f158282ea0313fefd0a4778f6e9b92fc7d1e839af59ebd8c5eea35e12732c45d |          4 |          3636413 | 2024-03-26 04:29:32.712348 | 2024-03-26 04:36:26.269329 |
| NULL        | 0046bc5f642c586b8a9afd6ce1ab70612dc5b1fd2408fa8677f370c1b0ca3213 |          2 |          3459965 | 2024-03-26 04:31:37.674008 | 2024-03-26 04:32:09.410718 |
| NULL        | 8924f01bba3c55324701716c7b50071a60b9ceaf17108c71fd064c20c4ab14db |          1 |          3290981 | 2024-03-26 04:31:49.751506 | 2024-03-26 04:31:49.751506 |
| NULL        | 90142bbcb50a744fcec03a1aa336b2169761597ea06d85c7f6ab03b5a4e1d841 |          1 |          3131729 | 2024-03-26 04:15:09.719557 | 2024-03-26 04:15:09.719557 |
+-------------+------------------------------------------------------------------+------------+------------------+----------------------------+----------------------------+
5 rows in set (0.00 sec)
```

Now that we know the offending digest, we can get more details such as the query text, the user who ran it, and where it was run. Based on the digest text returned, we can see that this is a common table expression (CTE) that creates four temporary tables and performs four table scans, which is very inefficient.

```
mysql> SELECT SCHEMA_NAME,DIGEST_TEXT,QUERY_SAMPLE_TEXT,MAX_TOTAL_MEMORY,SUM_ROWS_SENT,SUM_ROWS_EXAMINED,SUM_CREATED_TMP_TABLES,SUM_NO_INDEX_USED
FROM performance_schema.events_statements_summary_by_digest
WHERE DIGEST='20676ce4a690592ff05debcffcbc26faeb76f22005e7628364d7a498769d0c4a'\G;

*************************** 1. row ***************************
           SCHEMA_NAME: sysbench
           DIGEST_TEXT: WITH RECURSIVE `cte` ( `n` ) AS ( SELECT ? FROM `sbtest1` UNION ALL SELECT `id` + ? FROM `sbtest1` ) SELECT * FROM `cte`
     QUERY_SAMPLE_TEXT: WITH RECURSIVE cte (n) AS (   SELECT 1  from sbtest1 UNION ALL   SELECT id + 1 FROM sbtest1) SELECT * FROM cte
      MAX_TOTAL_MEMORY: 537450710
         SUM_ROWS_SENT: 80000000
     SUM_ROWS_EXAMINED: 80000000
SUM_CREATED_TMP_TABLES: 4
     SUM_NO_INDEX_USED: 4
1 row in set (0.01 sec)
```

For more information on the `events_statements_summary_by_digest` table and other Performance Schema statement summary tables, see [Statement summary tables](https://dev.mysql.com/doc/refman/8.0/en/performance-schema-statement-summary-tables.html) in the MySQL documentation.

You can also run an [EXPLAIN](https://dev.mysql.com/doc/refman/8.0/en/explain.html) or [EXPLAIN ANALYZE](https://dev.mysql.com/doc/refman/8.0/en/explain.html#explain-analyze) statement to see more details.

**Note**  
`EXPLAIN ANALYZE` can provide more information than `EXPLAIN`, but it also runs the query, so be careful.

```
-- EXPLAIN
mysql> EXPLAIN WITH RECURSIVE cte (n) AS (SELECT 1  FROM sbtest1 UNION ALL SELECT id + 1 FROM sbtest1) SELECT * FROM cte;

+----+-------------+------------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key  | key_len | ref  | rows     | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL   | NULL          | NULL | NULL    | NULL | 19221520 |   100.00 | NULL        |
|  2 | DERIVED     | sbtest1    | NULL       | index | NULL          | k_1  | 4       | NULL |  9610760 |   100.00 | Using index |
|  3 | UNION       | sbtest1    | NULL       | index | NULL          | k_1  | 4       | NULL |  9610760 |   100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+------+---------+------+----------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)

-- EXPLAIN format=tree 
mysql> EXPLAIN format=tree WITH RECURSIVE cte (n) AS (SELECT 1 FROM sbtest1 UNION ALL SELECT id + 1 FROM sbtest1) SELECT * FROM cte\G;

*************************** 1. row ***************************
EXPLAIN: -> Table scan on cte  (cost=4.11e+6..4.35e+6 rows=19.2e+6)
    -> Materialize union CTE cte  (cost=4.11e+6..4.11e+6 rows=19.2e+6)
        -> Index scan on sbtest1 using k_1  (cost=1.09e+6 rows=9.61e+6)
        -> Index scan on sbtest1 using k_1  (cost=1.09e+6 rows=9.61e+6)
1 row in set (0.00 sec)

-- EXPLAIN ANALYZE 
mysql> EXPLAIN ANALYZE WITH RECURSIVE cte (n) AS (SELECT 1 from sbtest1 UNION ALL SELECT id + 1 FROM sbtest1) SELECT * FROM cte\G;

*************************** 1. row ***************************
EXPLAIN: -> Table scan on cte  (cost=4.11e+6..4.35e+6 rows=19.2e+6) (actual time=6666..9201 rows=20e+6 loops=1)
    -> Materialize union CTE cte  (cost=4.11e+6..4.11e+6 rows=19.2e+6) (actual time=6666..6666 rows=20e+6 loops=1)
        -> Covering index scan on sbtest1 using k_1  (cost=1.09e+6 rows=9.61e+6) (actual time=0.0365..2006 rows=10e+6 loops=1)
        -> Covering index scan on sbtest1 using k_1  (cost=1.09e+6 rows=9.61e+6) (actual time=0.0311..2494 rows=10e+6 loops=1)
1 row in set (10.53 sec)
```

But who ran it? We can see in the Performance Schema that the `destructive_operator` user had `MAX_TOTAL_MEMORY` of 537450710, which again matches the previous results.

**Note**  
The Performance Schema is stored in memory, so should not be relied upon as the sole source for auditing. If you need to maintain a history of statements run, and from which users, we recommend that you enable [Aurora Advanced Auditing](AuroraMySQL.Auditing.md). If you also need to maintain information on memory usage, we recommend that you configure monitoring to export and store these values.

```
mysql> SELECT USER,EVENT_NAME,COUNT_STAR,MAX_TOTAL_MEMORY FROM performance_schema.events_statements_summary_by_user_by_event_name
ORDER BY MAX_CONTROLLED_MEMORY DESC LIMIT 5;

+----------------------+---------------------------+------------+------------------+
| USER                 | EVENT_NAME                | COUNT_STAR | MAX_TOTAL_MEMORY |
+----------------------+---------------------------+------------+------------------+
| destructive_operator | statement/sql/select      |          4 |        537450710 |
| rdsadmin             | statement/sql/select      |       4172 |          3290981 |
| rdsadmin             | statement/sql/show_tables |          2 |          3615821 |
| rdsadmin             | statement/sql/show_fields |          2 |          3459965 |
| rdsadmin             | statement/sql/show_status |         75 |          1914976 |
+----------------------+---------------------------+------------+------------------+
5 rows in set (0.00 sec)

mysql> SELECT HOST,EVENT_NAME,COUNT_STAR,MAX_TOTAL_MEMORY FROM performance_schema.events_statements_summary_by_host_by_event_name
WHERE HOST != 'localhost' AND COUNT_STAR>0 ORDER BY MAX_CONTROLLED_MEMORY DESC LIMIT 5;

+------------+----------------------+------------+------------------+
| HOST       | EVENT_NAME           | COUNT_STAR | MAX_TOTAL_MEMORY |
+------------+----------------------+------------+------------------+
| 10.0.8.231 | statement/sql/select |          4 |        537450710 |
+------------+----------------------+------------+------------------+
1 row in set (0.00 sec)
```

## Example 3: Freeable memory drops continuously and isn't reclaimed
<a name="ams-workload-memory.example3"></a>

The InnoDB database engine employs a range of specialized memory tracking events for different components. These specific events allow for granular tracking of memory usage in key InnoDB subsystems, for example:
+ `memory/innodb/buf0buf` – Dedicated to monitoring memory allocations for the InnoDB buffer pool.
+ `memory/innodb/ibuf0ibuf` – Specifically tracks memory changes related to the InnoDB change buffer.

To identify the top consumers of memory, we can query `sys.memory_global_by_current_bytes`:

```
mysql> SELECT event_name,current_alloc FROM sys.memory_global_by_current_bytes LIMIT 10;

+-----------------------------------------------------------------+---------------+
| event_name                                                      | current_alloc |
+-----------------------------------------------------------------+---------------+
| memory/innodb/memory                                            | 5.28 GiB      |
| memory/performance_schema/table_io_waits_summary_by_index_usage | 495.00 MiB    |
| memory/performance_schema/table_shares                          | 488.00 MiB    |
| memory/sql/TABLE_SHARE::mem_root                                | 388.95 MiB    |
| memory/innodb/std                                               | 226.88 MiB    |
| memory/innodb/fil0fil                                           | 198.49 MiB    |
| memory/sql/binlog_io_cache                                      | 128.00 MiB    |
| memory/innodb/mem0mem                                           | 96.82 MiB     |
| memory/innodb/dict0dict                                         | 96.76 MiB     |
| memory/performance_schema/rwlock_instances                      | 88.00 MiB     |
+-----------------------------------------------------------------+---------------+
10 rows in set (0.00 sec)
```

The results show that `memory/innodb/memory` is the top consumer, using 5.28 GiB of currently allocated memory. This event serves as a category for memory allocations across various InnoDB components not associated with more specific wait events, such as `memory/innodb/buf0buf` mentioned previously.

Having established that InnoDB components are the primary consumers of memory, we can dive deeper into the specifics using the following MySQL command:

```
SHOW ENGINE INNODB STATUS \G;
```

The [SHOW ENGINE INNODB STATUS](https://dev.mysql.com/doc/refman/8.4/en/show-engine.html) command provides a comprehensive status report for the InnoDB storage engine, including detailed memory usage statistics for different InnoDB components. It can help identify which specific InnoDB structures or operations are consuming the most memory. For more information, see [InnoDB in-memory structures](https://dev.mysql.com/doc/refman/8.0/en/innodb-in-memory-structures.html) in the MySQL documentation.

Analyzing the `BUFFER POOL AND MEMORY` section of the InnoDB status report, we see that 5,051,647,748 bytes (4.7 GiB) is allocated to the [dictionary object cache](https://dev.mysql.com/doc/refman/8.0/en/data-dictionary-object-cache.html), which accounts for 89% of the memory tracked by `memory/innodb/memory`.

```
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 0
Dictionary memory allocated 5051647748
Buffer pool size 170512
Free buffers 142568
Database pages 27944
Old database pages 10354
Modified db pages 6
Pending reads 0
```

The dictionary object cache is a shared global cache that stores previously accessed data dictionary objects in memory to enable object reuse and improve performance. The high memory allocation to the dictionary object cache suggests a large number of database objects in the data dictionary cache.

Now that we know that the data dictionary cache is a primary consumer, we proceed to inspect the data dictionary cache for open tables. To find the number of tables in the table definition cache, query the global status variable [open\$1table\$1definitions](https://dev.mysql.com/doc/refman/8.4/en/server-status-variables.html#statvar_Open_table_definitions).

```
mysql> show global status like 'open_table_definitions';

+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| Open_table_definitions | 20000 |
+------------------------+-------+
1 row in set (0.00 sec)
```

For more information, see [How MySQL opens and closes tables](https://dev.mysql.com/doc/refman/8.0/en/table-cache.html) in the MySQL documentation.

You can limit the number of table definitions in the data dictionary cache by limiting the `table_definition_cache` parameter in the DB cluster or DB instance parameter group. For Aurora MySQL, this value serves as a soft limit for the number of tables in the table definition cache. The default value is dependent on the instance class and is set to the following:

```
LEAST({DBInstanceClassMemory/393040}, 20000)
```

When the number of tables exceeds the `table_definition_cache` limit, a least recently used (LRU) mechanism evicts and remove tables from the cache. However, tables involved in foreign key relationships aren't placed in the LRU list, preventing their removal.

In our current scenario, we run [FLUSH TABLES](https://dev.mysql.com/doc/refman/8.4/en/flush.html) to clear the table definition cache. This action results in a significant drop in the [Open\$1table\$1definitions](https://dev.mysql.com/doc/refman/8.0/en/server-status-variables.html#statvar_Open_table_definitions) global status variable, from 20,000 to 12, as shown here:

```
mysql> show global status like 'open_table_definitions';

+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| Open_table_definitions | 12    |
+------------------------+-------+
1 row in set (0.00 sec)
```

Despite this reduction, we observe that the memory allocation for `memory/innodb/memory` remains high at 5.18 GiB, and the dictionary memory allocated also remains unchanged. This is evident from the following query results:

```
mysql> SELECT event_name,current_alloc FROM sys.memory_global_by_current_bytes LIMIT 10;

+-----------------------------------------------------------------+---------------+
| event_name                                                      | current_alloc |
+-----------------------------------------------------------------+---------------+
| memory/innodb/memory                                            | 5.18 GiB      |
| memory/performance_schema/table_io_waits_summary_by_index_usage | 495.00 MiB    |
| memory/performance_schema/table_shares                          | 488.00 MiB    |
| memory/sql/TABLE_SHARE::mem_root                                | 388.95 MiB    |
| memory/innodb/std                                               | 226.88 MiB    |
| memory/innodb/fil0fil                                           | 198.49 MiB    |
| memory/sql/binlog_io_cache                                      | 128.00 MiB    |
| memory/innodb/mem0mem                                           | 96.82 MiB     |
| memory/innodb/dict0dict                                         | 96.76 MiB     |
| memory/performance_schema/rwlock_instances                      | 88.00 MiB     |
+-----------------------------------------------------------------+---------------+
10 rows in set (0.00 sec)
```

```
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 0
Dictionary memory allocated 5001599639
Buffer pool size 170512
Free buffers 142568
Database pages 27944
Old database pages 10354
Modified db pages 6
Pending reads 0
```

This persistently high memory usage can be attributed to tables involved in foreign key relationships. These tables aren't placed in the LRU list for removal, explaining why the memory allocation remains high even after flushing the table definition cache.

To address this issue:

1. Review and optimize your database schema, particularly foreign key relationships.

1. Consider moving to a larger DB instance class that has more memory to accommodate your dictionary objects.

By following these steps and understanding the memory allocation patterns, you can better manage memory usage in your Aurora MySQL DB instance and prevent potential performance issues due to memory pressure.

# Troubleshooting out-of-memory issues for Aurora MySQL databases
<a name="AuroraMySQLOOM"></a>

The Aurora MySQL `aurora_oom_response` instance-level parameter can enable the DB instance to monitor the system memory and estimate the memory consumed by various statements and connections. If the system runs low on memory, it can perform a list of actions to attempt to release that memory. It does so in an attempt to avoid a database restart due to out-of-memory (OOM) issues. The instance-level parameter takes a string of comma-separated actions that a DB instance performs when its memory is low. The `aurora_oom_response` parameter is supported for Aurora MySQL versions 2 and 3.

The following values, and combinations of them, can be used for the `aurora_oom_response` parameter. An empty string means that no action is taken, and effectively turns off the feature, leaving the database prone to OOM restarts.
+ `decline` – Declines new queries when the DB instance is low on memory.
+ `kill_connect` – Closes database connections that are consuming a large amount of memory, and ends current transactions and Data Definition Language (DDL) statements. This response isn't supported for Aurora MySQL version 2.

  For more information, see [KILL statement](https://dev.mysql.com/doc/refman/8.0/en/kill.html) in the MySQL documentation.
+ `kill_query` – Ends queries in descending order of memory consumption until the instance memory surfaces above the low threshold. DDL statements aren't ended.

  For more information, see [KILL statement](https://dev.mysql.com/doc/refman/8.0/en/kill.html) in the MySQL documentation.
+ `print` – Only prints the queries that are consuming a large amount of memory.
+ `tune` – Tunes the internal table caches to release some memory back to the system. Aurora MySQL decreases the memory used for caches such as `table_open_cache` and `table_definition_cache` in low-memory conditions. Eventually, Aurora MySQL sets their memory usage back to normal when the system is no longer low on memory.

  For more information, see [table\$1open\$1cache](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_table_open_cache) and [table\$1definition\$1cache](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_table_definition_cache) in the MySQL documentation.
+ `tune_buffer_pool` – Decreases the size of the buffer pool to release some memory and make it available for the database server to process connections. This response is supported for Aurora MySQL version 3.06 and higher.

  You must pair `tune_buffer_pool` with either `kill_query` or `kill_connect` in the `aurora_oom_response` parameter value. If not, buffer pool resizing won't happen, even when you include `tune_buffer_pool` in the parameter value.

In Aurora MySQL versions lower than 3.06, for DB instance classes with memory less than or equal to 4 GiB, when the instance is under memory pressure, the default actions include `print`, `tune`, `decline`, and `kill_query`. For DB instance classes with memory greater than 4 GiB, the parameter value is empty by default (disabled).

In Aurora MySQL version 3.06 and higher, for DB instance classes with memory less than or equal to 4 GiB, Aurora MySQL also closes the top memory-consuming connections (`kill_connect`). For DB instance classes with memory greater than 4 GiB, the default parameter value is `print`.

In Aurora MySQL version 3.09 and higher, for DB instance classes with memory greater than 4 GiB, the default parameter value is `print,decline,kill_connect`.

If you frequently run into out-of-memory issues, memory usage can be monitored using [memory summary tables](https://dev.mysql.com/doc/refman/8.3/en/performance-schema-memory-summary-tables.html) when `performance_schema` is enabled.

For Amazon CloudWatch metrics related to OOM, see [Instance-level metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md#Aurora.AuroraMySQL.Monitoring.Metrics.instances). For global status variables related to OOM, see [Aurora MySQL global status variables](AuroraMySQL.Reference.GlobalStatusVars.md).

# Logging for Aurora MySQL databases
<a name="aurora-mysql-troubleshooting-logging"></a>

Aurora MySQL logs provide essential information about database activity and errors. By enabling these logs, you can identify and troubleshoot issues, understand database performance, and audit database activity. We recommend that you enable these logs for all of your Aurora MySQL DB instances to ensure optimal performance and availability of the databases. The following types of logging can be enabled. Each log contains specific information that can lead to uncovering impacts to database processing.
+ Error – Aurora MySQL writes to the error log only on startup, shutdown, and when it encounters errors. A DB instance can go hours or days without new entries being written to the error log. If you see no recent entries, it's because the server didn't encounter an error that would result in a log entry. Error logging is enabled by default. For more information, see [Aurora MySQL error logs](USER_LogAccess.MySQL.LogFileSize.md#USER_LogAccess.MySQL.Errorlog).
+ General – The general log provides detailed information about database activity, including all SQL statements executed by the database engine. For more information on enabling general logging and setting logging parameters, see [Aurora MySQL slow query and general logs](USER_LogAccess.MySQL.LogFileSize.md#USER_LogAccess.MySQL.Generallog), and [The general query log](https://dev.mysql.com/doc/refman/8.0/en/query-log.html) in the MySQL documentation.
**Note**  
General logs can grow to be very large and consume your storage. For more information, see [Log rotation and retention for Aurora MySQL](USER_LogAccess.MySQL.LogFileSize.md#USER_LogAccess.AMS.LogFileSize.retention).
+ Slow query – The slow query log consists of SQL statements that take more than [long\$1query\$1time](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_long_query_time) seconds to run and require at least [min\$1examined\$1row\$1limit](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_min_examined_row_limit) rows to be examined. You can use the slow query log to find queries that take a long time to run and are therefore candidates for optimization.

  The default value for `long_query_time` is 10 seconds. We recommend that you start with a high value to identify the slowest queries, then work your way down for fine tuning.

  You can also use related parameters, such as `log_slow_admin_statements` and `log_queries_not_using_indexes`. Compare `rows_examined` with `rows_returned`. If `rows_examined` is much greater than `rows_returned`, then those queries can potentially be blocking.

  In Aurora MySQL version 3, you can enable `log_slow_extra` to obtain more details. For more information, see [Slow query log contents](https://dev.mysql.com/doc/refman/8.0/en/slow-query-log.html#slow-query-log-contents) in the MySQL documentation. You can also modify `long_query_time` at the session level for debugging query execution interactively, which is especially useful if `log_slow_extra` is enabled globally.

  For more information on enabling slow query logging and setting logging parameters, see [Aurora MySQL slow query and general logs](USER_LogAccess.MySQL.LogFileSize.md#USER_LogAccess.MySQL.Generallog), and [The slow query log](https://dev.mysql.com/doc/refman/8.0/en/slow-query-log.html) in the MySQL documentation.
+ Audit – The audit log monitors and logs database activity. Audit logging for Aurora MySQL is called Advanced Auditing. To enable Advanced Auditing, you set certain DB cluster parameters. For more information, see [Using Advanced Auditing with an Amazon Aurora MySQL DB cluster](AuroraMySQL.Auditing.md).
+ Binary – The binary log (binlog) contains events that describe database changes, such as table creation operations and changes to table data. It also contains events for statements that potentially could have made changes (for example, a [DELETE](https://dev.mysql.com/doc/refman/8.0/en/delete.html) that matched no rows), unless row-based logging is used. The binary log also contains information about how long each statement took that updated data.

  Running a server with binary logging enabled makes performance slightly slower. However, the benefits of the binary log in enabling you to set up replication and for restore operations generally outweigh this minor performance decrease.
**Note**  
Aurora MySQL doesn't require binary logging for restore operations.

  For more information on enabling binary logging and setting the binlog format, see [Configuring Aurora MySQL binary logging for Single-AZ databases](USER_LogAccess.MySQL.BinaryFormat.md), and [The binary log](https://dev.mysql.com/doc/refman/8.0/en/binary-log.html) in the MySQL documentation.

You can publish the error, general, slow, query, and audit logs to Amazon CloudWatch Logs. For more information, see [Publishing database logs to Amazon CloudWatch Logs](USER_LogAccess.Procedural.UploadtoCloudWatch.md).

Another useful tool for summarizing slow, general, and binary log files is [pt-query-digest](https://docs.percona.com/percona-toolkit/pt-query-digest.html).

# Troubleshooting connection issues for Aurora MySQL databases
<a name="mysql-troubleshooting-dbconn"></a>

Ensuring reliable connectivity between your applications and your RDS DB instance is crucial for the smooth operation of your workloads. However, connectivity issues can arise because of various factors, such as network configurations, authentication problems, or resource constraints. This guide aims to provide a comprehensive approach to troubleshooting connectivity issues with Aurora MySQL.

**Contents**
+ [

## Identifying database connectivity issues for Aurora MySQL
](#mysql-dbconn-identify)
+ [

## Gathering data on connectivity issues for Aurora MySQL
](#mysql-dbconn-gather)
+ [

## Monitoring database connections for Aurora MySQL
](#mysql-dbconn-monitor)
  + [

### Additional monitoring for Aurora MySQL
](#mysql-dbconn-monitor-ams)
+ [

## Connectivity error codes for Aurora MySQL
](#mysql-dbconn-errors)
+ [

## Parameter tuning recommendations for Aurora MySQL
](#mysql-dbconn-params)
+ [

## Examples of troubleshooting database connection issues for Aurora MySQL
](#mysql-dbconn-examples)
  + [

### Example 1: Troubleshooting failed connection attempts
](#mysql-dbconn-example1)
  + [

### Example 2: Troubleshooting abnormal client disconnects
](#mysql-dbconn-example2)
  + [

### Example 3: Troubleshooting IAM failed connection attempts
](#mysql-dbconn-example3)

## Identifying database connectivity issues for Aurora MySQL
<a name="mysql-dbconn-identify"></a>

Identifying the specific category of the connectivity issue can help narrow down the potential causes and guide the troubleshooting process. Each category might require different approaches and techniques for diagnosis and resolution. Database connectivity issues can broadly be classified into the following categories.

**Connection errors and exceptions**  
Connection errors and exceptions can occur for various reasons, such as incorrect connection strings, authentication failures, network disruptions, or database server issues. Causes can include misconfigured connection parameters, invalid credentials, network outages, or database server crashes or restarts. Misconfigured security groups, virtual private cloud (VPC) settings, network Access Control Lists (ACLs), and route tables associated with subnets can also lead to connection issues.

**Connection limit reached**  
This issue arises when the number of concurrent connections to the database server exceeds the maximum allowed limit. Database servers typically have a configurable maximum connection limit defined by the parameter max\$1connections in the clusters and instance parameter groups. By imposing a connection limit, the database server ensures that it has sufficient resources (for example, memory, CPU, and file handles) to handle the existing connections efficiently and provide acceptable performance. Causes can include connection leaks in the application, inefficient connection pooling, or an unexpected surge in connection requests.

**Connection timeouts**  
Connection timeouts occur when the client application is unable to establish a connection with the database server within a specified timeout period. Common causes include network issues, server overload, firewall rules, and misconfigured connection settings.

**Idle connection timeouts**  
Idle connections that remain inactive for a prolonged period might be closed automatically by the database server to conserve resources. This timeout is typically configurable using the `wait_timeout` and `interactive_timeout parameters`, and should be adjusted based on the application's connection usage patterns. Causes can include application logic that leaves connections idle for extended periods, or improper connection management.

**Intermittent disconnect of existing connections**  
This class of errors refers to a scenario where established connections between a client application and the database are unexpectedly terminated or disconnected at irregular intervals, despite being active and in use. These disconnections occur intermittently, meaning they happen at irregular intervals and not consistently. The causes can include the following:  
+ Database server issues such as restarts or failovers
+ Improper application connection handling
+ Load balancing and proxy issues
+ Network instability
+ Problems with third-party components or middleware involved in the connection path
+ Query execution timeouts
+ Resource constraints on the server or client side
Identifying the root cause through comprehensive monitoring, logging, and analysis is crucial, while implementing proper error handling, connection pooling, and retry mechanisms can help mitigate the impact of these intermittent disconnections on the application's functionality and user experience.

## Gathering data on connectivity issues for Aurora MySQL
<a name="mysql-dbconn-gather"></a>

Gathering comprehensive data related to the application, database, network, and infrastructure components is crucial for effectively troubleshooting connectivity issues between an application and an Aurora MySQL database. By collecting relevant logs, configurations, and diagnostic information, you gain valuable insights that can help identify the root cause of the connectivity problems and guide you towards an appropriate resolution.

Network logs and configurations, such as security group rules, VPC settings, and route tables, are essential for identifying potential network-related bottlenecks or misconfigurations that could be preventing the application from establishing a successful connection with the database. By analyzing these network components, you can make sure that the necessary ports are open, IP addresses are allowed, and routing configurations are set up correctly.

**Timestamps**  
Record the exact timestamps when the connectivity issues occur. This can help identify patterns or correlate the issues with other events or activities.

**DB engine logs**  
In addition to the general database logs, review the database engine logs (for example, the MySQL error log and slow query log) for any relevant information or errors that might be related to the intermittent connectivity issues. For more information, see [Logging for Aurora MySQL databases](aurora-mysql-troubleshooting-logging.md).

**Client application logs**  
Collect detailed logs from the client applications that connect to the database. Application logs provide visibility into the connection attempts, errors, and any relevant information from the application's perspective, which can reveal issues related to connection strings, authentication credentials, or application-level connection handling.  
Database logs, on the other hand, offer insights into database-side errors, slow queries, or events that might be contributing to the connectivity issues. For more information, see [Logging for Aurora MySQL databases](aurora-mysql-troubleshooting-logging.md).

**Client environment variables**  
Check whether any environment variables or configuration settings on the client side might be affecting the database connection, such as proxy settings, SSL/TLS settings, or any other relevant variables.

**Client library versions**  
Make sure that the client is using the latest versions of any database drivers, libraries, or frameworks used for database connectivity. Outdated versions can have known issues or compatibility problems.

**Client network capture**  
Perform a network capture on the client side using a tool such as Wireshark or `tcpdump` during the times when connectivity issues occur. This can help identify any network-related issues or anomalies on the client side.

**Client network topology**  
Understand the client's network topology, including any firewalls, load balancers, or other components such as RDS Proxy or Proxy SQL that are making connections to the database instead of the client directly making connections.

**Client operating system settings**  
Check the client's operating system settings that might affect network connectivity, such as firewall rules, network adapter settings, and any other relevant settings.

**Connection pooling configuration**  
If you're using a connection pooling mechanism in your application, review the configuration settings and monitor the pool metrics (for example, active connections, idle connections, and connection timeouts) to ensure that the pool is functioning correctly. Also review the pool settings, such as the maximum pool size, minimum pool size, and connection validation settings, to ensure that they are configured correctly.

**Connection string**  
The connection string typically includes parameters such as the hostname or endpoint, port number, database name, and authentication credentials. Analyzing the connection string can help identify potential misconfigurations or incorrect settings that may be causing connectivity problems. For example, an incorrect hostname or port number can prevent the client from reaching the database instance, while invalid authentication credentials can lead to authentication failures and connection rejections. Additionally, the connection string can reveal issues related to connection pooling, timeouts, or other connection-specific settings that could contribute to connectivity issues. Providing the complete connection string used by the client application can help pinpoint any misconfigurations on the client.

**Database metrics**  
Monitor database metrics such as CPU usage, memory usage, and disk I/O during the times when connectivity issues occur. These can help identify whether the DB instance is experiencing resource contention or performance issues.

**DB engine version**  
Note the Aurora MySQL DB engine version. AWS regularly releases updates addressing known issues, security vulnerabilities, and introducing performance enhancements. Therefore, we highly recommend that you upgrade to the latest available versions, as these updates often include bug fixes and improvements specifically related to connectivity, performance, and stability. Providing the database version information, along with the other collected details, can assist Support in effectively diagnosing and resolving connectivity issues.

**Network metrics**  
Collect network metrics such as latency, packet loss, and throughput during the times when connectivity issues occur. Tools such as `ping`, `traceroute`, and network monitoring tools can help gather this data.

**Source and client details**  
Determine the IP addresses of the application servers, load balancers, or any other components that are initiating the database connections. This could be a single IP address or a range of IP addresses (CIDR notation). If the source is an Amazon EC2 instance, it also helps to review the instance type, Availability Zone, subnet ID , and security groups associated with the instance, and network interface details such as private IP address and public IP address.

By thoroughly analyzing the gathered data, you can identify misconfigurations, resource constraints, network disruptions, or other underlying issues that are causing the intermittent or persistent connectivity problems. This information allows you to take targeted actions, such as adjusting configurations, resolving network issues, or addressing application-level connection handling.

## Monitoring database connections for Aurora MySQL
<a name="mysql-dbconn-monitor"></a>

To monitor and troubleshoot connectivity issues, you can use the following metrics and features.

**CloudWatch metrics**  
+ `CPUUtilization` – High CPU usage on the DB instance can lead to slow query execution, which can result in connection timeouts or rejections.
+ `DatabaseConnections` – Monitor the number of active connections to the DB instance. A high number of connections close to the configured maximum can indicate potential connectivity issues or connection pool exhaustion.
+ `FreeableMemory` – Low available memory can cause performance issues and connectivity problems because of resource constraints.
+ `NetworkReceiveThroughput` and `NetworkTransmitThroughput` – Unusual spikes or drops in network throughput can indicate connectivity issues or network bottlenecks.

**Performance Insights metrics**  
To troubleshoot connectivity issues in Aurora MySQL using Performance Insights, analyze Database metrics such as the following:  
+ Aborted\$1clients
+ Aborted\$1connects
+ Connections
+ max\$1connections
+ Threads\$1connected
+ Threads\$1created
+ Threads\$1running
These metrics can help you to identify connection bottlenecks, detect network or authentication problems, optimize connection pooling, and ensure efficient thread management. For more information, see [Performance Insights counters for Aurora MySQL](USER_PerfInsights_Counters.md#USER_PerfInsights_Counters.Aurora_MySQL).

**Performance Insights features**  
+ **Database Load** – Visualize the database load over time and correlate it with connectivity issues or performance degradation.
+ **SQL Statistics** – Analyze SQL statistics to identify inefficient queries or database operations that might contribute to connectivity problems.
+ **Top Queries** – Identify and analyze the most resource-intensive queries, which can help identify potential performance bottlenecks or long-running queries that may be causing connectivity issues.

By monitoring these metrics and leveraging Performance Insights, you can gain visibility into the database instance's performance, resource usage, and potential bottlenecks that might be causing connectivity issues. For example:
+ High `DatabaseConnections` close to the maximum limit can indicate connection pool exhaustion or improper connection handling, leading to connectivity problems.
+ High `CPUUtilization` or low `FreeableMemory` can indicate resource constraints, which may cause slow query execution and connection timeouts or rejections.
+ Analyzing the **Top Queries** and **SQL Statistics** can help identify inefficient or resource-intensive queries that may be contributing to connectivity issues.

Additionally, monitoring CloudWatch Logs and setting up alarms can help you proactively identify and respond to connectivity problems before they escalate.

It's important to note that while these metrics and tools can provide valuable insights, they should be used in conjunction with other troubleshooting steps. By also reviewing network configurations, security group rules, and application-level connection handling, you can comprehensively diagnose and resolve connectivity issues with Aurora MySQL DB instances.

### Additional monitoring for Aurora MySQL
<a name="mysql-dbconn-monitor-ams"></a>

**CloudWatch metrics**  
+ `AbortedClients` – Tracks the number of client connections that have not been closed properly.
+ `AuroraSlowConnectionHandleCount` – Tracks the number of slow connection handle operations, indicating potential connectivity issues or performance bottlenecks.
+ `AuroraSlowHandshakeCount` – Measures the number of slow handshake operations, which can also be an indicator of connectivity problems.
+ `ConnectionAttempts` – Measures the number of connection attempts made to the Aurora MySQL DB instance.

**Global status variables**  
`Aurora_external_connection_count` – Shows the number of database connections to the DB instance, excluding RDS service connections used for database health checks.

By monitoring these metrics and global status variables you can gain visibility into the connection patterns, errors, and potential bottlenecks that may be causing connectivity issues with your Amazon Aurora MySQL instance.

For example, a high number of `AbortedClients` or `AuroraSlowConnectionHandleCount` can indicate connectivity problems.

Additionally, setting up CloudWatch alarms and notifications can help you proactively identify and respond to connectivity issues before they escalate and impact your application's performance.

## Connectivity error codes for Aurora MySQL
<a name="mysql-dbconn-errors"></a>

The following are some common connectivity errors for Aurora MySQL databases, along with their error codes and explanations.

**Error Code 1040: Too many connections**  
This error occurs when the client tries to establish more connections than the maximum allowed by the database server. Possible causes include the following:  
+ Connection pooling misconfiguration – If using a connection pooling mechanism, ensure that the maximum pool size is not set too high, and that connections are being properly released back to the pool.
+ Database instance configuration – Verify the maximum allowed connections setting for the database instance and adjust it if necessary by setting the `max_connections` parameter.
+ High concurrency – If multiple clients or applications are connecting to the database simultaneously, the maximum allowed connections limit may be reached.

**Error Code 1045: Access denied for user '...'@'...' (using password: YES/NO)**  
This error indicates an authentication failure when attempting to connect to the database. Possible causes include the following:  
+ Authentication plugin compatibility – Check whether the authentication plugin used by the client is compatible with the database server's authentication mechanism.
+ Incorrect username or password – Verify that the correct username and password are being used in the connection string or authentication mechanism.
+ User permissions – Make sure that the user has the necessary permissions to connect to the database instance from the specified host or network.

**Error Code 1049: Unknown database '...'**  
This error indicates that the client is attempting to connect to a database that does not exist on the server. Possible causes include the following:  
+ Database not created – Make sure that the specified database has been created on the database server.
+ Incorrect database name – Double-check the database name used in the connection string or query for accuracy.
+ User permissions – Verify that the user has the necessary permissions to access the specified database.

**Error Code 1153: Got a packet bigger than 'max\$1allowed\$1packet' bytes**  
This error occurs when the client attempts to send or receive data that exceeds the maximum packet size allowed by the database server. Possible causes include the following:  
+ Large queries or result sets – If executing queries that involve large amounts of data, the packet size limit may be exceeded.
+ Misconfigured packet size settings – Check the `max_allowed_packet` setting on the database server and adjust it if necessary.
+ Network configuration issues – Make sure that the network configuration (for example, MTU size) allows for the required packet sizes.

**Error Code 1226: User '...' has exceeded the 'max\$1user\$1connections' resource (current value: ...)**  
This error indicates that the user has exceeded the maximum number of concurrent connections allowed by the database server. Possible causes include the following::  
+ Connection pooling misconfiguration – If using a connection pooling mechanism, ensure that the maximum pool size is not set too high for the user's connection limit.
+ Database instance configuration – Verify the `max_user_connections` setting for the database instance and adjust it if necessary.
+ High concurrency – If multiple clients or applications are connecting to the database simultaneously using the same user, the user-specific connection limit may be reached.

**Error Code 2003: Can't connect to MySQL server on '...' (10061)**  
This error typically occurs when the client is unable to establish a TCP/IP connection with the database server. It can be caused by various issues, such as the following:  
+ Database instance status – Make sure that the database instance is in the `available` state, and not undergoing any maintenance or backup operations.
+ Firewall rules – Check whether any firewalls (operating system, network, or security group) are blocking the connection on the specified port (usually 3306 for MySQL).
+ Incorrect hostname or endpoint – Make sure that the hostname or endpoint used in the connection string is correct and matches the database instance.
+ Network connectivity issues – Verify that the client machine can reach the database instance over the network. Check for any network outages, routing issues, or VPC or subnet misconfigurations.

**Error Code 2005: Unknown MySQL server host '...' (11001)**  
This error occurs when the client is unable to resolve the hostname or endpoint of the database server to an IP address. Possible causes include the following:  
+ DNS resolution issues – Verify that the client machine can resolve the hostname correctly using DNS. Check the DNS settings, DNS cache, and try using the IP address instead of the hostname.
+ Incorrect hostname or endpoint – Double-check the hostname or endpoint used in the connection string for accuracy.
+ Network configuration issues – Make sure that the client's network configuration (for example, VPC, subnet, and route tables) allows DNS resolution and connectivity to the database instance.

**Error Code 2026: SSL connection error**  
This error occurs when there is an issue with the SSL/TLS configuration or certificate validation during the connection attempt. Possible causes include the following:  
+ Certificate expiration – Check whether the SSL/TLS certificate used by the server has expired and needs to be renewed.
+ Certificate validation issues – Verify that the client is able to validate the server's SSL/TLS certificate correctly, and that the certificate is trusted.
+ Network configuration issues – Make sure that the network configuration allows for SSL/TLS connections and doesn't block or interfere with the SSL/TLS handshake process.
+ SSL/TLS configuration mismatch – Make sure that the SSL/TLS settings (for example, cipher suites and protocol versions) on the client and server are compatible.

By understanding the detailed explanations and potential causes for each error code, you can better troubleshoot and resolve connectivity issues when working with Aurora MySQL databases.

## Parameter tuning recommendations for Aurora MySQL
<a name="mysql-dbconn-params"></a>

**Maximum connections**  
Adjusting these parameters can help prevent connection issues caused by reaching the maximum allowed connections limit. Make sure that these values are set appropriately based on your application's concurrency requirements and resource constraints.  
+ `max_connections` – This parameter specifies the maximum number of concurrent connections allowed to the DB instance.
+ `max_user_connections` – This parameter can be specified during user creation and modification, and sets the maximum number of concurrent connections allowed for a specific user account.

**Network buffer size**  
Increasing these values can improve network performance, especially for workloads involving large data transfers or result sets. However, be cautious as larger buffer sizes can consume more memory.  
+ `net_buffer_length` – This parameter sets the initial size for client connection and result buffers, balancing memory usage with query performance.
+ `max_allowed_packet` – This parameter specifies the maximum size of a single network packet that can be sent or received by the DB instance.

**Network compression (client side)**  
Enabling network compression can reduce network bandwidth usage, but it can increase CPU overhead on both the client and server sides.  
+ `compress` – This parameter enables or disables network compression for client/server communication.
+ `compress_protocol` – This parameter specifies the compression protocol to use for network communication.

**Network performance tuning**  
Adjusting these timeouts can help manage idle connections and prevent resource exhaustion, but be cautious as low values can cause premature connection terminations.  
+ `interactive_timeout` – This parameter specifies the number of seconds the server waits for activity on an interactive connection before closing it.
+ `wait_timeout` – This parameter determines the number of seconds the server waits for activity on a noninteractive connection before closing it.

**Network timeout settings**  
Adjusting these timeouts can help address issues related to slow or unresponsive connections. But be careful not to set them too low, as it can cause premature connection failures.  
+ `net_read_timeout` – This parameter specifies the number of seconds to wait for more data from a connection before ending the read operation.
+ `net_write_timeout` – This parameter determines the number of seconds to wait for a block to be written to a connection before ending the write operation.

## Examples of troubleshooting database connection issues for Aurora MySQL
<a name="mysql-dbconn-examples"></a>

The following examples show how to identify and troubleshoot database connection issues for Aurora MySQL.

### Example 1: Troubleshooting failed connection attempts
<a name="mysql-dbconn-example1"></a>

Connection attempts can fail for several reasons, including authentication failures, SSL/TLS handshake failures, `max_connections` limit reached, and resource constraints on the DB instance.

You can track the number of failed connections either from Performance Insights or by using the following command.

```
mysql> show global status like 'aborted_connects';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Aborted_connects | 7     |
+------------------+-------+
1 row in set (0.00 sec)
```

If the number of `Aborted_connects` increases over time, then the application could be having intermittent connectivity issues.

You can use [Aurora Advanced Auditing](AuroraMySQL.Auditing.md) to log the connects and disconnects from the client connections. You can do this by setting the following parameters in the DB cluster parameter group:
+ `server_audit_logging` = `1`
+ `server_audit_events` = `CONNECT`

 The following is an extract from the audit logs for a failed login.

```
1728498527380921,auora-mysql-node1,user_1,172.31.49.222,147189,0,FAILED_CONNECT,,,1045
1728498527380940,auora-mysql-node1,user_1,172.31.49.222,147189,0,DISCONNECT,,,0
```

Where:
+ `1728498527380921` – The epoch timestamp of when the failed login occurred
+ `aurora-mysql-node1` – The instance identifier of the node of the Aurora MySQL cluster on which the connection failed
+ `user_1` – The name of the database user for which the login failed
+ `172.31.49.222` – The private IP address of the client from which the connection was established
+ `147189` – The connection ID of the failed login
+ `FAILED_CONNECT` – Indicates that the connection failed.
+ `1045` – The return code. A nonzero value indicates an error. In this case, `1045` corresponds to access denied.

For more information, see [Server error codes](https://dev.mysql.com/doc/mysql-errors/5.7/en/server-error-reference.html) and [Client error codes](https://dev.mysql.com/doc/mysql-errors/5.7/en/client-error-reference.html) in the MySQL documentation.

 You can also examine the Aurora MySQL error logs for any related error messages, for example:

```
2024-10-09T19:26:59.310443Z 220 [Note] [MY-010926] [Server] Access denied for user 'user_1'@'172.31.49.222' (using password: YES) (sql_authentication.cc:1502)
```

### Example 2: Troubleshooting abnormal client disconnects
<a name="mysql-dbconn-example2"></a>

You can track the number of abnormal client disconnects either from Performance Insights or by using the following command.

```
mysql> show global status like 'aborted_clients';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Aborted_clients | 9     |
+-----------------+-------+
1 row in set (0.01 sec)
```

If the number of `Aborted_clients` increases over time, then the application isn't closing the connections to the database correctly. If connections aren't closed properly, it can lead to resource leaks and potential performance issues. Leaving connections open unnecessarily can consume system resources, such as memory and file descriptors, which can eventually cause the application or server to become unresponsive or restart.

You can use the following query to identify accounts that aren't closing connections properly. It retrieves the user account name, the host from which the user is connecting, the number of connections not closed, and the percentage of connections not closed.

```
SELECT
    ess.user,
    ess.host,
    (a.total_connections - a.current_connections) - ess.count_star AS not_closed,
    (((a.total_connections - a.current_connections) - ess.count_star) * 100) / (a.total_connections - a.current_connections) AS pct_not_closed
FROM
    performance_schema.events_statements_summary_by_account_by_event_name AS ess
    JOIN performance_schema.accounts AS a ON (ess.user = a.user AND ess.host = a.host)
WHERE
    ess.event_name = 'statement/com/quit'
    AND (a.total_connections - a.current_connections) > ess.count_star;

+----------+---------------+------------+----------------+
| user     | host          | not_closed | pct_not_closed |
+----------+---------------+------------+----------------+
| user1    | 172.31.49.222 |          1 |        33.3333 |
| user1    | 172.31.93.250 |       1024 |        12.1021 |
| user2    | 172.31.93.250 |         10 |        12.8551 |
+----------+---------------+------------+----------------+
3 rows in set (0.00 sec)
```

After you identify the user accounts and hosts from which the connections aren't closed, you can proceed to check the code that isn't closing the connections gracefully. 

For example, with the MySQL connector in Python, use the `close()` method of the connection object to close connections. Here's an example function that establishes a connection to a database, performs a query, and closes the connection:

```
import mysql.connector

def execute_query(query):
    # Establish a connection to the database
    connection = mysql.connector.connect(
        host="your_host",
        user="your_username",
        password="your_password",
        database="your_database"
    )

    try:
        # Create a cursor object
        cursor = connection.cursor()

        # Execute the query
        cursor.execute(query)

        # Fetch and process the results
        results = cursor.fetchall()
        for row in results:
            print(row)

    finally:
        # Close the cursor and connection
        cursor.close()
        connection.close()
```

In this example, the `connection.close()` method is called in the `finally` block to make sure that the connection is closed, whether or not an exception occurs.

### Example 3: Troubleshooting IAM failed connection attempts
<a name="mysql-dbconn-example3"></a>

Connectivity with AWS Identity and Access Management (IAM) users can fail for several reasons, such as:
+ Incorrect IAM policy configuration
+ Expired security credentials
+ Network connectivity issues
+ Database permission mismatches

To troubleshoot these authentication errors, enable the `iam-db-auth-error` log export feature in your Amazon Relational Database Service (RDS) or Aurora database. This will allow you to view detailed authentication error messages in CloudWatch Log group for your Amazon RDS or Amazon Aurora cluster.

Once enabled, you can review these logs to identify and resolve the specific cause of your IAM authentication failures.

For example:

```
2025-09-22T12:02:30,806 [ERROR] Failed to authorize the connection request for user 'user_1' due to an internal IAM DB Auth error. (Status Code: 500, Error Code: InternalError)
```

and

```
2025-09-22T12:02:51,954 [ERROR] Failed to authenticate the connection request for user 'user_2' because the provided token is malformed or otherwise invalid. (Status Code: 400, Error Code: InvalidToken)
```

For troubleshooting guidance, refer to the [Aurora](UsingWithRDS.IAMDBAuth.Troubleshooting.md) troubleshooting guide for IAM DB authentication.

# Troubleshooting query performance for Aurora MySQL databases
<a name="aurora-mysql-troubleshooting-query"></a>

MySQL provides [query optimizer control](https://dev.mysql.com/doc/refman/8.0/en/controlling-optimizer.html) through system variables that affect how query plans are evaluated, switchable optimizations, optimizer and index hints, and the optimizer cost model. These data points can be helpful not only while comparing different MySQL environments, but also to compare previous query execution plans with current execution plans, and to understand the overall execution of a MySQL query at any point.

Query performance depends on many factors, including the execution plan, table schema and size, statistics, resources, indexes, and parameter configuration. Query tuning requires identifying bottlenecks and optimizing the execution path.
+ Find the execution plan for the query and check whether the query is using appropriate indexes. You can optimize your query by using `EXPLAIN` and reviewing the details of each plan.
+ Aurora MySQL version 3 (compatible with MySQL 8.0 Community Edition) uses an `EXPLAIN ANALYZE` statement. The `EXPLAIN ANALYZE` statement is a profiling tool that shows where MySQL spends time on your query and why. With `EXPLAIN ANALYZE`, Aurora MySQL plans, prepares, and runs the query while counting rows and measuring the time spent at various points of the execution plan. When the query completes, `EXPLAIN ANALYZE` prints the plan and its measurements instead of the query result.
+ Keep your schema statistics updated by using the `ANALYZE` statement. The query optimizer can sometimes choose poor execution plans because of outdated statistics. This can lead to poor performance of a query because of inaccurate cardinality estimates of both tables and indexes. The `last_update` column of the [innodb\$1table\$1stats](https://dev.mysql.com/doc/refman/8.0/en/innodb-persistent-stats.html#innodb-persistent-stats-tables) table shows the last time your schema statistics were updated, which is a good indicator of "staleness."
+ Other issues can occur, such as distribution skew of data, that aren't taken into account for table cardinality. For more information, see [Estimating ANALYZE TABLE complexity for InnoDB tables](https://dev.mysql.com/doc/refman/8.0/en/innodb-analyze-table-complexity.html) and [Histogram statistics in MySQL](https://dev.mysql.com/blog-archive/histogram-statistics-in-mysql/) in the MySQL documentation.

## Understanding the time spent by queries
<a name="ams-query-time"></a>

The following are ways to determine the time spent by queries:
+ [Profiling](https://dev.mysql.com/doc/refman/8.0/en/show-profile.html)
+ [Performance Schema](https://dev.mysql.com/doc/refman/8.0/en/performance-schema.html)
+ [Query optimizer](https://dev.mysql.com/doc/refman/8.0/en/controlling-optimizer.html)

**Profiling**  
By default, profiling is disabled. Enable profiling, then run the slow query and review its profile.  

```
SET profiling = 1;
Run your query.
SHOW PROFILE;
```

1. Identify the stage where the most time is spent. According to [General thread states](https://dev.mysql.com/doc/refman/8.0/en/general-thread-states.html) in the MySQL documentation, reading and processing rows for a `SELECT` statement is often the longest-running state over the lifetime of a given query. You can use the `EXPLAIN` statement to understand how MySQL runs this query.

1. Review the slow query log to evaluate `rows_examined` and `rows_sent` to make sure that the workload is similar in each environment. For more information, see [Logging for Aurora MySQL databases](aurora-mysql-troubleshooting-logging.md).

1. Run the following command for tables that are part of the identified query:

   ```
   SHOW TABLE STATUS\G;
   ```

1. Capture the following outputs before and after running the query on each environment:

   ```
   SHOW GLOBAL STATUS;
   ```

1. Run the following commands on each environment to see if there are any other query/session influencing the performance of this sample query.

   ```
   SHOW FULL PROCESSLIST;
   
   SHOW ENGINE INNODB STATUS\G;
   ```

   Sometimes, when resources on the server are busy, it impacts every other operation on the server, including queries. You can also capture information periodically when queries are run or set up a `cron` job to capture information at useful intervals.

**Performance Schema**  
The Performance Schema provides useful information about server runtime performance, while having minimal impact on that performance. This is different from the `information_schema`, which provides schema information about the DB instance. For more information, see [Overview of the Performance Schema for Performance Insights on Aurora MySQL](USER_PerfInsights.EnableMySQL.md).

**Query optimizer trace**  
To understand why a particular [query plan was chosen for execution](https://dev.mysql.com/doc/refman/8.0/en/execution-plan-information.html), you can set up `optimizer_trace` to access the MySQL query optimizer.  
Run an optimizer trace to show extensive information on all the paths available to the optimizer and its choice.  

```
SET SESSION OPTIMIZER_TRACE="enabled=on"; 
SET optimizer_trace_offset=-5, optimizer_trace_limit=5;

-- Run your query.
SELECT * FROM table WHERE x = 1 AND y = 'A';

-- After the query completes:
SELECT * FROM information_schema.OPTIMIZER_TRACE;
SET SESSION OPTIMIZER_TRACE="enabled=off";
```

## Reviewing query optimizer settings
<a name="ams-query-parameters"></a>

Aurora MySQL version 3 (compatible with MySQL 8.0 Community Edition) has many optimizer-related changes compared with Aurora MySQL version 2 (compatible with MySQL 5.7 Community Edition). If you have some custom values for the `optimizer_switch`, we recommend that you review the differences in the defaults and set `optimizer_switch` values that work best for your workload. We also recommend that you test the options available for Aurora MySQL version 3 to examine how your queries perform.

**Note**  
Aurora MySQL version 3 uses the community default value of 20 for the [innodb\$1stats\$1persistent\$1sample\$1pages](https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html#sysvar_innodb_stats_persistent_sample_pages) parameter.

You can use the following command to show the `optimizer_switch` values:

```
SELECT @@optimizer_switch\G;
```

The following table shows the default `optimizer_switch` values for Aurora MySQL versions 2 and 3.


| Setting | Aurora MySQL version 2 | Aurora MySQL version 3 | 
| --- | --- | --- | 
| batched\$1key\$1access | off | off | 
| block\$1nested\$1loop | on | on | 
| condition\$1fanout\$1filter | on | on | 
| derived\$1condition\$1pushdown | – | on | 
| derived\$1merge | on | on | 
| duplicateweedout | on | on | 
| engine\$1condition\$1pushdown | on | on | 
| firstmatch | on | on | 
| hash\$1join | off | on | 
| hash\$1join\$1cost\$1based | on | – | 
| hypergraph\$1optimizer | – | off | 
| index\$1condition\$1pushdown | on | on | 
| index\$1merge | on | on | 
| index\$1merge\$1intersection | on | on | 
| index\$1merge\$1sort\$1union | on | on | 
| index\$1merge\$1union | on | on | 
| loosescan | on | on | 
| materialization | on | on | 
| mrr | on | on | 
| mrr\$1cost\$1based | on | on | 
| prefer\$1ordering\$1index | on | on | 
| semijoin | on | on | 
| skip\$1scan | – | on | 
| subquery\$1materialization\$1cost\$1based | on | on | 
| subquery\$1to\$1derived | – | off | 
| use\$1index\$1extensions | on | on | 
| use\$1invisible\$1indexes | – | off | 

For more information, see [Switchable optimizations (MySQL 5.7)](https://dev.mysql.com/doc/refman/5.7/en/switchable-optimizations.html) and [Switchable optimizations (MySQL 8.0)](https://dev.mysql.com/doc/refman/8.0/en/switchable-optimizations.html) in the MySQL documentation.