

# Tuning Aurora MySQL


Wait events and thread states are important tuning tools for Aurora MySQL. If you can find out why sessions are waiting for resources and what they are doing, you are better able to reduce bottlenecks. You can use the information in this section to find possible causes and corrective actions.

Amazon DevOps Guru for RDS can proactively determine whether your Aurora MySQL databases are experiencing problematic conditions that might cause bigger problems later. Amazon DevOps Guru for RDS publishes an explanation and recommendations for corrective actions in a proactive insight. This section contains insights for common problems.

**Important**  
The wait events and thread states in this section are specific to Aurora MySQL. Use the information in this section to tune only Amazon Aurora, not Amazon RDS for MySQL.  
Some wait events in this section have no analogs in the open source versions of these database engines. Other wait events have the same names as events in open source engines, but behave differently. For example, Amazon Aurora storage works differently from open source storage, so storage-related wait events indicate different resource conditions.

**Topics**
+ [Essential concepts for Aurora MySQL tuning](AuroraMySQL.Managing.Tuning.concepts.md)
+ [Tuning Aurora MySQL with wait events](AuroraMySQL.Managing.Tuning.wait-events.md)
+ [Tuning Aurora MySQL with thread states](AuroraMySQL.Managing.Tuning.thread-states.md)
+ [Tuning Aurora MySQL with Amazon DevOps Guru proactive insights](MySQL.Tuning.proactive-insights.md)

# Essential concepts for Aurora MySQL tuning


Before you tune your Aurora MySQL database, make sure to learn what wait events and thread states are and why they occur. Also review the basic memory and disk architecture of Aurora MySQL when using the InnoDB storage engine. For a helpful architecture diagram, see the [MySQL Reference Manual](https://dev.mysql.com/doc/refman/8.0/en/innodb-architecture.html).

**Topics**
+ [Aurora MySQL wait events](#AuroraMySQL.Managing.Tuning.concepts.waits)
+ [Aurora MySQL thread states](#AuroraMySQL.Managing.Tuning.concepts.thread-states)
+ [Aurora MySQL memory](#AuroraMySQL.Managing.Tuning.concepts.memory)
+ [Aurora MySQL processes](#AuroraMySQL.Managing.Tuning.concepts.processes)

## Aurora MySQL wait events


A *wait event* indicates a resource for which a session is waiting. For example, the wait event `io/socket/sql/client_connection` indicates that a thread is in the process of handling a new connection. Typical resources that a session waits for include the following:
+ Single-threaded access to a buffer, for example, when a session is attempting to modify a buffer
+ A row that is currently locked by another session
+ A data file read
+ A log file write

For example, to satisfy a query, the session might perform a full table scan. If the data isn't already in memory, the session waits for the disk I/O to complete. When the buffers are read into memory, the session might need to wait because other sessions are accessing the same buffers. The database records the waits by using a predefined wait event. These events are grouped into categories.

A wait event doesn't by itself show a performance problem. For example, if requested data isn't in memory, reading data from disk is necessary. If one session locks a row for an update, another session waits for the row to be unlocked so that it can update it. A commit requires waiting for the write to a log file to complete. Waits are integral to the normal functioning of a database. 

Large numbers of wait events typically show a performance problem. In such cases, you can use wait event data to determine where sessions are spending time. For example, if a report that typically runs in minutes now runs for hours, you can identify the wait events that contribute the most to total wait time. If you can determine the causes of the top wait events, you can sometimes make changes that improve performance. For example, if your session is waiting on a row that has been locked by another session, you can end the locking session. 
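If Performance Schema is turned on, you can query the wait event summaries directly to see where time is being spent. The following is a minimal sketch; the summary table and columns are standard MySQL Performance Schema objects, and timer values are reported in picoseconds.

```
-- Top 10 wait events by total time waited (timers are in picoseconds)
SELECT event_name,
       count_star AS total_waits,
       ROUND(sum_timer_wait / 1e12, 2) AS total_wait_sec
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE count_star > 0
  AND event_name <> 'idle'
ORDER BY sum_timer_wait DESC
LIMIT 10;
```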

## Aurora MySQL thread states


A *general thread state* is a `State` value that is associated with general query processing. For example, the thread state `sending data` indicates that a thread is reading and filtering rows for a query to determine the correct result set. 

You can use thread states to tune Aurora MySQL in a similar fashion to how you use wait events. For example, frequent occurrences of `sending data` usually indicate that a query isn't using an index. For more information about thread states, see [General Thread States](https://dev.mysql.com/doc/refman/5.7/en/general-thread-states.html) in the *MySQL Reference Manual*.
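When Performance Schema isn't available, you can get a rough view of current thread states from the process list. A minimal sketch:

```
-- Count active sessions by thread state, ignoring idle connections
SELECT state, COUNT(*) AS sessions
FROM information_schema.processlist
WHERE command <> 'Sleep'
  AND state IS NOT NULL
GROUP BY state
ORDER BY sessions DESC;
```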

When you use Performance Insights, one of the following conditions is true:
+ Performance Schema is turned on – Aurora MySQL shows wait events rather than the thread state.
+ Performance Schema isn't turned on – Aurora MySQL shows the thread state.

We recommend that you configure the Performance Schema for automatic management. The Performance Schema provides additional insights and better tools to investigate potential performance problems. For more information, see [Overview of the Performance Schema for Performance Insights on Aurora MySQL](USER_PerfInsights.EnableMySQL.md).

## Aurora MySQL memory


In Aurora MySQL, the most important memory areas are the buffer pool and log buffer.

**Topics**
+ [Buffer pool](#AuroraMySQL.Managing.Tuning.concepts.memory.buffer-pool)

### Buffer pool


The *buffer pool* is the shared memory area where Aurora MySQL caches table and index data. Queries can access frequently used data directly from memory without reading from disk.

The buffer pool is structured as a linked list of pages. A *page* can hold multiple rows. Aurora MySQL uses a least recently used (LRU) algorithm to age pages out of the pool.
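You can estimate how effectively the buffer pool serves your workload from the standard InnoDB status counters. In this sketch, a hit ratio close to 1 means that most page requests are served from memory:

```
-- Innodb_buffer_pool_reads counts reads that missed the buffer pool;
-- Innodb_buffer_pool_read_requests counts all logical read requests.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Hit ratio = 1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)
```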

For more information, see [Buffer Pool](https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool.html) in the *MySQL Reference Manual*.

## Aurora MySQL processes


Aurora MySQL uses a process model that is very different from that of Aurora PostgreSQL.

**Topics**
+ [MySQL server (mysqld)](#AuroraMySQL.Managing.Tuning.concepts.processes.mysqld)
+ [Threads](#AuroraMySQL.Managing.Tuning.concepts.processes.threads)
+ [Thread pool](#AuroraMySQL.Managing.Tuning.concepts.processes.pool)

### MySQL server (mysqld)


The MySQL server is a single operating-system process named `mysqld`. The MySQL server doesn't spawn additional processes. Thus, an Aurora MySQL database uses `mysqld` to perform most of its work.

When the MySQL server starts, it listens for network connections from MySQL clients. When a client connects to the database, `mysqld` opens a thread for that connection.

### Threads


Connection manager threads associate each client connection with a dedicated thread. This thread manages authentication, runs statements, and returns results to the client. Connection manager creates new threads when necessary.

The *thread cache* is the set of available threads. When a connection ends, MySQL returns the thread to the thread cache if the cache isn't full. The `thread_cache_size` system variable determines the thread cache size.
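You can check whether the thread cache is sized appropriately by comparing the standard status counters. If `Threads_created` grows quickly relative to `Connections`, the cache may be too small:

```
SHOW VARIABLES LIKE 'thread_cache_size';

-- Threads_created counts threads created because no cached thread was available
SHOW GLOBAL STATUS WHERE Variable_name IN ('Threads_created', 'Connections');
```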

### Thread pool


The *thread pool* consists of a number of thread groups. Each group manages a set of client connections. When a client connects to the database, the thread pool assigns the connections to thread groups in round-robin fashion. The thread pool separates connections and threads. There is no fixed relationship between connections and the threads that run statements received from those connections.

# Tuning Aurora MySQL with wait events


The following table summarizes the Aurora MySQL wait events that most commonly indicate performance problems. These wait events are a subset of the list in [Aurora MySQL wait events](AuroraMySQL.Reference.Waitevents.md).


| Wait event | Description | 
| --- | --- | 
|  [cpu](ams-waits.cpu.md)  |  This event occurs when a thread is active in CPU or is waiting for CPU.  | 
|  [io/aurora\_redo\_log\_flush](ams-waits.io-auredologflush.md)  |  This event occurs when a session is writing persistent data to Aurora storage.  | 
|  [io/aurora\_respond\_to\_client](ams-waits.respond-to-client.md)  |  This event occurs when a thread is waiting to return a result set to a client.  | 
|  [io/redo\_log\_flush](ams-waits.io-redologflush.md)  |  This event occurs when a session is writing persistent data to Aurora storage.  | 
|  [io/socket/sql/client\_connection](ams-waits.client-connection.md)  |  This event occurs when a thread is in the process of handling a new connection.  | 
|  [io/table/sql/handler](ams-waits.waitio.md)  |  This event occurs when work has been delegated to a storage engine.  | 
|  [synch/cond/innodb/row\_lock\_wait](ams-waits.row-lock-wait.md)  |  This event occurs when one session has locked a row for an update, and another session tries to update the same row.  | 
|  [synch/cond/innodb/row\_lock\_wait\_cond](ams-waits.row-lock-wait-cond.md)  |  This event occurs when one session has locked a row for an update, and another session tries to update the same row.  | 
|  [synch/cond/sql/MDL\_context::COND\_wait\_status](ams-waits.cond-wait-status.md)  |  This event occurs when there are threads waiting on a table metadata lock.  | 
|  [synch/mutex/innodb/aurora\_lock\_thread\_slot\_futex](ams-waits.waitsynch.md)  |  This event occurs when one session has locked a row for an update, and another session tries to update the same row.  | 
|  [synch/mutex/innodb/buf\_pool\_mutex](ams-waits.bufpoolmutex.md)  |  This event occurs when a thread has acquired a lock on the InnoDB buffer pool to access a page in memory.  | 
|  [synch/mutex/innodb/fil\_system\_mutex](ams-waits.innodb-fil-system-mutex.md)  |  This event occurs when a session is waiting to access the tablespace memory cache.  | 
|  [synch/mutex/innodb/trx\_sys\_mutex](ams-waits.trxsysmutex.md)  |  This event occurs when there is high database activity with a large number of transactions.  | 
|  [synch/sxlock/innodb/hash\_table\_locks](ams-waits.sx-lock-hash-table-locks.md)  |  This event occurs when pages not found in the buffer pool must be read from a file.  | 
|  [synch/mutex/innodb/temp\_pool\_manager\_mutex](ams-waits.io-temppoolmanager.md)  |  This event occurs when a session is waiting to acquire a mutex for managing the pool of session temporary tablespaces.  | 

# cpu


The `cpu` wait event occurs when a thread is active in CPU or is waiting for CPU.

**Topics**
+ [Supported engine versions](#ams-waits.cpu.context.supported)
+ [Context](#ams-waits.cpu.context)
+ [Likely causes of increased waits](#ams-waits.cpu.causes)
+ [Actions](#ams-waits.cpu.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL versions 2 and 3

## Context


Each vCPU can run work for one connection at a time. In some situations, the number of active connections that are ready to run is higher than the number of vCPUs. This imbalance results in connections waiting for CPU resources. If the number of active connections stays consistently higher than the number of vCPUs, your instance experiences CPU contention. This contention causes the `cpu` wait event to occur.

**Note**  
The Performance Insights metric for CPU is `DBLoadCPU`. The value for `DBLoadCPU` can differ from the value for the CloudWatch metric `CPUUtilization`, which is collected from the hypervisor for a database instance.

Performance Insights OS metrics provide detailed information about CPU utilization. For example, you can display the following metrics:
+ `os.cpuUtilization.nice.avg`
+ `os.cpuUtilization.total.avg`
+ `os.cpuUtilization.wait.avg`
+ `os.cpuUtilization.idle.avg`

Performance Insights reports the CPU usage by the database engine as `os.cpuUtilization.nice.avg`.

## Likely causes of increased waits


When this event occurs more often than normal, possibly indicating a performance problem, typical causes include the following:
+ Analytic queries
+ Highly concurrent transactions
+ Long-running transactions
+ A sudden increase in the number of connections, known as a *login storm*
+ An increase in context switching

## Actions


If the `cpu` wait event dominates database activity, it doesn't necessarily indicate a performance problem. Respond to this event only when performance degrades. 

Depending on the cause of the increase in CPU utilization, consider the following strategies:
+ Increase the CPU capacity of the host. This approach typically gives only temporary relief.
+ Identify top queries for potential optimization.
+ Redirect some read-only workload to reader nodes, if applicable.

**Topics**
+ [Identify the sessions or queries that are causing the problem](#ams-waits.cpu.actions.az-vpc-subnet)
+ [Analyze and optimize the high CPU workload](#ams-waits.cpu.actions.db-instance-class)

### Identify the sessions or queries that are causing the problem


To find the sessions and queries, look at the **Top SQL** table in Performance Insights for the SQL statements that have the highest CPU load. For more information, see [Analyzing metrics with the Performance Insights dashboard](USER_PerfInsights.UsingDashboard.md).

Typically, one or two SQL statements consume the majority of CPU cycles. Concentrate your efforts on these statements. Suppose that your DB instance has 2 vCPUs with a DB load of 3.1 average active sessions (AAS), all in the CPU state. In this case, your instance is CPU bound. Consider the following strategies:
+ Upgrade to a larger instance class with more vCPUs.
+ Tune your queries to have lower CPU load.

In this example, suppose that the two top SQL queries each have a DB load of 1.5 AAS, all in the CPU state, and another SQL statement has a load of 0.1 AAS in the CPU state. If you stopped the lowest-load SQL statement, you wouldn't significantly reduce database load. However, if you optimized the two high-load queries to be twice as efficient, you would eliminate the CPU bottleneck: reducing each 1.5 AAS load by 50 percent decreases the AAS for each statement to 0.75, so the total DB load spent on CPU becomes 0.75 + 0.75 + 0.1 = 1.6 AAS. This value is below the maximum vCPU line of 2.0.

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/). Also see the AWS Support article [How can I troubleshoot and resolve high CPU utilization on my Amazon RDS for MySQL instances?](https://aws.amazon.com/premiumsupport/knowledge-center/rds-instance-high-cpu/).

### Analyze and optimize the high CPU workload


After you identify the query or queries increasing CPU usage, you can either optimize them or end the connection. The following example shows how to end a connection.

```
CALL mysql.rds_kill(processID);
```

For more information, see [mysql.rds\_kill](mysql-stored-proc-ending.md#mysql_rds_kill).

If you end a session, the action might trigger a long rollback.
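Before ending a session, you need its process ID. The following sketch is one way to list long-running sessions; the `id` column is the process ID that you pass to `mysql.rds_kill`:

```
-- List active sessions, longest-running first; the id column is the process ID
SELECT id, user, host, time, state, LEFT(info, 80) AS query_text
FROM information_schema.processlist
WHERE command <> 'Sleep'
ORDER BY time DESC;
```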

#### Follow the guidelines for optimizing queries


To optimize queries, consider the following guidelines:
+ Run the `EXPLAIN` statement. 

  This command shows the individual steps involved in running a query. For more information, see [Optimizing Queries with EXPLAIN](https://dev.mysql.com/doc/refman/5.7/en/using-explain.html) in the MySQL documentation.
+ Run the `SHOW PROFILE` statement.

  Use this statement to review profile details that can indicate resource usage for statements that are run during the current session. For more information, see [SHOW PROFILE Statement](https://dev.mysql.com/doc/refman/5.7/en/show-profile.html) in the MySQL documentation.
+ Run the `ANALYZE TABLE` statement.

  Use this statement to refresh the index statistics for the tables accessed by the high-CPU consuming query. By analyzing the statement, you can help the optimizer choose an appropriate execution plan. For more information, see [ANALYZE TABLE Statement](https://dev.mysql.com/doc/refman/5.7/en/analyze-table.html) in the MySQL documentation.
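The preceding statements can be combined in one tuning pass. The following sketch uses the hypothetical `sampleDB`.`sampleTable` that appears in the write examples elsewhere in this documentation:

```
-- Inspect the execution plan of a high-CPU query
EXPLAIN SELECT sampleCol2, sampleCol3
FROM `sampleDB`.`sampleTable`
WHERE sampleCol1 = 100;

-- Refresh index statistics so the optimizer can choose a better plan
ANALYZE TABLE `sampleDB`.`sampleTable`;

-- Profile resource usage for statements run in the current session
SET profiling = 1;
SELECT sampleCol2 FROM `sampleDB`.`sampleTable` WHERE sampleCol1 = 100;
SHOW PROFILE;
```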

#### Follow the guidelines for improving CPU usage


To improve CPU usage in a database instance, follow these guidelines:
+ Ensure that all queries are using proper indexes.
+ Find out whether you can use Aurora parallel queries. You can use this technique to reduce CPU usage on the head node by pushing down function processing, row filtering, and column projection for the `WHERE` clause.
+ Find out whether the number of SQL executions per second meets the expected thresholds.
+ Find out whether index maintenance or new index creation takes up CPU cycles needed by your production workload. Schedule maintenance activities outside of peak activity times.
+ Find out whether you can use partitioning to help reduce the query data set. For more information, see the blog post [How to plan and optimize Amazon Aurora with MySQL compatibility for consolidated workloads](https://aws.amazon.com/blogs/database/planning-and-optimizing-amazon-aurora-with-mysql-compatibility-for-consolidated-workloads/).

#### Check for connection storms


If the `DBLoadCPU` metric is not very high but the `CPUUtilization` metric is high, the cause of the high CPU utilization lies outside the database engine. A classic example is a connection storm.

Check whether the following conditions are true:
+ There is an increase in both the Performance Insights `CPUUtilization` metric and the Amazon CloudWatch `DatabaseConnections` metric.
+ The number of threads in the CPU is greater than the number of vCPUs.

If the preceding conditions are true, consider decreasing the number of database connections. For example, you can use a connection pool such as RDS Proxy. To learn the best practices for effective connection management and scaling, see the whitepaper [Amazon Aurora MySQL DBA Handbook for Connection Management](https://d1.awsstatic.com/whitepapers/RDS/amazon-aurora-mysql-database-administrator-handbook.pdf).
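To check the second condition, you can compare the number of running threads with your instance's vCPU count. A minimal sketch:

```
-- Threads_running counts threads that are actively executing;
-- compare this value with the number of vCPUs on the instance
SHOW GLOBAL STATUS WHERE Variable_name IN ('Threads_running', 'Threads_connected');
```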

# io/aurora\_redo\_log\_flush


The `io/aurora_redo_log_flush` event occurs when a session is writing persistent data to Amazon Aurora storage.

**Topics**
+ [Supported engine versions](#ams-waits.io-auredologflush.context.supported)
+ [Context](#ams-waits.io-auredologflush.context)
+ [Likely causes of increased waits](#ams-waits.io-auredologflush.causes)
+ [Actions](#ams-waits.io-auredologflush.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 2

## Context


The `io/aurora_redo_log_flush` event is for a write input/output (I/O) operation in Aurora MySQL.

**Note**  
In Aurora MySQL version 3, this wait event is named [io/redo\_log\_flush](ams-waits.io-redologflush.md).

## Likely causes of increased waits


For data persistence, each commit requires a durable write to stable storage. If the database performs too many commits, sessions wait on the write I/O operation, which appears as the `io/aurora_redo_log_flush` wait event.

In the following examples, 50,000 records are inserted into an Aurora MySQL DB cluster using the db.r5.xlarge DB instance class:
+ In the first example, each session inserts 10,000 records row by row. By default, if a data manipulation language (DML) command isn't within a transaction, Aurora MySQL uses implicit commits. Autocommit is turned on. This means that for each row insertion there is a commit. Performance Insights shows that the connections spend most of their time waiting on the `io/aurora_redo_log_flush` wait event.   
![\[Performance Insights example of the wait event\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/auredologflush_PI_example1.png)

  This is caused by the simple insert statements used.  
![\[Insert statements in Top SQL\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/auredologflush_top_SQL1.png)

  The 50,000 records take 3.5 minutes to be inserted.
+ In the second example, inserts are made in batches of 1,000 records, that is, each connection performs 10 commits instead of 10,000. Performance Insights shows that the connections don't spend most of their time on the `io/aurora_redo_log_flush` wait event.  
![\[Performance Insights example of the wait event having less impact\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/auredologflush_PI_example2.png)

  The 50,000 records take 4 seconds to be inserted.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [Identify the problematic sessions and queries](#ams-waits.io-auredologflush.actions.identify-queries)
+ [Group your write operations](#ams-waits.io-auredologflush.actions.action0)
+ [Turn off autocommit](#ams-waits.io-auredologflush.actions.action1)
+ [Use transactions](#ams-waits.io-auredologflush.action2)
+ [Use batches](#ams-waits.io-auredologflush.action3)

### Identify the problematic sessions and queries


If your DB instance is experiencing a bottleneck, your first task is to find the sessions and queries that cause it. For a useful AWS Database Blog post, see [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

**To identify sessions and queries causing a bottleneck**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose your DB instance.

1. In **Database load**, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The queries at the top of the list are causing the highest load on the database.

### Group your write operations


The following examples trigger the `io/aurora_redo_log_flush` wait event. (Autocommit is turned on.)

```
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
....
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');

UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;
....
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;

DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
....
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
```

To reduce the time spent waiting on the `io/aurora_redo_log_flush` wait event, group your write operations logically into a single commit to reduce persistent calls to storage.

### Turn off autocommit


Turn off autocommit before making large changes that aren't within a transaction, as shown in the following example.

```
SET SESSION AUTOCOMMIT=OFF;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
....
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
-- Other DML statements here
COMMIT;

SET SESSION AUTOCOMMIT=ON;
```

### Use transactions


You can use transactions, as shown in the following example.

```
BEGIN;
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
....
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');

DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
....
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;

-- Other DML statements here
COMMIT;
```

### Use batches


You can make changes in batches, as shown in the following example. However, using batches that are too large can cause performance issues, especially in read replicas or when doing point-in-time recovery (PITR). 

```
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES
('xxxx','xxxxx'),('xxxx','xxxxx'),...,('xxxx','xxxxx'),('xxxx','xxxxx');

UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1 BETWEEN xx AND xxx;

DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1<xx;
```

# io/aurora\_respond\_to\_client


The `io/aurora_respond_to_client` event occurs when a thread is waiting to return a result set to a client.

**Topics**
+ [Supported engine versions](#ams-waits.respond-to-client.context.supported)
+ [Context](#ams-waits.respond-to-client.context)
+ [Likely causes of increased waits](#ams-waits.respond-to-client.causes)
+ [Actions](#ams-waits.respond-to-client.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 2

## Context


The event `io/aurora_respond_to_client` indicates that a thread is waiting to return a result set to a client.

The query processing is complete, and the results are being returned back to the application client. However, because there isn't enough network bandwidth on the DB cluster, a thread is waiting to return the result set.

## Likely causes of increased waits


When the `io/aurora_respond_to_client` event appears more than normal, possibly indicating a performance problem, typical causes include the following:

**DB instance class insufficient for the workload**  
The DB instance class used by the DB cluster doesn't have the necessary network bandwidth to process the workload efficiently.

**Large result sets**  
There was an increase in the size of the result set being returned, because the query returns a higher number of rows. The larger result set consumes more network bandwidth.

**Increased load on the client**  
There might be CPU pressure, memory pressure, or network saturation on the client. An increase in load on the client delays the reception of data from the Aurora MySQL DB cluster.

**Increased network latency**  
There might be increased network latency between the Aurora MySQL DB cluster and client. Higher network latency increases the time required for the client to receive the data.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [Identify the sessions and queries causing the events](#ams-waits.respond-to-client.actions.identify)
+ [Scale the DB instance class](#ams-waits.respond-to-client.actions.scale-db-instance-class)
+ [Check workload for unexpected results](#ams-waits.respond-to-client.actions.workload)
+ [Distribute workload with reader instances](#ams-waits.respond-to-client.actions.balance)
+ [Use the SQL\_BUFFER\_RESULT modifier](#ams-waits.respond-to-client.actions.sql-buffer-result)

### Identify the sessions and queries causing the events


You can use Performance Insights to show queries blocked by the `io/aurora_respond_to_client` wait event. Typically, databases with moderate to significant load have wait events. The wait events might be acceptable if performance is optimal. If performance isn't optimal, then examine where the database is spending the most time. Look at the wait events that contribute to the highest load, and find out whether you can optimize the database and application to reduce those events. 

**To find SQL queries that are responsible for high load**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard is shown for that DB instance.

1. In the **Database load** chart, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the AWS Database Blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Scale the DB instance class


Check for the increase in the value of the Amazon CloudWatch metrics related to network throughput, such as `NetworkReceiveThroughput` and `NetworkTransmitThroughput`. If the DB instance class network bandwidth is being reached, you can scale the DB instance class used by the DB cluster by modifying the DB cluster. A DB instance class with larger network bandwidth returns data to clients more efficiently.

For information about monitoring Amazon CloudWatch metrics, see [Viewing metrics in the Amazon RDS console](USER_Monitoring.md). For information about DB instance classes, see [Amazon Aurora DB instance classes](Concepts.DBInstanceClass.md). For information about modifying a DB cluster, see [Modifying an Amazon Aurora DB cluster](Aurora.Modifying.md).

### Check workload for unexpected results


Check the workload on the DB cluster and make sure that it isn't producing unexpected results. For example, there might be queries that are returning a higher number of rows than expected. In this case, you can use Performance Insights counter metrics such as `Innodb_rows_read`. For more information, see [Performance Insights counter metrics](USER_PerfInsights_Counters.md).
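For example, you can sample `Innodb_rows_read` at intervals; a sharp rise between samples suggests that queries are suddenly scanning more rows than expected:

```
-- Cumulative count of rows read from InnoDB tables since server startup
SHOW GLOBAL STATUS LIKE 'Innodb_rows_read';
```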

### Distribute workload with reader instances


You can distribute read-only workload with Aurora replicas. You can scale horizontally by adding more Aurora replicas. Doing so can result in an increase in the throttling limits for network bandwidth. For more information, see [Amazon Aurora DB clusters](Aurora.Overview.md).

### Use the SQL\_BUFFER\_RESULT modifier


You can add the `SQL_BUFFER_RESULT` modifier to `SELECT` statements to force the result into a temporary table before it is returned to the client. This modifier can help with performance issues when InnoDB locks aren't being freed because queries are stuck in the `io/aurora_respond_to_client` wait state. For more information, see [SELECT Statement](https://dev.mysql.com/doc/refman/5.7/en/select.html) in the MySQL documentation.
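A minimal sketch, again using the hypothetical `sampleDB`.`sampleTable` from the write examples in this documentation:

```
-- Buffer the result set into a temporary table so InnoDB locks
-- can be released before the rows are sent to the client
SELECT SQL_BUFFER_RESULT sampleCol2, sampleCol3
FROM `sampleDB`.`sampleTable`
WHERE sampleCol1 BETWEEN 1 AND 1000;
```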

# io/redo\_log\_flush


The `io/redo_log_flush` event occurs when a session is writing persistent data to Amazon Aurora storage.

**Topics**
+ [Supported engine versions](#ams-waits.io-redologflush.context.supported)
+ [Context](#ams-waits.io-redologflush.context)
+ [Likely causes of increased waits](#ams-waits.io-redologflush.causes)
+ [Actions](#ams-waits.io-redologflush.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 3

## Context


The `io/redo_log_flush` event is for a write input/output (I/O) operation in Aurora MySQL.

**Note**  
In Aurora MySQL version 2, this wait event is named [io/aurora\_redo\_log\_flush](ams-waits.io-auredologflush.md).

## Likely causes of increased waits


For data persistence, commits require a durable write to stable storage. If the database performs too many commits, sessions wait on the write I/O operation, which appears as the `io/redo_log_flush` wait event.

For examples of the behavior of this wait event, see [io/aurora\_redo\_log\_flush](ams-waits.io-auredologflush.md).

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Identify the problematic sessions and queries
](#ams-waits.io-redologflush.actions.identify-queries)
+ [

### Group your write operations
](#ams-waits.io-redologflush.actions.action0)
+ [

### Turn off autocommit
](#ams-waits.io-redologflush.actions.action1)
+ [

### Use transactions
](#ams-waits.io-redologflush.action2)
+ [

### Use batches
](#ams-waits.io-redologflush.action3)

### Identify the problematic sessions and queries


If your DB instance is experiencing a bottleneck, your first task is to find the sessions and queries that cause it. For a useful AWS Database Blog post, see [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

**To identify sessions and queries causing a bottleneck**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose your DB instance.

1. In **Database load**, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The queries at the top of the list are causing the highest load on the database.

### Group your write operations


The following examples trigger the `io/redo_log_flush` wait event. (Autocommit is turned on.)

```
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
....
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');

UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;
....
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE id=xx;

DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
....
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
```

To reduce the time spent waiting on the `io/redo_log_flush` wait event, group your write operations logically into a single commit so that fewer durable writes to storage are needed.

### Turn off autocommit


Turn off autocommit before making large changes that aren't within a transaction, as shown in the following example.

```
SET SESSION AUTOCOMMIT=OFF;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
....
UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1=xx;
-- Other DML statements here
COMMIT;

SET SESSION AUTOCOMMIT=ON;
```

### Use transactions


You can use transactions, as shown in the following example.

```
BEGIN;
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');
....
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES ('xxxx','xxxxx');

DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;
....
DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1=xx;

-- Other DML statements here
COMMIT;
```

### Use batches


You can make changes in batches, as shown in the following example. However, using batches that are too large can cause performance issues, especially in read replicas or when doing point-in-time recovery (PITR).

```
INSERT INTO `sampleDB`.`sampleTable` (sampleCol2, sampleCol3) VALUES
('xxxx','xxxxx'),('xxxx','xxxxx'),...,('xxxx','xxxxx'),('xxxx','xxxxx');

UPDATE `sampleDB`.`sampleTable` SET sampleCol3='xxxxx' WHERE sampleCol1 BETWEEN xx AND xxx;

DELETE FROM `sampleDB`.`sampleTable` WHERE sampleCol1<xx;
```

# io/socket/sql/client\_connection


The `io/socket/sql/client_connection` event occurs when a thread is in the process of handling a new connection.

**Topics**
+ [

## Supported engine versions
](#ams-waits.client-connection.context.supported)
+ [

## Context
](#ams-waits.client-connection.context)
+ [

## Likely causes of increased waits
](#ams-waits.client-connection.causes)
+ [

## Actions
](#ams-waits.client-connection.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL versions 2 and 3

## Context


The event `io/socket/sql/client_connection` indicates that mysqld is busy creating threads to handle incoming new client connections. In this scenario, new client connection requests are processed more slowly while connections wait for a thread to be assigned. For more information, see [MySQL server (mysqld)](AuroraMySQL.Managing.Tuning.concepts.md#AuroraMySQL.Managing.Tuning.concepts.processes.mysqld).

## Likely causes of increased waits


When this event appears more than normal, possibly indicating a performance problem, typical causes include the following:
+ There is a sudden increase in new user connections from the application to your Amazon RDS instance.
+ Your DB instance can't process new connections because the network, CPU, or memory is being throttled.

## Actions


If `io/socket/sql/client_connection` dominates database activity, it doesn't necessarily indicate a performance problem. In a database that isn't idle, a wait event is always on top. Act only when performance degrades. We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Identify the problematic sessions and queries
](#ams-waits.client-connection.actions.identify-queries)
+ [

### Follow best practices for connection management
](#ams-waits.client-connection.actions.manage-connections)
+ [

### Scale up your instance if resources are being throttled
](#ams-waits.client-connection.upgrade)
+ [

### Check the top hosts and top users
](#ams-waits.client-connection.top-hosts)
+ [

### Query the performance\_schema tables
](#ams-waits.client-connection.perf-schema)
+ [

### Check the thread states of your queries
](#ams-waits.client-connection.thread-states)
+ [

### Audit your requests and queries
](#ams-waits.client-connection.auditing)
+ [

### Pool your database connections
](#ams-waits.client-connection.pooling)

### Identify the problematic sessions and queries


If your DB instance is experiencing a bottleneck, your first task is to find the sessions and queries that cause it. For a useful blog post, see [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

**To identify sessions and queries causing a bottleneck**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose your DB instance.

1. In **Database load**, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The queries at the top of the list are causing the highest load on the database.

### Follow best practices for connection management


To manage your connections, consider the following strategies:
+ Use connection pooling.

  You can gradually increase the number of connections as required. For more information, see the whitepaper [Amazon Aurora MySQL Database Administrator’s Handbook](https://d1.awsstatic.com/whitepapers/RDS/amazon-aurora-mysql-database-administrator-handbook.pdf).
+ Use a reader node to redistribute read-only traffic.

  For more information, see [Aurora Replicas](Aurora.Replication.md#Aurora.Replication.Replicas) and [Amazon Aurora endpoint connections](Aurora.Overview.Endpoints.md).

### Scale up your instance if resources are being throttled


Look for examples of throttling in the following resources:
+ CPU

  Check your Amazon CloudWatch metrics for high CPU usage.
+ Network

  Check for an increase in the value of the CloudWatch metrics `NetworkReceiveThroughput` and `NetworkTransmitThroughput`. If your instance has reached the network bandwidth limit for its instance class, consider scaling your DB instance up to a larger instance class. For more information, see [Amazon Aurora DB instance classes](Concepts.DBInstanceClass.md).
+ Freeable memory 

  Check for a drop in the CloudWatch metric `FreeableMemory`. Also, consider turning on Enhanced Monitoring. For more information, see [Monitoring OS metrics with Enhanced Monitoring](USER_Monitoring.OS.md).

### Check the top hosts and top users


Use Performance Insights to check the top hosts and top users. For more information, see [Analyzing metrics with the Performance Insights dashboard](USER_PerfInsights.UsingDashboard.md).

### Query the performance\_schema tables


To get an accurate count of the current and total connections, query the `performance_schema` tables. With this technique, you identify the source user or host that is responsible for creating a high number of connections. For example, query the `performance_schema` tables as follows.

```
SELECT * FROM performance_schema.accounts;
SELECT * FROM performance_schema.users;
SELECT * FROM performance_schema.hosts;
```
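
To narrow the results, you can also sort on the connection counters in these tables. The following sketch lists the hosts that currently hold the most connections; `current_connections` and `total_connections` are standard columns of `performance_schema.hosts`.

```
SELECT host, current_connections, total_connections
FROM performance_schema.hosts
ORDER BY current_connections DESC;
```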

### Check the thread states of your queries


If your performance issue is ongoing, check the thread states of your queries. In the `mysql` client, issue the following command.

```
show processlist;
```
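
If the full process list is too noisy, one option is to query `information_schema.processlist` instead so that you can filter and sort. The following sketch lists only the longest-running active threads, truncating each statement to its first 80 characters.

```
SELECT id, user, host, db, command, time, state,
       LEFT(info, 80) AS query_start
FROM information_schema.processlist
WHERE command <> 'Sleep'
ORDER BY time DESC
LIMIT 10;
```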

### Audit your requests and queries


To check the nature of the requests and queries from user accounts, use Aurora MySQL Advanced Auditing. To learn how to turn on auditing, see [Using Advanced Auditing with an Amazon Aurora MySQL DB cluster](AuroraMySQL.Auditing.md).

### Pool your database connections


Consider using Amazon RDS Proxy for connection management. By using RDS Proxy, you can allow your applications to pool and share database connections to improve their ability to scale. RDS Proxy makes applications more resilient to database failures by automatically connecting to a standby DB instance while preserving application connections. For more information, see [Amazon RDS Proxy for Aurora](rds-proxy.md).

# io/table/sql/handler


The `io/table/sql/handler` event occurs when work has been delegated to a storage engine.

**Topics**
+ [

## Supported engine versions
](#ams-waits.waitio.context.supported)
+ [

## Context
](#ams-waits.waitio.context)
+ [

## Likely causes of increased waits
](#ams-waits.waitio.causes)
+ [

## Actions
](#ams-waits.waitio.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL versions 2 and 3

## Context


The event `io/table` indicates a wait for access to a table. This event occurs regardless of whether the data is cached in the buffer pool or accessed on disk. The `io/table/sql/handler` event indicates an increase in workload activity. 

A *handler* is a routine specialized in a certain type of data or focused on certain special tasks. For example, an event handler receives and digests events and signals from the operating system or from a user interface. A memory handler performs tasks related to memory. A file input handler is a function that receives file input and performs special tasks on the data, according to context.

Views such as `performance_schema.events_waits_current` often show `io/table/sql/handler` when the actual wait is a nested wait event such as a lock. When the actual wait isn't `io/table/sql/handler`, Performance Insights reports the nested wait event. When Performance Insights reports `io/table/sql/handler`, it represents InnoDB processing of the I/O request and not a hidden nested wait event. For more information, see [Performance Schema Atom and Molecule Events](https://dev.mysql.com/doc/refman/5.7/en/performance-schema-atom-molecule-events.html) in the *MySQL Reference Manual*.

The `io/table/sql/handler` event often appears in top wait events with I/O waits such as `io/aurora_redo_log_flush`.

## Likely causes of increased waits


In Performance Insights, sudden spikes in the `io/table/sql/handler` event indicate an increase in workload activity. Increased activity means increased I/O. 

Performance Insights filters the nesting event IDs and doesn't report an `io/table/sql/handler` wait when the underlying nested event is a lock wait. For example, if the root cause event is [synch/mutex/innodb/aurora\_lock\_thread\_slot\_futex](ams-waits.waitsynch.md), Performance Insights displays that wait in the top wait events instead of `io/table/sql/handler`.

In views such as `performance_schema.events_waits_current`, waits for `io/table/sql/handler` often appear when the actual wait is a nested wait event such as a lock. When the actual wait differs from `io/table/sql/handler`, Performance Insights looks up the nested wait and reports the actual wait instead of `io/table/sql/handler`. When Performance Insights reports `io/table/sql/handler`, the real wait is `io/table/sql/handler` and not a hidden nested wait event. For more information, see [Performance Schema Atom and Molecule Events](https://dev.mysql.com/doc/refman/5.7/en/performance-schema-atom-molecule-events.html) in the *MySQL 5.7 Reference Manual*.

## Actions


If this wait event dominates database activity, it doesn't necessarily indicate a performance problem. A wait event is always on top when the database is active. You need to act only when performance degrades.

We recommend different actions depending on the other wait events that you see.

**Topics**
+ [

### Identify the sessions and queries causing the events
](#ams-waits.waitio.actions.identify)
+ [

### Check for a correlation with Performance Insights counter metrics
](#ams-waits.waitio.actions.filters)
+ [

### Check for other correlated wait events
](#ams-waits.waitio.actions.maintenance)

### Identify the sessions and queries causing the events


Typically, databases with moderate to significant load have wait events. The wait events might be acceptable if performance is optimal. If performance isn't optimal, then examine where the database is spending the most time. Look at the wait events that contribute to the highest load, and find out whether you can optimize the database and application to reduce those events.

**To find SQL queries that are responsible for high load**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard is shown for that DB instance.

1. In the **Database load** chart, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Check for a correlation with Performance Insights counter metrics


Check for Performance Insights counter metrics such as `Innodb_rows_changed`. If counter metrics are correlated with `io/table/sql/handler`, follow these steps:

1. In Performance Insights, look for the SQL statements accounting for the `io/table/sql/handler` top wait event. If possible, optimize this statement so that it returns fewer rows.

1. Retrieve the top tables from the `schema_table_statistics` and `x$schema_table_statistics` views. These views show the amount of time spent per table. For more information, see [The schema\_table\_statistics and x$schema\_table\_statistics Views](https://dev.mysql.com/doc/refman/5.7/en/sys-schema-table-statistics.html) in the *MySQL Reference Manual*.

   By default, rows are sorted by descending total wait time. Tables with the most contention appear first. The output indicates whether time is spent on reads, writes, fetches, inserts, updates, or deletes.

   ```
   mysql> select * from sys.schema_table_statistics limit 1\G
   
   *************************** 1. row ***************************
        table_schema: read_only_db
          table_name: sbtest41
       total_latency: 54.11 m
        rows_fetched: 6001557
       fetch_latency: 39.14 m
       rows_inserted: 14833
      insert_latency: 5.78 m
        rows_updated: 30470
      update_latency: 5.39 m
        rows_deleted: 14833
      delete_latency: 3.81 m
    io_read_requests: NULL
             io_read: NULL
     io_read_latency: NULL
   io_write_requests: NULL
            io_write: NULL
    io_write_latency: NULL
    io_misc_requests: NULL
     io_misc_latency: NULL
   1 row in set (0.11 sec)
   ```

### Check for other correlated wait events


If `synch/sxlock/innodb/btr_search_latch` and `io/table/sql/handler` contribute most to the DB load anomaly together, check whether the `innodb_adaptive_hash_index` variable is turned on. If it is, consider increasing the `innodb_adaptive_hash_index_parts` parameter value.
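
For example, you can check both settings before making a change. Note that in Aurora MySQL, `innodb_adaptive_hash_index_parts` can't be changed at runtime; you set it through a DB parameter group.

```
SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_hash_index%';
```

The output shows `innodb_adaptive_hash_index` (`ON` or `OFF`) and the current `innodb_adaptive_hash_index_parts` value.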

If the Adaptive Hash Index is turned off, consider turning it on. To learn more about the MySQL Adaptive Hash Index, see the following resources:
+ The article [Is Adaptive Hash Index in InnoDB right for my workload?](https://www.percona.com/blog/2016/04/12/is-adaptive-hash-index-in-innodb-right-for-my-workload) on the Percona website
+ [Adaptive Hash Index](https://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash.html) in the *MySQL Reference Manual*
+ The article [Contention in MySQL InnoDB: Useful Info From the Semaphores Section](https://www.percona.com/blog/2019/12/20/contention-in-mysql-innodb-useful-info-from-the-semaphores-section/) on the Percona website

**Note**  
The Adaptive Hash Index isn't supported on Aurora reader DB instances.  
In some cases, performance might be poor on a reader instance when `synch/sxlock/innodb/btr_search_latch` and `io/table/sql/handler` are dominant. If so, consider redirecting the workload temporarily to the writer DB instance and turning on the Adaptive Hash Index.

# synch/cond/innodb/row\_lock\_wait


The `synch/cond/innodb/row_lock_wait` event occurs when one session has locked a row for an update, and another session tries to update the same row. For more information, see [InnoDB locking](https://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html) in the MySQL documentation.



## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 3

## Likely causes of increased waits


Multiple data manipulation language (DML) statements are accessing the same row or rows simultaneously.

## Actions


We recommend different actions depending on the other wait events that you see.

**Topics**
+ [

### Find and respond to the SQL statements responsible for this wait event
](#ams-waits.row-lock-wait.actions.id)
+ [

### Find and respond to the blocking session
](#ams-waits.row-lock-wait.actions.blocker)

### Find and respond to the SQL statements responsible for this wait event


Use Performance Insights to identify the SQL statements responsible for this wait event. Consider the following strategies:
+ If row locks are a persistent problem, consider rewriting the application to use optimistic locking.
+ Use multirow statements.
+ Spread the workload over different database objects. You can do this through partitioning.
+ Check the value of the `innodb_lock_wait_timeout` parameter. It controls how long transactions wait before generating a timeout error.
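
As a sketch, the following statements check the timeout and then lower it for the current session so that lock waits fail fast instead of queueing. The value of 10 seconds is illustrative; the MySQL default is 50 seconds.

```
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

-- Fail after 10 seconds instead of the default 50
SET SESSION innodb_lock_wait_timeout = 10;
```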

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Find and respond to the blocking session


Determine whether the blocking session is idle or active. Also, find out whether the session comes from an application or an active user.

To identify the session holding the lock, you can run `SHOW ENGINE INNODB STATUS`. The following example shows sample output.

```
mysql> SHOW ENGINE INNODB STATUS;

---TRANSACTION 1688153, ACTIVE 82 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 4244, OS thread handle 70369524330224, query id 4020834 172.31.14.179 reinvent executing
select id1 from test.t1 where id1=1 for update
------- TRX HAS BEEN WAITING 24 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 4 n bits 72 index GEN_CLUST_INDEX of table test.t1 trx id 1688153 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
```

Or you can use the following query to extract details on current locks.

```
mysql> SELECT p1.id waiting_thread,
    p1.user waiting_user,
    p1.host waiting_host,
    it1.trx_query waiting_query,
    ilw.requesting_engine_transaction_id waiting_transaction,
    ilw.blocking_engine_lock_id blocking_lock,
    il.lock_mode blocking_mode,
    il.lock_type blocking_type,
    ilw.blocking_engine_transaction_id blocking_transaction,
    CASE it.trx_state
        WHEN 'LOCK WAIT'
        THEN it.trx_state
        ELSE p.state end blocker_state,
    concat(il.object_schema,'.', il.object_name) as locked_table,
    it.trx_mysql_thread_id blocker_thread,
    p.user blocker_user,
    p.host blocker_host
FROM performance_schema.data_lock_waits ilw
JOIN performance_schema.data_locks il
ON ilw.blocking_engine_lock_id = il.engine_lock_id
AND ilw.blocking_engine_transaction_id = il.engine_transaction_id
JOIN information_schema.innodb_trx it
ON ilw.blocking_engine_transaction_id = it.trx_id join information_schema.processlist p
ON it.trx_mysql_thread_id = p.id join information_schema.innodb_trx it1
ON ilw.requesting_engine_transaction_id = it1.trx_id join information_schema.processlist p1
ON it1.trx_mysql_thread_id = p1.id\G

*************************** 1. row ***************************
waiting_thread: 4244
waiting_user: reinvent
waiting_host: 123.456.789.012:18158
waiting_query: select id1 from test.t1 where id1=1 for update
waiting_transaction: 1688153
blocking_lock: 70369562074216:11:4:2:70369549808672
blocking_mode: X
blocking_type: RECORD
blocking_transaction: 1688142
blocker_state: User sleep
locked_table: test.t1
blocker_thread: 4243
blocker_user: reinvent
blocker_host: 123.456.789.012:18156
1 row in set (0.00 sec)
```

When you identify the session, your options include the following:
+ Contact the application owner or the user.
+ If the blocking session is idle, consider ending the blocking session. This action might trigger a long rollback. To learn how to end a session, see [Ending a session or query](mysql-stored-proc-ending.md).
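
For example, to end the idle blocking session identified in the sample output above (thread ID 4243), you can call the `mysql.rds_kill` procedure that Aurora MySQL provides, passing the thread ID from the process list.

```
CALL mysql.rds_kill(4243);
```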

For more information about identifying blocking transactions, see [Using InnoDB transaction and locking information](https://dev.mysql.com/doc/refman/8.0/en/innodb-information-schema-examples.html) in the MySQL documentation.

# synch/cond/innodb/row\_lock\_wait\_cond


The `synch/cond/innodb/row_lock_wait_cond` event occurs when one session has locked a row for an update, and another session tries to update the same row. For more information, see [InnoDB locking](https://dev.mysql.com/doc/refman/5.7/en/innodb-locking.html) in the MySQL documentation.



## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 2

## Likely causes of increased waits


Multiple data manipulation language (DML) statements are accessing the same row or rows simultaneously.

## Actions


We recommend different actions depending on the other wait events that you see.

**Topics**
+ [

### Find and respond to the SQL statements responsible for this wait event
](#ams-waits.row-lock-wait-cond.actions.id)
+ [

### Find and respond to the blocking session
](#ams-waits.row-lock-wait-cond.actions.blocker)

### Find and respond to the SQL statements responsible for this wait event


Use Performance Insights to identify the SQL statements responsible for this wait event. Consider the following strategies:
+ If row locks are a persistent problem, consider rewriting the application to use optimistic locking.
+ Use multirow statements.
+ Spread the workload over different database objects. You can do this through partitioning.
+ Check the value of the `innodb_lock_wait_timeout` parameter. It controls how long transactions wait before generating a timeout error.
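
As a sketch, you can check the current timeout and lower it for the session so that lock waits error out quickly rather than queueing. The 10-second value is illustrative; the MySQL default is 50 seconds.

```
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

-- Fail after 10 seconds instead of the default 50
SET SESSION innodb_lock_wait_timeout = 10;
```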

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Find and respond to the blocking session


Determine whether the blocking session is idle or active. Also, find out whether the session comes from an application or an active user.

To identify the session holding the lock, you can run `SHOW ENGINE INNODB STATUS`. The following example shows sample output.

```
mysql> SHOW ENGINE INNODB STATUS;

---TRANSACTION 2771110, ACTIVE 112 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 24, OS thread handle 70369573642160, query id 13271336 172.31.14.179 reinvent Sending data
select id1 from test.t1 where id1=1 for update
------- TRX HAS BEEN WAITING 43 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 3 n bits 0 index GEN_CLUST_INDEX of table test.t1 trx id 2771110 lock_mode X waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
```

Or you can use the following query to extract details on current locks.

```
mysql> SELECT p1.id waiting_thread,
              p1.user waiting_user,
              p1.host waiting_host,
              it1.trx_query waiting_query,        
              ilw.requesting_trx_id waiting_transaction, 
              ilw.blocking_lock_id blocking_lock, 
              il.lock_mode blocking_mode,
              il.lock_type blocking_type,
              ilw.blocking_trx_id blocking_transaction,
              CASE it.trx_state 
                WHEN 'LOCK WAIT' 
                THEN it.trx_state 
                ELSE p.state 
              END blocker_state, 
              il.lock_table locked_table,        
              it.trx_mysql_thread_id blocker_thread, 
              p.user blocker_user, 
              p.host blocker_host 
       FROM information_schema.innodb_lock_waits ilw 
       JOIN information_schema.innodb_locks il 
         ON ilw.blocking_lock_id = il.lock_id 
        AND ilw.blocking_trx_id = il.lock_trx_id
       JOIN information_schema.innodb_trx it 
         ON ilw.blocking_trx_id = it.trx_id
       JOIN information_schema.processlist p 
         ON it.trx_mysql_thread_id = p.id 
       JOIN information_schema.innodb_trx it1 
         ON ilw.requesting_trx_id = it1.trx_id 
       JOIN information_schema.processlist p1 
         ON it1.trx_mysql_thread_id = p1.id\G

*************************** 1. row ***************************
      waiting_thread: 3561959471
        waiting_user: reinvent
        waiting_host: 123.456.789.012:20485
       waiting_query: select id1 from test.t1 where id1=1 for update
 waiting_transaction: 312337314
       blocking_lock: 312337287:261:3:2
       blocking_mode: X
       blocking_type: RECORD
blocking_transaction: 312337287
       blocker_state: User sleep
        locked_table: `test`.`t1`
      blocker_thread: 3561223876
        blocker_user: reinvent
        blocker_host: 123.456.789.012:17746
1 row in set (0.04 sec)
```

When you identify the session, your options include the following:
+ Contact the application owner or the user.
+ If the blocking session is idle, consider ending the blocking session. This action might trigger a long rollback. To learn how to end a session, see [Ending a session or query](mysql-stored-proc-ending.md).
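
For example, to end the idle blocking session identified in the sample output above (thread ID 3561223876), you can call the `mysql.rds_kill` procedure that Aurora MySQL provides, passing the thread ID from the process list.

```
CALL mysql.rds_kill(3561223876);
```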

For more information about identifying blocking transactions, see [Using InnoDB transaction and locking information](https://dev.mysql.com/doc/refman/5.7/en/innodb-information-schema-examples.html) in the MySQL documentation.

# synch/cond/sql/MDL\_context::COND\_wait\_status


The `synch/cond/sql/MDL_context::COND_wait_status` event occurs when there are threads waiting on a table metadata lock.

**Topics**
+ [

## Supported engine versions
](#ams-waits.cond-wait-status.context.supported)
+ [

## Context
](#ams-waits.cond-wait-status.context)
+ [

## Likely causes of increased waits
](#ams-waits.cond-wait-status.causes)
+ [

## Actions
](#ams-waits.cond-wait-status.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL versions 2 and 3

## Context


The event `synch/cond/sql/MDL_context::COND_wait_status` indicates that there are threads waiting on a table metadata lock. In some cases, one session holds a metadata lock on a table and another session tries to get the same lock on the same table. In such a case, the second session waits on the `synch/cond/sql/MDL_context::COND_wait_status` wait event.

MySQL uses metadata locking to manage concurrent access to database objects and to ensure data consistency. Metadata locking applies to tables, schemas, scheduled events, tablespaces, user locks acquired with the `GET_LOCK` function, and stored programs. Stored programs include procedures, functions, and triggers. For more information, see [Metadata locking](https://dev.mysql.com/doc/refman/5.7/en/metadata-locking.html) in the MySQL documentation.

The MySQL process list shows such a session in the state `waiting for metadata lock`. In Performance Insights, if the Performance Schema is turned on, the event `synch/cond/sql/MDL_context::COND_wait_status` appears.
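
If the Performance Schema is turned on, you can also list pending metadata lock requests directly. The following query is a sketch; it assumes the `wait/lock/metadata/sql/mdl` instrument is enabled, which it might not be by default.

```
SELECT object_type, object_schema, object_name,
       lock_type, lock_status, owner_thread_id
FROM performance_schema.metadata_locks
WHERE lock_status = 'PENDING';
```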

The default timeout for a query waiting on a metadata lock is based on the value of the `lock_wait_timeout` parameter, which defaults to 31,536,000 seconds (365 days).
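
Because of this very long default, a single DDL statement stuck behind a metadata lock can queue all later queries on the same table. As a hedged example, you can lower the timeout for a session that runs DDL so that the statement fails quickly instead of blocking indefinitely; the table and column names are illustrative.

```
-- Give up on the metadata lock after 5 minutes instead of 365 days
SET SESSION lock_wait_timeout = 300;

ALTER TABLE `sampleDB`.`sampleTable` ADD COLUMN sampleCol4 INT;
```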

For more details on different InnoDB locks and the types of locks that can cause conflicts, see [InnoDB Locking](https://dev.mysql.com/doc/refman/5.7/en/innodb-locking.html) in the MySQL documentation.

## Likely causes of increased waits


When the `synch/cond/sql/MDL_context::COND_wait_status` event appears more than normal, possibly indicating a performance problem, typical causes include the following:

**Long-running transactions**  
One or more transactions are modifying a large amount of data and holding locks on tables for a very long time.

**Idle transactions**  
One or more transactions remain open for a long time, without being committed or rolled back.

**DDL statements on large tables**  
One or more data definition language (DDL) statements, such as `ALTER TABLE` commands, were run on very large tables.

**Explicit table locks**  
There are explicit locks on tables that aren't being released in a timely manner. For example, an application might run `LOCK TABLES` statements improperly.

## Actions


We recommend different actions depending on the causes of your wait event and on the version of the Aurora MySQL DB cluster.

**Topics**
+ [

### Identify the sessions and queries causing the events
](#ams-waits.cond-wait-status.actions.identify)
+ [

### Check for past events
](#ams-waits.cond-wait-status.actions.past-events)
+ [

### Run queries on Aurora MySQL version 2
](#ams-waits.cond-wait-status.actions.run-queries-aurora-mysql-57)
+ [

### Respond to the blocking session
](#ams-waits.cond-wait-status.actions.blocker)

### Identify the sessions and queries causing the events


You can use Performance Insights to show queries blocked by the `synch/cond/sql/MDL_context::COND_wait_status` wait event. However, to identify the blocking session, query metadata tables from `performance_schema` and `information_schema` on the DB cluster.
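As a sketch, on Aurora MySQL version 3 you can query `performance_schema.metadata_locks` directly to map each lock to its owning session. This assumes the `wait/lock/metadata/sql/mdl` instrument is enabled:

```
-- Map each table metadata lock to the process list ID that owns it.
SELECT ml.object_schema, ml.object_name,
       ml.lock_type, ml.lock_status,
       t.processlist_id
FROM performance_schema.metadata_locks ml
JOIN performance_schema.threads t
  ON ml.owner_thread_id = t.thread_id
WHERE ml.object_type = 'TABLE';
```

Rows with a `lock_status` of `PENDING` are waiting; a `GRANTED` row on the same table identifies the blocking session.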

Typically, databases with moderate to significant load have wait events. The wait events might be acceptable if performance is optimal. If performance isn't optimal, then examine where the database is spending the most time. Look at the wait events that contribute to the highest load, and find out whether you can optimize the database and application to reduce those events.

**To find SQL queries that are responsible for high load**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard for that DB instance appears.

1. In the **Database load** chart, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the AWS Database Blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Check for past events


You can check for past occurrences of this wait event to gain insight into it. To do so, complete the following actions:
+ Check the data manipulation language (DML) and DDL throughput and latency to see if there were any changes in workload.

  You can use Performance Insights to find queries waiting on this event at the time of the issue. Also, you can view the digest of the queries run near the time of issue.
+ If audit logs or general logs are turned on for the DB cluster, you can check for all queries run on the objects (schema.table) involved in the waiting transaction. You can also check for the queries that completed running before the transaction.

The information available to troubleshoot past events is limited. These checks don't show which object the transaction was waiting for. However, you can identify tables that were under heavy load at the time of the event and the set of frequently accessed rows causing the conflict. You can then use this information to reproduce the issue in a test environment and gain insight into its cause.

### Run queries on Aurora MySQL version 2


In Aurora MySQL version 2, you can identify the blocked session directly by querying `performance_schema` tables or `sys` schema views. The following example illustrates how to query these views to identify blocking queries and sessions.

In the following process list output, the connection ID `89` is waiting on a metadata lock, and it's running a `TRUNCATE TABLE` command.

```
MySQL [(none)]> select @@version, @@aurora_version;
+-----------+------------------+
| @@version | @@aurora_version |
+-----------+------------------+
| 5.7.12    | 2.11.5           |
+-----------+------------------+
1 row in set (0.01 sec)

MySQL [(none)]> show processlist;
+----+-----------------+--------------------+-----------+---------+------+---------------------------------+-------------------------------+
| Id | User            | Host               | db        | Command | Time | State                           | Info                          |
+----+-----------------+--------------------+-----------+---------+------+---------------------------------+-------------------------------+
|  2 | rdsadmin        | localhost          | NULL      | Sleep   |    0 | NULL                            | NULL                          |
|  4 | rdsadmin        | localhost          | NULL      | Sleep   |    2 | NULL                            | NULL                          |
|  5 | rdsadmin        | localhost          | NULL      | Sleep   |    1 | NULL                            | NULL                          |
| 20 | rdsadmin        | localhost          | NULL      | Sleep   |    0 | NULL                            | NULL                          |
| 21 | rdsadmin        | localhost          | NULL      | Sleep   |  261 | NULL                            | NULL                          |
| 66 | auroramysql5712 | 172.31.21.51:52154 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 67 | auroramysql5712 | 172.31.21.51:52158 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 68 | auroramysql5712 | 172.31.21.51:52150 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 69 | auroramysql5712 | 172.31.21.51:52162 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 70 | auroramysql5712 | 172.31.21.51:52160 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 71 | auroramysql5712 | 172.31.21.51:52152 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 72 | auroramysql5712 | 172.31.21.51:52156 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 73 | auroramysql5712 | 172.31.21.51:52164 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 74 | auroramysql5712 | 172.31.21.51:52166 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 75 | auroramysql5712 | 172.31.21.51:52168 | sbtest123 | Sleep   |    0 | NULL                            | NULL                          |
| 76 | auroramysql5712 | 172.31.21.51:52170 | NULL      | Query   |    0 | starting                        | show processlist              |
| 88 | auroramysql5712 | 172.31.21.51:52194 | NULL      | Query   |   22 | User sleep                      | select sleep(10000)           |
| 89 | auroramysql5712 | 172.31.21.51:52196 | NULL      | Query   |    5 | Waiting for table metadata lock | truncate table sbtest.sbtest1 |
+----+-----------------+--------------------+-----------+---------+------+---------------------------------+-------------------------------+
18 rows in set (0.00 sec)
```

Next, a query on the `performance_schema` tables or `sys` schema views shows that the blocking session is `76`.

```
MySQL [(none)]> select * from sys.schema_table_lock_waits;                                                                
+---------------+-------------+-------------------+-------------+------------------------------+-------------------+-----------------------+-------------------------------+--------------------+-----------------------------+-----------------------------+--------------------+--------------+------------------------------+--------------------+------------------------+-------------------------+------------------------------+
| object_schema | object_name | waiting_thread_id | waiting_pid | waiting_account              | waiting_lock_type | waiting_lock_duration | waiting_query                 | waiting_query_secs | waiting_query_rows_affected | waiting_query_rows_examined | blocking_thread_id | blocking_pid | blocking_account             | blocking_lock_type | blocking_lock_duration | sql_kill_blocking_query | sql_kill_blocking_connection |
+---------------+-------------+-------------------+-------------+------------------------------+-------------------+-----------------------+-------------------------------+--------------------+-----------------------------+-----------------------------+--------------------+--------------+------------------------------+--------------------+------------------------+-------------------------+------------------------------+
| sbtest        | sbtest1     |               121 |          89 | auroramysql5712@192.0.2.0    | EXCLUSIVE         | TRANSACTION           | truncate table sbtest.sbtest1 |                 10 |                           0 |                           0 |                108 |           76 | auroramysql5712@192.0.2.0    | SHARED_READ        | TRANSACTION            | KILL QUERY 76           | KILL 76                      |
+---------------+-------------+-------------------+-------------+------------------------------+-------------------+-----------------------+-------------------------------+--------------------+-----------------------------+-----------------------------+--------------------+--------------+------------------------------+--------------------+------------------------+-------------------------+------------------------------+
1 row in set (0.00 sec)
```

### Respond to the blocking session


When you identify the session, your options include the following:
+ Contact the application owner or the user.
+ If the blocking session is idle, consider ending the blocking session. This action might trigger a long rollback. To learn how to end a session, see [Ending a session or query](mysql-stored-proc-ending.md).
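On Aurora MySQL, you end a session with the `mysql.rds_kill` stored procedures rather than the plain `KILL` statement. Using the blocking connection ID `76` from the earlier example:

```
-- End only the running statement, keeping the connection open.
CALL mysql.rds_kill_query(76);

-- Or end the session entirely. This might trigger a long rollback.
CALL mysql.rds_kill(76);
```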

For more information about identifying blocking transactions, see [Using InnoDB Transaction and Locking Information](https://dev.mysql.com/doc/refman/5.7/en/innodb-information-schema-examples.html) in the MySQL documentation.

# synch/mutex/innodb/aurora\_lock\_thread\_slot\_futex


The `synch/mutex/innodb/aurora_lock_thread_slot_futex` event occurs when one session has locked a row for an update, and another session tries to update the same row. For more information, see [InnoDB locking](https://dev.mysql.com/doc/refman/5.7/en/innodb-locking.html) in the *MySQL Reference*.



## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 2

## Likely causes of increased waits


Multiple data manipulation language (DML) statements are accessing the same row or rows simultaneously.

## Actions


We recommend different actions depending on the other wait events that you see.

**Topics**
+ [

### Find and respond to the SQL statements responsible for this wait event
](#ams-waits.waitsynch.actions.id)
+ [

### Find and respond to the blocking session
](#ams-waits.waitsynch.actions.blocker)

### Find and respond to the SQL statements responsible for this wait event


Use Performance Insights to identify the SQL statements responsible for this wait event. Consider the following strategies:
+ If row locks are a persistent problem, consider rewriting the application to use optimistic locking.
+ Use multirow statements.
+ Spread the workload over different database objects. You can do this through partitioning.
+ Check the value of the `innodb_lock_wait_timeout` parameter. It controls how long transactions wait before generating a timeout error.
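As a sketch, you can inspect this timeout and lower it for sessions where failing fast is preferable to waiting on a contended row; the 10-second value is only an illustrative choice:

```
-- Default is 50 seconds.
SELECT @@innodb_lock_wait_timeout;

-- Surface row-lock conflicts sooner for this session (illustrative value).
SET SESSION innodb_lock_wait_timeout = 10;
```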

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Find and respond to the blocking session


Determine whether the blocking session is idle or active. Also, find out whether the session comes from an application or an active user.

To identify the session holding the lock, you can run `SHOW ENGINE INNODB STATUS`. The following example shows sample output.

```
mysql> SHOW ENGINE INNODB STATUS;

---------------------TRANSACTION 302631452, ACTIVE 2 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 80109, OS thread handle 0x2ae915060700, query id 938819 10.0.4.12 reinvent updating
UPDATE sbtest1 SET k=k+1 WHERE id=503
------- TRX HAS BEEN WAITING 2 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 148 page no 11 n bits 30 index `PRIMARY` of table `sysbench2`.`sbtest1` trx id 302631452 lock_mode X locks rec but not gap waiting
Record lock, heap no 30 PHYSICAL RECORD: n_fields 6; compact format; info bits 0
```

Or you can use the following query to extract details on current locks.

```
mysql> SELECT p1.id waiting_thread,
              p1.user waiting_user,
              p1.host waiting_host,
              it1.trx_query waiting_query,        
              ilw.requesting_trx_id waiting_transaction, 
              ilw.blocking_lock_id blocking_lock, 
              il.lock_mode blocking_mode,
              il.lock_type blocking_type,
              ilw.blocking_trx_id blocking_transaction,
              CASE it.trx_state 
                WHEN 'LOCK WAIT' 
                THEN it.trx_state 
                ELSE p.state 
              END blocker_state, 
              il.lock_table locked_table,        
              it.trx_mysql_thread_id blocker_thread, 
              p.user blocker_user, 
              p.host blocker_host 
       FROM information_schema.innodb_lock_waits ilw 
       JOIN information_schema.innodb_locks il 
         ON ilw.blocking_lock_id = il.lock_id 
        AND ilw.blocking_trx_id = il.lock_trx_id
       JOIN information_schema.innodb_trx it 
         ON ilw.blocking_trx_id = it.trx_id
       JOIN information_schema.processlist p 
         ON it.trx_mysql_thread_id = p.id 
       JOIN information_schema.innodb_trx it1 
         ON ilw.requesting_trx_id = it1.trx_id 
       JOIN information_schema.processlist p1 
         ON it1.trx_mysql_thread_id = p1.id\G

*************************** 1. row ***************************
      waiting_thread: 3561959471
        waiting_user: reinvent
        waiting_host: 123.456.789.012:20485
       waiting_query: select id1 from test.t1 where id1=1 for update
 waiting_transaction: 312337314
       blocking_lock: 312337287:261:3:2
       blocking_mode: X
       blocking_type: RECORD
blocking_transaction: 312337287
       blocker_state: User sleep
        locked_table: `test`.`t1`
      blocker_thread: 3561223876
        blocker_user: reinvent
        blocker_host: 123.456.789.012:17746
1 row in set (0.04 sec)
```

When you identify the session, your options include the following:
+ Contact the application owner or the user.
+ If the blocking session is idle, consider ending the blocking session. This action might trigger a long rollback. To learn how to end a session, see [Ending a session or query](mysql-stored-proc-ending.md).

For more information about identifying blocking transactions, see [Using InnoDB Transaction and Locking Information](https://dev.mysql.com/doc/refman/5.7/en/innodb-information-schema-examples.html) in the *MySQL Reference Manual*.

# synch/mutex/innodb/buf\_pool\_mutex


The `synch/mutex/innodb/buf_pool_mutex` event occurs when a thread has acquired a lock on the InnoDB buffer pool to access a page in memory.

**Topics**
+ [

## Relevant engine versions
](#ams-waits.bufpoolmutex.context.supported)
+ [

## Context
](#ams-waits.bufpoolmutex.context)
+ [

## Likely causes of increased waits
](#ams-waits.bufpoolmutex.causes)
+ [

## Actions
](#ams-waits.bufpoolmutex.actions)

## Relevant engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 2

## Context


The `buf_pool` mutex is a single mutex that protects the control data structures of the buffer pool.

For more information, see [Monitoring InnoDB Mutex Waits Using Performance Schema](https://dev.mysql.com/doc/refman/5.7/en/monitor-innodb-mutex-waits-performance-schema.html) in the MySQL documentation.
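If the wait instruments are enabled in the Performance Schema, you can measure how much time this mutex is costing with a summary query such as the following sketch:

```
-- Cumulative waits on the buffer pool mutex since the counters were reset.
-- SUM_TIMER_WAIT is in picoseconds; divide by 10^9 for milliseconds.
SELECT EVENT_NAME, COUNT_STAR,
       SUM_TIMER_WAIT/1000000000 AS sum_wait_ms
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME LIKE 'wait/synch/mutex/innodb/buf_pool%'
ORDER BY SUM_TIMER_WAIT DESC;
```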

## Likely causes of increased waits


This is a workload-specific wait event. Common causes for `synch/mutex/innodb/buf_pool_mutex` to appear among the top wait events include the following:
+ The buffer pool size isn't large enough to hold the working set of data.
+ The workload concentrates on certain pages from a specific table in the database, leading to contention in the buffer pool.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Identify the sessions and queries causing the events
](#ams-waits.bufpoolmutex.actions.identify)
+ [

### Use Performance Insights
](#ams-waits.bufpoolmutex.actions.action1)
+ [

### Create Aurora Replicas
](#ams-waits.bufpoolmutex.actions.action2)
+ [

### Examine the buffer pool size
](#ams-waits.bufpoolmutex.actions.action3)
+ [

### Monitor the global status history
](#ams-waits.bufpoolmutex.actions.action4)

### Identify the sessions and queries causing the events


Typically, databases with moderate to significant load have wait events. The wait events might be acceptable if performance is optimal. If performance isn't optimal, then examine where the database is spending the most time. Look at the wait events that contribute to the highest load, and find out whether you can optimize the database and application to reduce those events.

**To view the Top SQL chart in the AWS Management Console**

1. Open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard is shown for that DB instance.

1. In the **Database load** chart, choose **Slice by wait**.

1. Underneath the **Database load** chart, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Use Performance Insights


This event is related to workload. You can use Performance Insights to do the following:
+ Identify when wait events start, and whether there's any change in the workload around that time from the application logs or related sources.
+ Identify the SQL statements responsible for this wait event. Examine the execution plan of the queries to make sure that these queries are optimized and using appropriate indexes.

  If the top queries responsible for the wait event are related to the same database object or table, then consider partitioning that object or table.

### Create Aurora Replicas


You can create Aurora Replicas to serve read-only traffic. You can also use Aurora Auto Scaling to handle surges in read traffic. Make sure to run scheduled read-only tasks and logical backups on Aurora Replicas.

For more information, see [Amazon Aurora Auto Scaling with Aurora Replicas](Aurora.Integrating.AutoScaling.md).

### Examine the buffer pool size


Check whether the buffer pool size is sufficient for the workload by looking at the metric `innodb_buffer_pool_wait_free`. If the value of this metric is high and increasing continuously, that indicates that the size of the buffer pool isn't sufficient to handle the workload. If `innodb_buffer_pool_size` has been set properly, the value of `innodb_buffer_pool_wait_free` should be small. For more information, see [Innodb\_buffer\_pool\_wait\_free](https://dev.mysql.com/doc/refman/5.7/en/server-status-variables.html#statvar_Innodb_buffer_pool_wait_free) in the MySQL documentation.

Increase the buffer pool size if the DB instance has enough memory for session buffers and operating-system tasks. If it doesn't, change the DB instance to a larger DB instance class to get additional memory that can be allocated to the buffer pool.
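The following sketch shows the checks described above:

```
-- A high and continuously increasing value suggests the buffer pool
-- is too small for the working set.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_wait_free';

-- Current buffer pool size, in bytes.
SELECT @@innodb_buffer_pool_size;
```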

**Note**  
Aurora MySQL automatically adjusts the value of `innodb_buffer_pool_instances` based on the configured `innodb_buffer_pool_size`.

### Monitor the global status history


By monitoring the change rates of status variables, you can detect locking or memory issues on your DB instance. Turn on Global Status History (GoSH) if it isn't already turned on. For more information on GoSH, see [Managing the global status history](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.MySQL.CommonDBATasks.html#Appendix.MySQL.CommonDBATasks.GoSH).

You can also create custom Amazon CloudWatch metrics to monitor status variables. For more information, see [Publishing custom metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/publishingMetrics.html).

# synch/mutex/innodb/fil\_system\_mutex


The `synch/mutex/innodb/fil_system_mutex` event occurs when a session is waiting to access the tablespace memory cache.

**Topics**
+ [

## Supported engine versions
](#ams-waits.innodb-fil-system-mutex.context.supported)
+ [

## Context
](#ams-waits.innodb-fil-system-mutex.context)
+ [

## Likely causes of increased waits
](#ams-waits.innodb-fil-system-mutex.causes)
+ [

## Actions
](#ams-waits.innodb-fil-system-mutex.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL versions 2 and 3

## Context


InnoDB uses tablespaces to manage the storage area for tables and log files. The *tablespace memory cache* is a global memory structure that maintains information about tablespaces. MySQL uses `synch/mutex/innodb/fil_system_mutex` waits to control concurrent access to the tablespace memory cache. 

The event `synch/mutex/innodb/fil_system_mutex` indicates that there is currently more than one operation that needs to retrieve and manipulate information in the tablespace memory cache for the same tablespace.

## Likely causes of increased waits


When the `synch/mutex/innodb/fil_system_mutex` event appears more than normal, possibly indicating a performance problem, this typically occurs when all of the following conditions are present:
+ An increase in concurrent data manipulation language (DML) operations that update or delete data in the same table.
+ The tablespace for this table is very large and has a lot of data pages.
+ The fill factor for these data pages is low.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Identify the sessions and queries causing the events
](#ams-waits.innodb-fil-system-mutex.actions.identify)
+ [

### Reorganize large tables during off-peak hours
](#ams-waits.innodb-fil-system-mutex.actions.reorganize)

### Identify the sessions and queries causing the events


Typically, databases with moderate to significant load have wait events. The wait events might be acceptable if performance is optimal. If performance isn't optimal, examine where the database is spending the most time. Look at the wait events that contribute to the highest load, and find out whether you can optimize the database and application to reduce those events.

**To find SQL queries that are responsible for high load**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard appears for that DB instance.

1. In the **Database load** chart, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

Another way to find out which queries are causing high numbers of `synch/mutex/innodb/fil_system_mutex` waits is to check `performance_schema`, as in the following example.

```
mysql> select * from performance_schema.events_waits_current where EVENT_NAME='wait/synch/mutex/innodb/fil_system_mutex'\G
*************************** 1. row ***************************
            THREAD_ID: 19
             EVENT_ID: 195057
         END_EVENT_ID: 195057
           EVENT_NAME: wait/synch/mutex/innodb/fil_system_mutex
               SOURCE: fil0fil.cc:6700
          TIMER_START: 1010146190118400
            TIMER_END: 1010146196524000
           TIMER_WAIT: 6405600
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
           INDEX_NAME: NULL
          OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 47285552262176
     NESTING_EVENT_ID: NULL
   NESTING_EVENT_TYPE: NULL
            OPERATION: lock
      NUMBER_OF_BYTES: NULL
                FLAGS: NULL
*************************** 2. row ***************************
            THREAD_ID: 23
             EVENT_ID: 5480
         END_EVENT_ID: 5480
           EVENT_NAME: wait/synch/mutex/innodb/fil_system_mutex
               SOURCE: fil0fil.cc:5906
          TIMER_START: 995269979908800
            TIMER_END: 995269980159200
           TIMER_WAIT: 250400
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
           INDEX_NAME: NULL
          OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 47285552262176
     NESTING_EVENT_ID: NULL
   NESTING_EVENT_TYPE: NULL
            OPERATION: lock
      NUMBER_OF_BYTES: NULL
                FLAGS: NULL
*************************** 3. row ***************************
            THREAD_ID: 55
             EVENT_ID: 23233794
         END_EVENT_ID: NULL
           EVENT_NAME: wait/synch/mutex/innodb/fil_system_mutex
               SOURCE: fil0fil.cc:449
          TIMER_START: 1010492125341600
            TIMER_END: 1010494304900000
           TIMER_WAIT: 2179558400
                SPINS: NULL
        OBJECT_SCHEMA: NULL
          OBJECT_NAME: NULL
           INDEX_NAME: NULL
          OBJECT_TYPE: NULL
OBJECT_INSTANCE_BEGIN: 47285552262176
     NESTING_EVENT_ID: 23233786
   NESTING_EVENT_TYPE: WAIT
            OPERATION: lock
      NUMBER_OF_BYTES: NULL
                FLAGS: NULL
```

### Reorganize large tables during off-peak hours


Reorganize large tables that you identify as the source of high numbers of `synch/mutex/innodb/fil_system_mutex` wait events during a maintenance window outside of production hours. Doing so ensures that the internal tablespace map cleanup doesn't occur when quick access to the table is critical. For information about reorganizing tables, see [OPTIMIZE TABLE Statement](https://dev.mysql.com/doc/refman/5.7/en/optimize-table.html) in the *MySQL Reference*.
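For example, to rebuild the `sbtest.sbtest1` table from the earlier examples (the table name here is only illustrative):

```
-- For InnoDB tables, OPTIMIZE TABLE maps to ALTER TABLE ... FORCE,
-- which rebuilds the table and reorganizes its tablespace.
OPTIMIZE TABLE sbtest.sbtest1;
```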

# synch/mutex/innodb/trx\_sys\_mutex


The `synch/mutex/innodb/trx_sys_mutex` event occurs when there is high database activity with a large number of transactions.

**Topics**
+ [

## Relevant engine versions
](#ams-waits.trxsysmutex.context.supported)
+ [

## Context
](#ams-waits.trxsysmutex.context)
+ [

## Likely causes of increased waits
](#ams-waits.trxsysmutex.causes)
+ [

## Actions
](#ams-waits.trxsysmutex.actions)

## Relevant engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL versions 2 and 3

## Context


Internally, the InnoDB database engine uses the repeatable read isolation level with snapshots to provide read consistency. This gives you a point-in-time view of the database at the time the snapshot was created.

In InnoDB, all changes are applied to the database as soon as they arrive, regardless of whether they're committed. This approach means that without multiversion concurrency control (MVCC), all users connected to the database see all of the changes and the latest rows. Therefore, InnoDB requires a way to track the changes to understand what to roll back when necessary.

To do this, InnoDB uses a transaction system (`trx_sys`) to track snapshots. The transaction system does the following:
+ Tracks the transaction ID for each row in the undo logs.
+ Uses an internal InnoDB structure called `ReadView` that helps to identify which transaction IDs are visible for a snapshot.

## Likely causes of increased waits


Any database operation that requires the consistent and controlled handling (creating, reading, updating, and deleting) of transaction IDs generates a call from `trx_sys` to the mutex.

These calls happen inside three functions:
+ `trx_sys_mutex_enter` – Creates the mutex.
+ `trx_sys_mutex_exit` – Releases the mutex.
+ `trx_sys_mutex_own` – Tests whether the mutex is owned.

The InnoDB Performance Schema instrumentation tracks all `trx_sys` mutex calls. Tracking includes, but isn't limited to, management of `trx_sys` on database startup or shutdown, rollback operations, undo cleanups, row read access, and buffer pool loads. High database activity with a large number of transactions results in `synch/mutex/innodb/trx_sys_mutex` appearing among the top wait events.
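If the wait instruments are enabled, you can compare the cumulative cost of this mutex with the number of currently open transactions, as in the following sketch:

```
-- Cumulative waits on trx_sys (SUM_TIMER_WAIT is in picoseconds).
SELECT EVENT_NAME, COUNT_STAR,
       SUM_TIMER_WAIT/1000000000 AS sum_wait_ms
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME = 'wait/synch/mutex/innodb/trx_sys_mutex';

-- Number of transactions currently open in InnoDB.
SELECT COUNT(*) FROM information_schema.innodb_trx;
```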

For more information, see [Monitoring InnoDB Mutex Waits Using Performance Schema](https://dev.mysql.com/doc/refman/5.7/en/monitor-innodb-mutex-waits-performance-schema.html) in the MySQL documentation.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Identify the sessions and queries causing the events
](#ams-waits.trxsysmutex.actions.identify)
+ [

### Examine other wait events
](#ams-waits.trxsysmutex.actions.action1)

### Identify the sessions and queries causing the events


Typically, databases with moderate to significant load have wait events. The wait events might be acceptable if performance is optimal. If performance isn't optimal, then examine where the database is spending the most time. Look at the wait events that contribute to the highest load. Find out whether you can optimize the database and application to reduce those events.

**To view the Top SQL chart in the AWS Management Console**

1. Open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard is shown for that DB instance.

1. In the **Database load** chart, choose **Slice by wait**.

1. Under the **Database load** chart, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Examine other wait events


Examine the other wait events associated with the `synch/mutex/innodb/trx_sys_mutex` wait event. Doing this can provide more information about the nature of the workload. A large number of transactions might reduce throughput, but the workload might also make this necessary.

For more information on how to optimize transactions, see [Optimizing InnoDB Transaction Management](https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-transaction-management.html) in the MySQL documentation.

# synch/sxlock/innodb/hash_table_locks


The `synch/sxlock/innodb/hash_table_locks` event occurs when pages not found in the buffer pool must be read from storage.

**Topics**
+ [

## Supported engine versions
](#ams-waits.sx-lock-hash-table-locks.context.supported)
+ [

## Context
](#ams-waits.sx-lock-hash-table-locks.context)
+ [

## Likely causes of increased waits
](#ams-waits.sx-lock-hash-table-locks.causes)
+ [

## Actions
](#ams-waits.sx-lock-hash-table-locks.actions)

## Supported engine versions


This wait event information is supported for the following versions:
+ Aurora MySQL versions 2 and 3

## Context


The event `synch/sxlock/innodb/hash_table_locks` indicates that a workload is frequently accessing data that isn't stored in the buffer pool. This wait event is associated with adding new pages to the buffer pool and evicting old ones. As cached data ages and new data must be cached, aged pages are evicted to make room for the new pages. MySQL uses a least recently used (LRU) algorithm to evict pages from the buffer pool. The workload is trying to access data that hasn't been loaded into the buffer pool or that has been evicted from it.

This wait event occurs when the workload must access the data in files on disk or when blocks are freed from or added to the buffer pool's LRU list. These operations wait to obtain a shared exclusive lock (SX-lock). This SX-lock synchronizes access to the *hash table*, an in-memory table designed to improve buffer pool access performance.

For more information, see [Buffer Pool](https://dev.mysql.com/doc/refman/5.7/en/innodb-buffer-pool.html) in the MySQL documentation.

## Likely causes of increased waits


When the `synch/sxlock/innodb/hash_table_locks` wait event appears more than normal, possibly indicating a performance problem, typical causes include the following:

**An undersized buffer pool**  
The size of the buffer pool is too small to keep all of the frequently accessed pages in memory.

**Heavy workload**  
The workload is causing frequent evictions of data pages from, and reloads into, the buffer pool.

**Errors reading the pages**  
There are errors reading pages in the buffer pool, which might indicate data corruption.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Increase the size of the buffer pool
](#ams-waits.sx-lock-hash-table-locks.actions.increase-buffer-pool-size)
+ [

### Improve data access patterns
](#ams-waits.sx-lock-hash-table-locks.actions.improve-data-access-patterns)
+ [

### Reduce or avoid full-table scans
](#ams-waits.sx-lock-hash-table-locks.actions.reduce-full-table-scans)
+ [

### Check the error logs for page corruption
](#ams-waits.sx-lock-hash-table-locks.actions.check-error-logs)

### Increase the size of the buffer pool


Make sure that the buffer pool is appropriately sized for the workload. To do so, you can check the buffer pool cache hit rate. Typically, if the value drops below 95 percent, consider increasing the buffer pool size. A larger buffer pool can keep frequently accessed pages in memory longer. To increase the size of the buffer pool, modify the value of the `innodb_buffer_pool_size` parameter. The default value of this parameter is based on the DB instance class size. For more information, see [ Best practices for Amazon Aurora MySQL database configuration](https://aws.amazon.com/blogs/database/best-practices-for-amazon-aurora-mysql-database-configuration/).
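To check the hit rate, you can compare physical disk reads with logical read requests using the server status counters. This is a sketch; the counters are cumulative since startup, so for a current rate, sample them twice over an interval and use the difference:

```
-- Hit rate = 1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)
-- Innodb_buffer_pool_reads counts requests that had to go to storage.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

If the computed hit rate stays below roughly 95 percent under a representative workload, that supports increasing `innodb_buffer_pool_size`.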

### Improve data access patterns


Check the queries affected by this wait and their execution plans. Consider improving data access patterns. For example, if you are using [mysqli_result::fetch_array](https://www.php.net/manual/en/mysqli-result.fetch-array.php), you can try increasing the array fetch size.

You can use Performance Insights to show queries and sessions that might be causing the `synch/sxlock/innodb/hash_table_locks` wait event.

**To find SQL queries that are responsible for high load**

1. Sign in to the AWS Management Console and open the Amazon RDS console at [https://console.aws.amazon.com/rds/](https://console.aws.amazon.com/rds/).

1. In the navigation pane, choose **Performance Insights**.

1. Choose a DB instance. The Performance Insights dashboard is shown for that DB instance.

1. In the **Database load** chart, choose **Slice by wait**.

1. At the bottom of the page, choose **Top SQL**.

   The chart lists the SQL queries that are responsible for the load. Those at the top of the list are most responsible. To resolve a bottleneck, focus on these statements.

For a useful overview of troubleshooting using Performance Insights, see the AWS Database Blog post [Analyze Amazon Aurora MySQL Workloads with Performance Insights](https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/).

### Reduce or avoid full-table scans


Monitor your workload to see if it's running full-table scans, and, if it is, reduce or avoid them. For example, you can monitor status variables such as `Handler_read_rnd_next`. For more information, see [Server Status Variables](https://dev.mysql.com/doc/refman/5.7/en/server-status-variables.html#statvar_Handler_read_rnd_next) in the MySQL documentation.
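For example, a rapidly growing `Handler_read_rnd_next` counter suggests sequential scans. If the `sys` schema is available in your environment, you can also list the statements that perform full table scans. A sketch:

```
SHOW GLOBAL STATUS LIKE 'Handler_read%';

-- With the sys schema (MySQL 5.7 and higher):
SELECT query, exec_count, no_index_used_count
FROM sys.statements_with_full_table_scans
ORDER BY exec_count DESC
LIMIT 10;
```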

### Check the error logs for page corruption


Check the mysql-error.log for corruption-related messages detected near the time of the issue. The error log contains messages that can help you resolve the issue. You might need to recreate objects that were reported as corrupted.

# synch/mutex/innodb/temp_pool_manager_mutex


The `synch/mutex/innodb/temp_pool_manager_mutex` wait event occurs when a session is waiting to acquire a mutex for managing the pool of session temporary tablespaces.

**Topics**
+ [

## Supported engine versions
](#ams-waits.io-temppoolmanager.context.supported)
+ [

## Context
](#ams-waits.io-temppoolmanager.context)
+ [

## Likely causes of increased waits
](#ams-waits.io-temppoolmanager.causes)
+ [

## Actions
](#ams-waits.io-temppoolmanager.actions)

## Supported engine versions


This wait event information is supported for the following engine versions:
+ Aurora MySQL version 3

## Context


Aurora MySQL version 3.x and higher uses `temp_pool_manager_mutex` to control multiple sessions accessing the temporary tablespace pool at the same time. Aurora MySQL manages storage through an Aurora cluster volume for persistent data and local storage for temporary files. A temporary tablespace is needed when a session creates a temporary table on the Aurora cluster volume. 

When a session first requests a temporary tablespace, MySQL allocates session temporary tablespaces from the shared pool. A session can hold up to 2 temporary tablespaces at a time for the following table types:
+ User-created temporary tables
+ Optimizer-generated internal temporary tables

The default `TempTable` engine uses the following overflow mechanism to handle temporary tables:
+ Stores tables in RAM up to the [temptable_max_ram](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_temptable_max_ram) limit.
+ Moves to memory-mapped files on local storage when RAM is full.
+ Uses the shared cluster volume when memory-mapped files reach their [temptable_max_mmap](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_temptable_max_mmap) limit.

After temporary tables exceed both the RAM and local storage limits, MySQL manages them using on-disk tablespaces.

When a session requires an on-disk temporary table, MySQL:
+ Looks for available `INACTIVE` tablespaces in the pool to reuse.
+ Creates 10 new tablespaces if no `INACTIVE` spaces exist.

When a session disconnects, MySQL:
+ Truncates the session's temporary tablespaces.
+ Marks them as `INACTIVE` in the pool for reuse.
+ Maintains the current pool size until server restart.
+ Returns to the default pool size (10 tablespaces) after restart.
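In Aurora MySQL version 3, you can inspect the pool directly through the `INNODB_SESSION_TEMP_TABLESPACES` table. The following query is a sketch:

```
-- Each row is one tablespace in the session temporary tablespace pool.
-- STATE is ACTIVE (attached to a session) or INACTIVE (available for reuse).
SELECT ID, SPACE, PATH, SIZE, STATE, PURPOSE
FROM information_schema.INNODB_SESSION_TEMP_TABLESPACES;
```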

## Likely causes of increased waits


Common situations that cause this wait event:
+ Concurrent sessions creating internal temporary tables on the cluster volume.
+ Concurrent sessions creating user temporary tables on the cluster volume.
+ Sudden termination of sessions using active tablespaces.
+ Tablespace pool expansion during heavy write workloads.
+ Concurrent queries accessing `INFORMATION_SCHEMA`.

## Actions


We recommend different actions depending on the causes of your wait event.

**Topics**
+ [

### Monitor and optimize temporary table usage
](#ams-waits.io-temppoolmanager.actions.monitor)
+ [

### Review queries using INFORMATION_SCHEMA
](#ams-waits.io-temppoolmanager.actions.schema-queries)
+ [

### Increase the innodb_sync_array_size parameter
](#ams-waits.io-temppoolmanager.actions.sync_array)
+ [

### Implement connection pooling
](#ams-waits.io-temppoolmanager.actions.connection_pooling)

### Monitor and optimize temporary table usage


To monitor and optimize temporary table usage, use one of these methods:
+ Check the `Created_tmp_disk_tables` counter in Performance Insights to track on-disk temporary table creation across your Aurora cluster.
+ Run `SHOW STATUS LIKE '%created_tmp_disk%';` in your database to directly monitor temporary table creation.

**Note**  
Temporary table behavior differs between Aurora MySQL reader nodes and writer nodes. For more information, see [New temporary table behavior in Aurora MySQL version 3](ams3-temptable-behavior.md).

After identifying queries creating temporary tables, take these optimization steps:
+ Use `EXPLAIN` to examine query execution plans and identify where and why temporary tables are being created.
+ Modify queries to reduce temporary table usage where possible.
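For example, `EXPLAIN` reports `Using temporary` in the `Extra` column when the optimizer materializes an internal temporary table. The table and query here are hypothetical:

```
-- "Using temporary" in Extra indicates an internal temporary table
-- is created for this GROUP BY.
EXPLAIN SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id
ORDER BY COUNT(*) DESC;
```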

If query optimization alone doesn't resolve performance issues, consider adjusting these configuration parameters:
+ [temptable_max_ram](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_temptable_max_ram) – Controls maximum RAM usage for temporary tables.
+ [temptable_max_mmap](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_temptable_max_mmap) – Sets the limit for memory-mapped file storage.
+ [tmp_table_size](https://dev.mysql.com/doc/refman/8.4/en/server-system-variables.html#sysvar_tmp_table_size) – Applies when `aurora_tmptable_enable_per_table_limit` is enabled (disabled by default).

**Important**  
Certain query conditions always require on-disk temporary tables, regardless of configuration settings. For more information about the `TempTable` storage engine, see [Use the TempTable storage engine on Amazon RDS for MySQL and Amazon Aurora MySQL](https://aws.amazon.com/blogs/database/use-the-temptable-storage-engine-on-amazon-rds-for-mysql-and-amazon-aurora-mysql/).

### Review queries using INFORMATION_SCHEMA


When you query `INFORMATION_SCHEMA` tables, MySQL creates InnoDB temporary tables on the cluster volume. Each session needs a temporary tablespace for these tables, which can lead to performance issues during high concurrent access.

To improve performance:
+ Use `PERFORMANCE_SCHEMA` instead of `INFORMATION_SCHEMA` where possible.
+ If you must use `INFORMATION_SCHEMA`, reduce how often you run these queries.

### Increase the innodb_sync_array_size parameter


The `innodb_sync_array_size` parameter controls the size of the mutex/lock wait array in MySQL. The default value of `1` works for general workloads, but increasing it can reduce thread contention during high concurrency.

When your workload shows increasing numbers of waiting threads:
+ Monitor the number of waiting threads in your workload.
+ Set `innodb_sync_array_size` equal to or higher than your instance's vCPU count to split the thread coordination structure and reduce contention.

**Note**  
To determine the number of vCPUs available on your RDS instance, see the vCPU specifications in [ Amazon RDS instance types ](https://aws.amazon.com/rds/instance-types/).

### Implement connection pooling


MySQL assigns a dedicated tablespace to each session that creates a temporary table. This tablespace remains active until the database connection ends. To manage your resources more efficiently:
+ Implement connection pooling to limit the number of active temporary tablespaces.
+ Reuse existing connections instead of creating new ones for each operation.

# Tuning Aurora MySQL with thread states


The following table summarizes the most common general thread states for Aurora MySQL.


| General thread state | Description | 
| --- | --- | 
|  [creating sort index](ams-states.sort-index.md)  |  This thread state indicates that a thread is processing a `SELECT` statement that requires the use of an internal temporary table to sort the data.  | 
|  [sending data](ams-states.sending-data.md)  |  This thread state indicates that a thread is reading and filtering rows for a query to determine the correct result set.  | 

# creating sort index


The `creating sort index` thread state indicates that a thread is processing a `SELECT` statement that requires the use of an internal temporary table to sort the data.

**Topics**
+ [

## Supported engine versions
](#ams-states.sort-index.context.supported)
+ [

## Context
](#ams-states.sort-index.context)
+ [

## Likely causes of increased waits
](#ams-states.sort-index.causes)
+ [

## Actions
](#ams-states.sort-index.actions)

## Supported engine versions


This thread state information is supported for the following versions:
+ Aurora MySQL version 2 up to 2.09.2

## Context


The `creating sort index` state appears when a query with an `ORDER BY` or `GROUP BY` clause can't use an existing index to perform the operation. In this case, MySQL needs to perform a more expensive `filesort` operation. This operation is typically performed in memory if the result set isn't too large. Otherwise, it involves creating a file on disk.

## Likely causes of increased waits


The appearance of `creating sort index` doesn't by itself indicate a problem. If performance is poor, and you see frequent instances of `creating sort index`, the most likely cause is slow queries with `ORDER BY` or `GROUP BY` operators.

## Actions


The general guideline is to find queries with `ORDER BY` or `GROUP BY` clauses that are associated with the increases in the `creating sort index` state. Then see whether adding an index or increasing the sort buffer size solves the problem.

**Topics**
+ [

### Turn on the Performance Schema if it isn't turned on
](#ams-states.sort-index.actions.enable-pfs)
+ [

### Identify the problem queries
](#ams-states.sort-index.actions.identify)
+ [

### Examine the explain plans for filesort usage
](#ams-states.sort-index.actions.plan)
+ [

### Increase the sort buffer size
](#ams-states.sort-index.actions.increasebuffersize)

### Turn on the Performance Schema if it isn't turned on


Performance Insights reports thread states only if Performance Schema instruments aren't turned on. When Performance Schema instruments are turned on, Performance Insights reports wait events instead. Performance Schema instruments provide additional insights and better tools when you investigate potential performance problems. Therefore, we recommend that you turn on the Performance Schema. For more information, see [Overview of the Performance Schema for Performance Insights on Aurora MySQL](USER_PerfInsights.EnableMySQL.md).

### Identify the problem queries


To identify current queries that are causing increases in the `creating sort index` state, run `show processlist` and see if any of the queries have `ORDER BY` or `GROUP BY`. Optionally, run `explain for connection N`, where `N` is the process list ID of the query with `filesort`.
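For example, the following sketch inspects the current workload; `123` stands in for an actual process list ID taken from the `SHOW PROCESSLIST` output:

```
SHOW PROCESSLIST;

-- For a session whose Info column shows an ORDER BY or GROUP BY query:
EXPLAIN FOR CONNECTION 123;
```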

To identify past queries that are causing these increases, turn on the slow query log and find the queries with `ORDER BY`. Run `EXPLAIN` on the slow queries and look for "using filesort." For more information, see [Examine the explain plans for filesort usage](#ams-states.sort-index.actions.plan).

### Examine the explain plans for filesort usage


Identify the statements with `ORDER BY` or `GROUP BY` clauses that result in the `creating sort index` state. 

The following example shows how to run `explain` on a query. The `Extra` column shows that this query uses `filesort`.

```
mysql> explain select * from mytable order by c1 limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: mytable
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2064548
     filtered: 100.00
        Extra: Using filesort
1 row in set, 1 warning (0.01 sec)
```

The following example shows the result of running `EXPLAIN` on the same query after an index is created on column `c1`.

```
mysql> alter table mytable add index (c1);
```

```
mysql> explain select * from mytable order by c1 limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: mytable
   partitions: NULL
         type: index
possible_keys: NULL
          key: c1
      key_len: 1023
          ref: NULL
         rows: 10
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.01 sec)
```

For information on using indexes for sort order optimization, see [ORDER BY Optimization](https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html) in the MySQL documentation.

### Increase the sort buffer size


To see whether a specific query required a `filesort` process that created a file on disk, check the `sort_merge_passes` variable value after running the query. The following shows an example.

```
mysql> show session status like 'sort_merge_passes';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Sort_merge_passes | 0     |
+-------------------+-------+
1 row in set (0.01 sec)

-- Run the query.
mysql> select * from mytable order by u limit 10;
-- Then check the status again.

mysql> show session status like 'sort_merge_passes';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Sort_merge_passes | 0     |
+-------------------+-------+
1 row in set (0.01 sec)
```

If the value of `sort_merge_passes` is high, consider increasing the sort buffer size. Apply the increase at the session level, because increasing it globally can significantly increase the amount of RAM MySQL uses. The following example shows how to change the sort buffer size before running a query. 

```
mysql> set session sort_buffer_size=10*1024*1024;
Query OK, 0 rows affected (0.00 sec)
-- run query
```

# sending data


The `sending data` thread state indicates that a thread is reading and filtering rows for a query to determine the correct result set. The name is misleading because it suggests that the state is transferring data. In fact, the thread is collecting and preparing data to be sent to the client later.

**Topics**
+ [

## Supported engine versions
](#ams-states.sending-data.context.supported)
+ [

## Context
](#ams-states.sending-data.context)
+ [

## Likely causes of increased waits
](#ams-states.sending-data.causes)
+ [

## Actions
](#ams-states.sending-data.actions)

## Supported engine versions


This thread state information is supported for the following versions:
+ Aurora MySQL version 2 up to 2.09.2

## Context


Many thread states are short-lasting. Operations occurring during `sending data` tend to perform large numbers of disk or cache reads. Therefore, `sending data` is often the longest-running state over the lifetime of a given query. This state appears when Aurora MySQL is doing the following:
+ Reading and processing rows for a `SELECT` statement
+ Performing a large number of reads from either disk or memory
+ Completing a full read of all data from a specific query
+ Reading data from a table, an index, or the work of a stored procedure
+ Sorting, grouping, or ordering data

After the `sending data` state finishes preparing the data, the thread state `writing to net` indicates the return of data to the client. Typically, `writing to net` is captured only when the result set is very large or severe network latency is slowing the transfer.

## Likely causes of increased waits


The appearance of `sending data` doesn't by itself indicate a problem. If performance is poor, and you see frequent instances of `sending data`, the most likely causes are as follows.

**Topics**
+ [

### Inefficient query
](#ams-states.sending-data.causes.structure)
+ [

### Suboptimal server configuration
](#ams-states.sending-data.causes.server)

### Inefficient query


In most cases, this state is caused by a query that isn't using an appropriate index to find the result set. For example, consider a query that reads a 10-million-row table for all orders placed in California, where the state column isn't indexed or is poorly indexed. In the latter case, the index might exist, but the optimizer ignores it because of low cardinality.

### Suboptimal server configuration


If several queries appear in the `sending data` state, the database server might be configured poorly. Specifically, the server might have the following issues:
+ The database server doesn't have enough computing capacity: disk I/O, disk type and speed, CPU, or number of CPUs.
+ The server is starved for allocated resources, such as the InnoDB buffer pool for InnoDB tables or the key buffer for MyISAM tables.
+ Per-thread memory settings such as `sort_buffer`, `read_buffer`, and `join_buffer` consume more RAM than required, starving the physical server for memory resources.

## Actions


The general guideline is to find queries that return large numbers of rows by checking the Performance Schema. If logging queries that don't use indexes is turned on, you can also examine the results from the slow logs.
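If the Performance Schema is turned on, the statement digest summary can rank queries by rows examined versus rows returned. A sketch:

```
SELECT DIGEST_TEXT,
       COUNT_STAR AS exec_count,
       SUM_ROWS_EXAMINED AS rows_examined,
       SUM_ROWS_SENT AS rows_sent
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 10;
```

A statement that examines many rows but sends few is a candidate for better indexing.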

**Topics**
+ [

### Turn on the Performance Schema if it isn't turned on
](#ams-states.sending-data.actions.enable-pfs)
+ [

### Examine memory settings
](#ams-states.sending-data.actions.memory)
+ [

### Examine the explain plans for index usage
](#ams-states.sending-data.actions.plans)
+ [

### Check the volume of data returned
](#ams-states.sending-data.actions.maintenance)
+ [

### Check for concurrency issues
](#ams-states.sending-data.actions.concurrent-queries)
+ [

### Check the structure of your queries
](#ams-states.sending-data.actions.subqueries)

### Turn on the Performance Schema if it isn't turned on


Performance Insights reports thread states only if Performance Schema instruments aren't turned on. When Performance Schema instruments are turned on, Performance Insights reports wait events instead. Performance Schema instruments provide additional insights and better tools when you investigate potential performance problems. Therefore, we recommend that you turn on the Performance Schema. For more information, see [Overview of the Performance Schema for Performance Insights on Aurora MySQL](USER_PerfInsights.EnableMySQL.md).

### Examine memory settings


Examine the memory settings for the primary buffer pools. Make sure that these pools are appropriately sized for the workload. If your database uses multiple buffer pool instances, make sure that they aren't divided into many small buffer pools. Threads can only use one buffer pool at a time.

Make sure that the following memory settings used for each thread are properly sized:
+ `read_buffer`
+ `read_rnd_buffer`
+ `sort_buffer`
+ `join_buffer`
+ `binlog_cache`

Unless you have a specific reason to modify the settings, use the default values.
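To review the current values of these settings, you can query the corresponding `*_size` system variables. A sketch:

```
SHOW GLOBAL VARIABLES WHERE Variable_name IN
  ('innodb_buffer_pool_size', 'innodb_buffer_pool_instances',
   'read_buffer_size', 'read_rnd_buffer_size',
   'sort_buffer_size', 'join_buffer_size', 'binlog_cache_size');
```

Remember that the per-thread buffers are allocated per session as needed, so their effective memory cost scales with the number of concurrent connections.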

### Examine the explain plans for index usage


For queries in the `sending data` thread state, examine the plan to determine whether appropriate indexes are used. If a query isn't using a useful index, consider adding hints like `USE INDEX` or `FORCE INDEX`. Hints can greatly increase or decrease the time it takes to run a query, so use care before adding them.

### Check the volume of data returned


Check the tables that are being queried and the amount of data that they contain. Can any of this data be archived? In many cases, the cause of poor query execution times isn't the result of the query plan, but the volume of data to be processed. Many developers are very efficient in adding data to a database but seldom consider dataset life cycle in the design and development phases.

Look for queries that perform well in low-volume databases but perform poorly in your current system. Sometimes developers who design specific queries might not realize that these queries are returning 350,000 rows. The developers might have developed the queries in a lower-volume environment with smaller datasets than production environments have.

### Check for concurrency issues


Check whether multiple queries of the same type are running at the same time. Some forms of queries run efficiently when they run alone. However, if similar forms of query run together, or in high volume, they can cause concurrency issues. Often, these issues are caused when the database uses temp tables to render results. A restrictive transaction isolation level can also cause concurrency issues.

If tables are read and written to concurrently, the database might be using locks. To help identify periods of poor performance, examine the use of databases through large-scale batch processes. To see recent locks and rollbacks, examine the output of the `SHOW ENGINE INNODB STATUS` command.

### Check the structure of your queries


Check whether captured queries from these states use subqueries. This type of query often leads to poor performance because the database compiles the results internally and then substitutes them back into the query to render data. This process is an extra step for the database. In many cases, this step can cause poor performance in a highly concurrent loading condition.

Also check whether your queries use large numbers of `ORDER BY` and `GROUP BY` clauses. In such operations, often the database must first form the entire dataset in memory. Then it must order or group it in a specific manner before returning it to the client.

# Tuning Aurora MySQL with Amazon DevOps Guru proactive insights


DevOps Guru proactive insights detect known problematic conditions on your Aurora MySQL DB clusters before they occur. DevOps Guru can do the following:
+ Prevent many common database issues by cross-checking your database configuration against common recommended settings.
+ Alert you to critical issues in your fleet that, if left unchecked, can lead to larger problems later.
+ Alert you to newly discovered problems.

Every proactive insight contains an analysis of the cause of the problem and recommendations for corrective actions.

**Topics**
+ [

# The InnoDB history list length increased significantly
](proactive-insights.history-list.md)
+ [

# Database is creating temporary tables on disk
](proactive-insights.temp-tables.md)

# The InnoDB history list length increased significantly


Starting on *date*, your history list for row changes increased significantly, up to *length* on *db-instance*. This increase affects query and database shutdown performance.

**Topics**
+ [

## Supported engine versions
](#proactive-insights.history-list.context.supported)
+ [

## Context
](#proactive-insights.history-list.context)
+ [

## Likely causes for this issue
](#proactive-insights.history-list.causes)
+ [

## Actions
](#proactive-insights.history-list.actions)
+ [

## Relevant metrics
](#proactive-insights.history-list.metrics)

## Supported engine versions


This insight information is supported for all versions of Aurora MySQL.

## Context


The InnoDB transaction system maintains multiversion concurrency control (MVCC). When a row is modified, the pre-modification version of the data is stored as an undo record in an undo log. Every undo record has a reference to its previous undo record, forming a linked list.

The InnoDB history list is a global list of the undo logs for committed transactions. MySQL uses the history list to purge records and log pages when transactions no longer require the history. The history list length is the total number of undo logs that contain modifications in the history list. Each log contains one or more modifications. If the InnoDB history list length grows too large, indicating a large number of old row versions, queries and database shutdowns become slower.
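You can track the current history list length directly. The following sketch reads the counter from the `INNODB_METRICS` table:

```
-- Current history list length (also visible as the
-- RollbackSegmentHistoryListLength CloudWatch metric)
SELECT NAME, COUNT
FROM information_schema.INNODB_METRICS
WHERE NAME = 'trx_rseg_history_len';
```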

## Likely causes for this issue


Typical causes of a long history list include the following:
+ Long-running transactions, either read or write
+ A heavy write load

## Actions


We recommend different actions depending on the causes of your insight.

**Topics**
+ [

### Don't begin any operation involving a database shutdown until the InnoDB history list decreases
](#proactive-insights.history-list.actions.no-shutdown)
+ [

### Identify and end long-running transactions
](#proactive-insights.history-list.actions.long-txn)
+ [

### Identify the top hosts and top users by using Performance Insights
](#proactive-insights.history-list.actions.top-PI)

### Don't begin any operation involving a database shutdown until the InnoDB history list decreases


Because a long InnoDB history list slows database shutdowns, reduce the list size before initiating operations involving a database shutdown. These operations include major version database upgrades.

### Identify and end long-running transactions


You can find long-running transactions by querying `information_schema.innodb_trx`.

**Note**  
Make sure also to look for long-running transactions on read replicas.

**To identify and end long-running transactions**

1. In your SQL client, run the following query:

   ```
   SELECT a.trx_id, 
         a.trx_state, 
         a.trx_started, 
         TIMESTAMPDIFF(SECOND,a.trx_started, now()) as "Seconds Transaction Has Been Open", 
         a.trx_rows_modified, 
         b.USER, 
         b.host, 
         b.db, 
         b.command, 
         b.time, 
         b.state 
   FROM  information_schema.innodb_trx a, 
         information_schema.processlist b 
   WHERE a.trx_mysql_thread_id=b.id
     AND TIMESTAMPDIFF(SECOND,a.trx_started, now()) > 10 
   ORDER BY trx_started;
   ```

1. End each long-running transaction with the stored procedure [mysql.rds_kill](mysql-stored-proc-ending.md#mysql_rds_kill).

### Identify the top hosts and top users by using Performance Insights


On the Performance Insights dashboard, slice the database load by **Hosts** or **Users** to find the top contributors to the load. Then optimize the transactions from those sources so that large numbers of modified rows are committed immediately.

## Relevant metrics


The following metrics are related to this insight:
+ `trx_rseg_history_len` – This counter metric can be viewed in Performance Insights, as well as the `INFORMATION_SCHEMA.INNODB_METRICS` table. For more information, see [InnoDB INFORMATION\_SCHEMA metrics table](https://dev.mysql.com/doc/refman/8.0/en/innodb-information-schema-metrics-table.html) in the MySQL documentation.
+ `RollbackSegmentHistoryListLength` – This Amazon CloudWatch metric measures the undo logs that record committed transactions with delete-marked records. These records are scheduled to be processed by the InnoDB purge operation. The metric `trx_rseg_history_len` has the same value as `RollbackSegmentHistoryListLength`.
+ `PurgeBoundary` – The transaction number up to which InnoDB purging is allowed. If this CloudWatch metric doesn't advance for extended periods of time, it's a good indication that InnoDB purging is blocked by long-running transactions. To investigate, check the active transactions on your Aurora MySQL DB cluster. This metric is available only for Aurora MySQL version 2.11 and higher, and version 3.08 and higher.
+ `PurgeFinishedPoint` – The transaction number up to which InnoDB purging is performed. This CloudWatch metric can help you examine how fast InnoDB purging is progressing. This metric is available only for Aurora MySQL version 2.11 and higher, and version 3.08 and higher.
+ `TransactionAgeMaximum` – The age of the oldest active running transaction. This CloudWatch metric is available only for Aurora MySQL version 3.08 and higher.
+ `TruncateFinishedPoint` – The transaction number up to which undo truncation is performed. This CloudWatch metric is available only for Aurora MySQL version 2.11 and higher, and version 3.08 and higher.

For more information on the CloudWatch metrics, see [Instance-level metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md#Aurora.AuroraMySQL.Monitoring.Metrics.instances).

# Database is creating temporary tables on disk


Your recent on-disk temporary table usage increased significantly, up to *percentage*. The database is creating around *number* temporary tables per second. This might impact performance and increase disk operations on *db-instance*.

**Topics**
+ [

## Supported engine versions
](#proactive-insights.temp-tables.context.supported)
+ [

## Context
](#proactive-insights.temp-tables.context)
+ [

## Likely causes for this issue
](#proactive-insights.temp-tables.causes)
+ [

## Actions
](#proactive-insights.temp-tables.actions)
+ [

## Relevant metrics
](#proactive-insights.temp-tables.metrics)

## Supported engine versions


This insight information is supported for all versions of Aurora MySQL.

## Context


Sometimes it's necessary for the MySQL server to create an internal temporary table while processing a query. Aurora MySQL can hold an internal temporary table in memory, where it's processed by the TempTable or MEMORY storage engine, or store it on disk by using InnoDB. For more information, see [Internal Temporary Table Use in MySQL](https://dev.mysql.com/doc/refman/8.0/en/internal-temporary-tables.html) in the *MySQL Reference Manual*.
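You can check whether a particular query needs an internal temporary table by examining its execution plan. A minimal illustration, using a hypothetical `orders` table:

```
-- "Using temporary" in the Extra column indicates that MySQL
-- creates an internal temporary table to resolve the query.
EXPLAIN
SELECT customer_id, COUNT(*)
FROM orders
GROUP BY customer_id
ORDER BY COUNT(*) DESC;
```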

## Likely causes for this issue


An increase in on-disk temporary tables indicates the use of complex queries. If the configured memory is insufficient to store temporary tables in memory, Aurora MySQL creates the tables on disk. This can impact performance and increase disk operations.

## Actions


We recommend different actions depending on the causes of your insight.
+ For Aurora MySQL version 3, we recommend that you use the TempTable storage engine.
+ Optimize your queries to return less data by selecting only necessary columns.

  If you turn on the Performance Schema with all `statement` instruments enabled and timed, you can query `sys.statements_with_temp_tables` to retrieve the list of queries that use temporary tables. For more information, see [Prerequisites for Using the sys Schema](https://dev.mysql.com/doc/refman/8.0/en/sys-schema-prerequisites.html) in the MySQL documentation.
+ Consider indexing columns that are involved in sorting and grouping operations.
+ Rewrite your queries to avoid `BLOB` and `TEXT` columns. These columns always use disk.
+ Tune the following database parameters: `tmp_table_size` and `max_heap_table_size`.

  The default value for each of these parameters is 16 MiB. When using the MEMORY storage engine for in-memory temporary tables, their maximum size is defined by the `tmp_table_size` or `max_heap_table_size` value, whichever is smaller. When this maximum size is reached, MySQL automatically converts the in-memory internal temporary table to an InnoDB on-disk internal temporary table. For more information, see [Use the TempTable storage engine on Amazon RDS for MySQL and Amazon Aurora MySQL](https://aws.amazon.com/blogs/database/use-the-temptable-storage-engine-on-amazon-rds-for-mysql-and-amazon-aurora-mysql/).
**Note**  
When you explicitly create MEMORY tables with `CREATE TABLE`, only the `max_heap_table_size` variable determines how large a table can grow, and there is no conversion to an on-disk format.
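The preceding actions can be combined: first find the statements that create the most on-disk temporary tables, then raise the in-memory limits for the sessions that run them. A sketch, assuming the Performance Schema and sys schema are enabled; the 64 MiB value is an example only:

```
-- Top statements by on-disk temporary table usage.
SELECT query, exec_count, memory_tmp_tables,
       disk_tmp_tables, tmp_tables_to_disk_pct
FROM sys.statements_with_temp_tables
ORDER BY disk_tmp_tables DESC
LIMIT 10;

-- Raise the in-memory temporary table limits for the current session only.
-- Size these values to your workload and instance memory.
SET SESSION tmp_table_size = 64 * 1024 * 1024;
SET SESSION max_heap_table_size = 64 * 1024 * 1024;

-- Monitor how many temporary tables spill to disk.
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
```

Setting these variables at the session level avoids over-allocating memory for every connection on the instance.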

## Relevant metrics


The following Performance Insights metrics are related to this insight:
+ `Created_tmp_disk_tables`
+ `Created_tmp_tables`

For more information, see [Created\_tmp\_disk\_tables](https://dev.mysql.com/doc/refman/8.0/en/server-status-variables.html#statvar_Created_tmp_disk_tables) in the MySQL documentation.