

# Identify and resolve aggressive vacuum blockers in RDS for PostgreSQL
Identifying vacuum blockers

In PostgreSQL, vacuuming is vital for ensuring database health as it reclaims storage and prevents [transaction ID wraparound](https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND) issues. However, there are times when vacuuming can be prevented from operating as desired, which can result in performance degradation, storage bloat, and even impact availability of your DB instance by transaction ID wraparound. Therefore, identifying and resolving these issues are essential for optimal database performance and availability. Read [Understanding autovacuum in Amazon RDS for PostgreSQL environments](https://aws.amazon.com/blogs/database/understanding-autovacuum-in-amazon-rds-for-postgresql-environments/) to learn more about autovacuum.

The `postgres_get_av_diag()` function helps identify issues that either prevent or delay the aggressive vacuum progress. Suggestions are provided, which may include commands to resolve the issue where it is identifiable or guidance for further diagnostics where the issue is not identifiable. Aggressive vacuum blockers are reported when the age exceeds RDS' [adaptive autovacuum](Appendix.PostgreSQL.CommonDBATasks.Autovacuum.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum.AdaptiveAutoVacuuming) threshold of 500 million transaction IDs.

**What is the age of the transaction ID?**

The `age()` function for transaction IDs calculates the number of transactions that have occurred since the oldest unfrozen transaction ID for a database (`pg_database.datfrozenxid`) or table (`pg_class.relfrozenxid`). This value indicates database activity since the last aggressive vacuum operation and highlights the likely workload for upcoming VACUUM processes. 

**What is an aggressive vacuum?**

An aggressive VACUUM operation conducts a comprehensive scan of all pages within a table, including those typically skipped during regular VACUUMs. This thorough scan aims to "freeze" transaction IDs approaching their maximum age, effectively preventing a situation known as [transaction ID wraparound](https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND).

For `postgres_get_av_diag()` to report blockers, the blocker must be at least 500 million transactions old.

**Topics**
+ [

# Installing autovacuum monitoring and diagnostic tools in RDS for PostgreSQL
](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Installation.md)
+ [

# Functions of postgres\$1get\$1av\$1diag() in RDS for PostgreSQL
](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Functions.md)
+ [

# Resolving identifiable vacuum blockers in RDS for PostgreSQL
](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md)
+ [

# Resolving unidentifiable vacuum blockers in RDS for PostgreSQL
](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Unidentifiable_blockers.md)
+ [

# Resolving vacuum performance issues in RDS for PostgreSQL
](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Performance.md)
+ [

# Explanation of the NOTICE messages in RDS for PostgreSQL
](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.NOTICE.md)

# Installing autovacuum monitoring and diagnostic tools in RDS for PostgreSQL
Installing autovacuum monitoring tools

The `postgres_get_av_diag()` function is currently available in the following RDS for PostgreSQL versions:
+ 17.2 and higher 17 versions
+ 16.7 and higher 16 versions
+ 15.11 and higher 15 versions
+ 14.16 and higher 14 versions
+ 13.19 and higher 13 versions

 In order to use `postgres_get_av_diag()`, create the `rds_tools` extension.

```
postgres=> CREATE EXTENSION rds_tools ;
CREATE EXTENSION
```

Verify that the extension is installed.

```
postgres=> \dx rds_tools
             List of installed extensions
   Name    | Version |  Schema   |                    Description
 ----------+---------+-----------+----------------------------------------------------------
 rds_tools |   1.8   | rds_tools | miscellaneous administrative functions for RDS PostgreSQL
 1 row
```

Verify that the function is created.

```
postgres=> SELECT
    proname function_name,
    pronamespace::regnamespace function_schema,
    proowner::regrole function_owner
FROM
    pg_proc
WHERE
    proname = 'postgres_get_av_diag';
    function_name     | function_schema | function_owner
----------------------+-----------------+----------------
 postgres_get_av_diag | rds_tools       | rds_superuser
(1 row)
```

# Functions of postgres\$1get\$1av\$1diag() in RDS for PostgreSQL
Functions of postgres\$1get\$1av\$1diag()

The `postgres_get_av_diag()` function retrieves diagnostic information about autovacuum processes that are blocking or lagging behind in a RDS for PostgreSQL database. The query needs to be executed in the database with the oldest transaction ID for accurate results. For more information about using the database with the oldest transaction ID, see [Not connected to the database with the age of oldest transaction ID](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.NOTICE.md)

```
SELECT
    blocker,
    DATABASE,
    blocker_identifier,
    wait_event,
    TO_CHAR(autovacuum_lagging_by, 'FM9,999,999,999') AS autovacuum_lagging_by,
    suggestion,
    suggested_action
FROM (
    SELECT
        *
    FROM
        rds_tools.postgres_get_av_diag ()
    ORDER BY
        autovacuum_lagging_by DESC) q;
```

The `postgres_get_av_diag()` function returns a table with the following information:

**blocker**  
Specifies the category of database activity that is blocking the vacuum.  
+ [Active statement](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Active_statement)
+ [Idle in transaction](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Idle_in_transaction)
+ [Prepared transaction](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Prepared_transaction)
+ [Logical replication slot](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Logical_replication_slot)
+ [Read replica with physical replication slot](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Read_replicas)
+ [Read replica with streaming replication](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Read_replicas)
+ [Temporary tables](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Identifiableblockers.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Temporary_tables)

**database**  
Specifies the name of the database where applicable and supported. This is the database in which the activity is ongoing and blocking or will block the autovacuum. This is the database you are required to connect to and take action.

**blocker\$1identifier**  
Specifies the identifier of the activity that is blocking or will block the autovacuum. The identifier can be a process ID along with a SQL statement, a prepared transaction, an IP address of a read replica, and the name of the replication slot, either logical or physical.

**wait\$1event**  
Specifies the [wait event](PostgreSQL.Tuning.md) of the blocking session and is applicable for the following blockers:  
+ Active statement
+ Idle in transaction

**autovacum\$1lagging\$1by**  
Specifies the number of transactions that autovacuum is lagging behind in its backlog work per category.

**suggestion**  
Specifies suggestions to resolve the blocker. These instructions include the name of the database in which the activity exists where applicable, the Process ID (PID) of the session where applicable, and the action to be taken.

**suggested\$1action**  
Suggests the action that needs to be taken to resolve the blocker.

# Resolving identifiable vacuum blockers in RDS for PostgreSQL
Resolving identifiable vacuum blockers

Autovacuum performs aggressive vacuums and lowers the age of transaction IDs to below the threshold specified by the `autovacuum_freeze_max_age` parameter of your RDS instance. You can track this age using the Amazon CloudWatch metric `MaximumUsedTransactionIDs`.

To find the setting of `autovacuum_freeze_max_age` (which has a default of 200 million transaction IDs) for your Amazon RDS instance, you can use the following query:

```
SELECT
    TO_CHAR(setting::bigint, 'FM9,999,999,999') autovacuum_freeze_max_age
FROM
    pg_settings
WHERE
    name = 'autovacuum_freeze_max_age';
```

Note that `postgres_get_av_diag()` only checks for aggressive vacuum blockers when the age exceeds Amazon RDS’ [adaptive autovacuum](Appendix.PostgreSQL.CommonDBATasks.Autovacuum.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum.AdaptiveAutoVacuuming) threshold of 500 million transaction IDs. For `postgres_get_av_diag()` to detect blockers, the blocker must be at least 500 million transactions old.

The `postgres_get_av_diag()` function identifies the following types of blockers:

**Topics**
+ [

## Active statement
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Active_statement)
+ [

## Idle in transaction
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Idle_in_transaction)
+ [

## Prepared transaction
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Prepared_transaction)
+ [

## Logical replication slot
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Logical_replication_slot)
+ [

## Read replicas
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Read_replicas)
+ [

## Temporary tables
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Temporary_tables)

## Active statement


In PostgreSQL, an active statement is an SQL statement that is currently being executed by the database. This includes queries, transactions, or any operations in progress. When monitoring via `pg_stat_activity`, the state column indicates that the process with the corresponding PID is active.

The `postgres_get_av_diag()` function displays output similar to the following when it identifies a statement that is an active statement.

```
blocker               | Active statement
database              | my_database
blocker_identifier    | SELECT pg_sleep(20000);
wait_event            | Timeout:PgSleep
autovacuum_lagging_by | 568,600,871
suggestion            | Connect to database "my_database", review carefully and you may consider terminating the process using suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.
suggested_action      | {"SELECT pg_terminate_backend (29621);"}
```

**Suggested action**

Following the guidance in the `suggestion` column, the user can connect to the database where the active statement is present and, as specified in the `suggested_action` column, it's advisable to carefully review the option to terminate the session. If termination is safe, you may use the `pg_terminate_backend()` function to terminate the session. This action can be performed by an administrator (such as the RDS master account) or a user with the required `pg_terminate_backend()` privilege.

**Warning**  
A terminated session will undo (`ROLLBACK`) changes it made. Depending on your requirements, you may want to rerun the statement. However, it is recommended to do so only after the autovacuum process has finished its aggressive vacuum operation.

## Idle in transaction


An idle in transaction statement refers to any session that has opened an explicit transaction (such as by issuing a `BEGIN` statement), performed some work, and is now waiting for the client to either pass more work or signal the end of the transaction by issuing a `COMMIT`, `ROLLBACK`, or `END` (which would result in an implicit `COMMIT`).

The `postgres_get_av_diag()` function displays output similar to the following when it identifies an `idle in transaction` statement as a blocker.

```
blocker               | idle in transaction
database              | my_database
blocker_identifier    | INSERT INTO tt SELECT * FROM tt;
wait_event            | Client:ClientRead
autovacuum_lagging_by | 1,237,201,759
suggestion            | Connect to database "my_database", review carefully and you may consider terminating the process using suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.
suggested_action      | {"SELECT pg_terminate_backend (28438);"}
```

**Suggested action**

As indicated in the `suggestion` column, you can connect to the database where the idle in transaction session is present and terminate the session using the `pg_terminate_backend()` function. The user can be your admin (RDS master account) user or a user with the `pg_terminate_backend()` privilege.

**Warning**  
A terminated session will undo (`ROLLBACK`) changes it made. Depending on your requirements, you may want to rerun the statement. However, it is recommended to do so only after the autovacuum process has finished its aggressive vacuum operation.

## Prepared transaction


PostgreSQL allows transactions that are part of a two-phase commit strategy called [prepared transactions](https://www.postgresql.org/docs/current/sql-prepare-transaction.html). These are enabled by setting the `max_prepared_transactions` parameter to a non-zero value. Prepared transactions are designed to ensure that a transaction is durable and remains available even after database crashes, restarts, or client disconnections. Like regular transactions, they are assigned a transaction ID and can affect the autovacuum. If left in a prepared state, autovacuum cannot perform freeezing and it can lead to transaction ID wraparound.

When transactions are left prepared indefinitely without being resolved by a transaction manager, they become orphaned prepared transactions. The only way to fix this is to either commit or rollback the transaction using the `COMMIT PREPARED` or `ROLLBACK PREPARED` commands, respectively.

**Note**  
Be aware that a backup taken during a prepared transaction will still contain that transaction after restoration. Refer to the following information about how to locate and close such transactions.

The `postgres_get_av_diag()` function displays the following output when it identifies a blocker that is a prepared transaction.

```
blocker               | Prepared transaction
database              | my_database
blocker_identifier    | myptx
wait_event            | Not applicable
autovacuum_lagging_by | 1,805,802,632
suggestion            | Connect to database "my_database" and consider either COMMIT or ROLLBACK the prepared transaction using suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.
suggested_action      | {"COMMIT PREPARED 'myptx';",[OR],"ROLLBACK PREPARED 'myptx';"}
```

**Suggested action**

As mentioned in the suggestion column, connect to the database where the prepared transaction is located. Based on the `suggested_action` column, carefully review whether to perform either `COMMIT` or `ROLLBACK`, and the the appropiate the action.

To monitor prepared transactions in general, PostgreSQL offers a catalog view called `pg_prepared_xacts`. You can use the following query to find prepared transactions.

```
SELECT
    gid,
    prepared,
    owner,
    database,
    transaction AS oldest_xmin
FROM
    pg_prepared_xacts
ORDER BY
    age(transaction) DESC;
```

## Logical replication slot


The purpose of a replication slot is to hold unconsumed changes until they are replicated to a target server. For more information, see PostgreSQL's [Logical replication](https://www.postgresql.org/docs/current/logical-replication.html).

There are two types of logical replication slots.

**Inactive logical replication slots**

When replication is terminated, unconsumed transaction logs can't be removed, and the replication slot becomes inactive. Although an inactive logical replication slot isn't currently used by a subscriber, it remains on the server, leading to the retention of WAL files and preventing the removal of old transaction logs. This can increase disk usage and specifically block autovacuum from cleaning up internal catalog tables, as the system must preserve LSN information from being overwritten. If not addressed, this can result in catalog bloat, performance degradation, and an increased risk of wraparound vacuum, potentially causing transaction downtime.

**Active but slow logical replication slots**

Sometimes removal of dead tuples of catalog is delayed due to the performance degradation of logical replication. This delay in replication slows down updating the `catalog_xmin` and can lead to catalog bloat and wraparound vacuum.

The `postgres_get_av_diag()` function displays output similar to the following when it finds a logical replication slot as a blocker.

```
blocker               | Logical replication slot
database              | my_database
blocker_identifier    | slot1
wait_event            | Not applicable
autovacuum_lagging_by | 1,940,103,068
suggestion            | Ensure replication is active and resolve any lag for the slot if active. If inactive, consider dropping it using the command in suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.
suggested_action      | {"SELECT pg_drop_replication_slot('slot1') FROM pg_replication_slots WHERE active = 'f';"}
```

**Suggested action**

To resolve this problem, check the replication configuration for issues with the target schema or data that might be terminating the apply process. The most common reasons are the following: 
+ Missing columns
+ Incompatible data type
+ Data mismatch
+ Missing table

If the problem is related to infrastructure issues:
+ Network issues - [How do I resolve issues with an Amazon RDS DB in an incompatible network state?](https://repost.aws/knowledge-center/rds-incompatible-network).
+ Database or DB instance is not available due to the following reasons:
  + Replica instance is out of storage - Review [Amazon RDS DB instances run out of storage](https://repost.aws/knowledge-center/rds-out-of-storage) for information about adding storage.
  + Incompatible-parameters - Review [How can I fix an Amazon RDS DB instance that is stuck in the incompatible-parameters status?](https://repost.aws/knowledge-center/rds-incompatible-parameters) for more information about how you can resolve the issue.

If your instance is outside the AWS network or on AWS EC2, consult your administrator on how to resolve the availability or infrastructure-related issues.

**Dropping the inactive slot**

**Warning**  
Caution: Before dropping a replication slot, carefully ensure that it has no ongoing replication, is inactive, and is in an unrecoverable state. Dropping a slot prematurely could disrupt replication or cause data loss.

After confirming that the replication slot is no longer needed, drop it to allow autovacuum to continue. The condition `active = 'f'` ensures that only an inactive slot is dropped.

```
SELECT pg_drop_replication_slot('slot1') WHERE active ='f'
```

## Read replicas


When the `hot_standby_feedback` setting is enabled for [Amazon RDS read replicas](USER_PostgreSQL.Replication.ReadReplicas.md), it prevents autovacuum on the primary database from removing dead rows that might still be needed by queries running on the read replica. This affects all types of physical read replicas including those managed with or without replication slots. This behavior is necessary because queries running on the standby replica require those rows to remain available on the primary preventing [query conflicts](https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT) and cancellations.

**Read replica with physical replication slot**  
Read replicas with physical replication slots significantly enhance the reliability and stability of replication in RDS for PostgreSQL. These slots ensure the primary database retains essential Write-Ahead Log files until the replica processes them, maintaining data consistency even during network disruptions.

Beginning with RDS for PostgreSQL version 14, all replicas utilize replication slots. In earlier versions, only cross-Region replicas used replication slots.

The `postgres_get_av_diag()` function displays output similar to the following when it finds a read replica with physical replication slot as the blocker.

```
blocker               | Read replica with physical replication slot
database              |
blocker_identifier    | rds_us_west_2_db_xxxxxxxxxxxxxxxxxxxxx
wait_event            | Not applicable
autovacuum_lagging_by | 554,080,689
suggestion            | Run the following query on the replica "rds_us_west_2_db_xxxxxxxxxxxxxxxxxxxx" to find the long running query:                           
                      | SELECT * FROM pg_catalog.pg_stat_activity WHERE backend_xmin::text::bigint = 757989377;                                                       
                      | Review carefully and you may consdier terminating the query on read replica using suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.                                 +                      |
suggested_action      | {"SELECT pg_terminate_backend(pid) FROM pg_catalog.pg_stat_activity WHERE backend_xmin::text::bigint = 757989377;","                                                                                 +
                      | [OR]                                                                                                                                                                                                 +
                      | ","Disable hot_standby_feedback","                                                                                                                                                                   +
                      | [OR]                                                                                                                                                                                                 +
                      | ","Delete the read replica if not needed"}
```

**Read replica with streaming replication**  
Amazon RDS allows setting up read replicas without a physical replication slot in older versions, up to version 13. This approach reduces overhead by allowing the primary to recycle WAL files more aggressively, which is advantageous in environments with limited disk space and can tolerate occasional ReplicaLag. However, without a slot, the standby must remain in sync to avoid missing WAL files. Amazon RDS uses archived WAL files to help the replica catch up if it falls behind, but this process requires careful monitoring and can be slow.

The `postgres_get_av_diag()` function displays output similar to the following when it finds a streaming read replica as the blocker.

```
blocker               | Read replica with streaming replication slot
database              | Not applicable
blocker_identifier    | xx.x.x.xxx/xx
wait_event            | Not applicable
autovacuum_lagging_by | 610,146,760
suggestion            | Run the following query on the replica "xx.x.x.xxx" to find the long running query:                                                                                                                                                         +
                      | SELECT * FROM pg_catalog.pg_stat_activity WHERE backend_xmin::text::bigint = 348319343;                                                                                                                                                     +
                      | Review carefully and you may consdier terminating the query on read replica using suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.                                       +
                      |
suggested_action      | {"SELECT pg_terminate_backend(pid) FROM pg_catalog.pg_stat_activity WHERE backend_xmin::text::bigint = 348319343;","                                                                                                                        +
                      | [OR]                                                                                                                                                                                                                                        +
                      | ","Disable hot_standby_feedback","                                                                                                                                                                                                          +
                      | [OR]                                                                                                                                                                                                                                        +
                      | ","Delete the read replica if not needed"}
```

**Suggested action**

As recommended in the `suggested_action` column, carefully review these options to unblock autovacuum.
+ **Terminate the query** – Following the guidance in the suggestion column, you can connect to the read replica, as specified in the suggested\$1action column, it's advisable to carefully review the option to terminate the session. If termination is deemed safe, you may use the `pg_terminate_backend()` function to terminate the session. This action can be performed by an administrator (such as the RDS master account) or a user with the required pg\$1terminate\$1backend() privilege.

  You may run the following SQL command on the read replica to terminate the query that is preventing the vacuum on the primary from cleaning up old rows. The value of `backend_xmin` is reported in the function’s output:

  ```
  SELECT
      pg_terminate_backend(pid)
  FROM
      pg_catalog.pg_stat_activity
  WHERE
      backend_xmin::text::bigint = backend_xmin;
  ```
+ **Disable hot standby feedback** – Consider disabling the `hot_standby_feedback` parameter if it's causing significant vacuum delays.

  The `hot_standby_feedback` parameter allows a read replica to inform the primary about its query activity, preventing the primary from vacuuming tables or rows that are in use on the standby. While this ensures query stability on the standby, it can significantly delay vacuuming on the primary. Disabling this feature allows the primary to proceed with vacuuming without waiting for the standby to catch up. However, this can lead to query cancellations or failures on the standby if it attempts to access rows that have been vacuumed by the primary.
+ **Delete the read replica if not needed** – If the read replica is no longer necessary, you can delete it. This will remove the associated replication overhead and allow the primary to recycle transaction logs without being held back by the replica.

## Temporary tables


[Temporary tables](https://www.postgresql.org/docs/current/sql-createtable.html), created using the `TEMPORARY` keyword, reside in the temp schema, for example pg\$1temp\$1xxx, and are only accessible to the session that created them. Temporary tables are dropped when the session ends. However, these tables are invisible to PostgreSQL's autovacuum process, and must be manually vacuumed by the session that created them. Trying to vacuum the temp table from another session has no effect.

In unusual circumstances, a temporary table exists without an active session owning it. If the owning session ends unexpectedly due to a fatal crash, network issue, or similar event, the temporary table might not be cleaned up, leaving it behind as an "orphaned" table. When the PostgreSQL autovacuum process detects an orphaned temporary table, it logs the following message:

```
LOG: autovacuum: found orphan temp table \"%s\".\"%s\" in database \"%s\"
```

The `postgres_get_av_diag()` function displays output similar to the following when it identifies a temporary table as a blocker. For the function to correctly show the output related to temporary tables, it needs to be executed within the same database where those tables exist.

```
blocker               | Temporary table
database              | my_database
blocker_identifier    | pg_temp_14.ttemp
wait_event            | Not applicable
autovacuum_lagging_by | 1,805,802,632
suggestion            | Connect to database "my_database". Review carefully, you may consider dropping temporary table using command in suggested_action. For more information, see Working with PostgreSQL autovacuum in the Amazon RDS User Guide.
suggested_action      | {"DROP TABLE ttemp;"}
```

**Suggested action**

Follow the instructions provided in the `suggestion` column of the output to identify and remove the temporary table that is preventing autovacuum from running. Use the following command to drop the temporary table reported by `postgres_get_av_diag()`. Replace the table name based on the output provided by the `postgres_get_av_diag()` function.

```
DROP TABLE my_temp_schema.my_temp_table;
```

The following query can be used to identify temporary tables:

```
SELECT
    oid,
    relname,
    relnamespace::regnamespace,
    age(relfrozenxid)
FROM
    pg_class
WHERE
relpersistence = 't'
ORDER BY
    age(relfrozenxid) DESC;
```

# Resolving unidentifiable vacuum blockers in RDS for PostgreSQL
Resolving unidentifiable vacuum blockers

This section explores additional reasons that can prevent vacuuming from making progress. These issues are currently not directly identifiable by the `postgres_get_av_diag()` function. 

**Topics**
+ [

## Invalid pages
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Invalid_pages)
+ [

## Index inconsistency
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Index_inconsistency)
+ [

## Exceptionally high transaction rate
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.High_transaction_rate)

## Invalid pages


An invalid page error occurs when PostgreSQL detects a mismatch in a page’s checksum while accessing that page. The contents are unreadable, preventing autovacuum from freezing tuples. This effectively stops the cleanup process. The following error is written into PostgreSQL’s log:

```
WARNING:  page verification failed, calculated checksum YYYYY but expected XXXX
ERROR:  invalid page in block ZZZZZ of relation base/XXXXX/XXXXX
CONTEXT:  automatic vacuum of table myschema.mytable
```

**Determine the object type**

```
ERROR: invalid page in block 4305910 of relation base/16403/186752608 
WARNING: page verification failed, calculated checksum 50065 but expected 60033
```

From the error message, the path `base/16403/186752608` provides the following information:
+ "base" is the directory name under the PostgreSQL data directory.
+ "16403" is the database OID, which you can look up in the `pg_database` system catalog.
+ "186752608" is the `relfilenode`, which you can use to look up the schema and object name in the `pg_class` system catalog.

By checking the output of the following query in the impacted database, you can determine the object type. The following query retrieves object information for oid: 186752608. Replace the OID with the one relevant to the error you encountered.

```
SELECT
    relname AS object_name,
    relkind AS object_type,
    nspname AS schema_name
FROM
    pg_class c
    JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE
    c.oid = 186752608;
```

For more information, see the PostgreSQL documentation [https://www.postgresql.org/docs/current/catalog-pg-class.html](https://www.postgresql.org/docs/current/catalog-pg-class.html) for all the supported object types, noted by the `relkind` column in `pg_class`.

**Guidance**

The most effective solution for this issue depends on the configuration of your specific Amazon RDS instance and the type of data impacted by the inconsistent page.

**If the object type is an index:**

Rebuilding the index is recommended.
+ **Using the `CONCURRENTLY` option** – Prior to PostgreSQL version 12, rebuilding an index required an exclusive table lock, restricting access to the table. With PostgreSQL version 12, and later versions, the `CONCURRENTLY` option allows for row-level locking, significantly improving the table's availability. Following is the command:

  ```
  REINDEX INDEX ix_name CONCURRENTLY;
  ```

  While `CONCURRENTLY` is less disruptive, it can be slower on busy tables. Consider building the index during low-traffic periods if possible.

  For more information, see the PostgreSQL [REINDEX](https://www.postgresql.org/docs/current/sql-reindex.html) documentation.
+ **Using the `INDEX_CLEANUP FALSE` option** – If the indexes are large and estimated to require a significant amount of time to finish, you can unblock autovacuum by executing a manual `VACUUM FREEZE` while excluding indexes. This functionality is available in PostgreSQL version 12 and later versions. 

  Bypassing indexes will allow you to skip the vacuum process of the inconsistent index and mitigate the wraparound issue. However, this will not resolve the underlying invalid page problem. To fully address and resolve the invalid page issue, you will still need to rebuild the index.

**If the object type is a materialized view:**

If an invalid page error occurs on a materialized view, login to the impacted database and refresh it to resolve the invalid page:

Refresh the materialized view:

```
REFRESH MATERIALIZED VIEW schema_name.materialized_view_name;
```

If refreshing fails, try recreating:

```
DROP MATERIALIZED VIEW schema_name.materialized_view_name;
CREATE MATERIALIZED VIEW schema_name.materialized_view_name AS query;
```

Refreshing or recreating the materialized view restores it without impacting the underlying table data.

**For all other object types:**

For all other object types, reach out to AWS support.

## Index inconsistency


A logically inconsistent index can prevent autovacuum from making progress. The following errors or similar errors are logged during either the vacuum phase of the index or when the index is accessed by SQL statements.

```
ERROR: right sibling's left-link doesn't match:block 5 links to 10 instead of expected 2 in index ix_name
```

```
ERROR: failed to re-find parent key in index "XXXXXXXXXX" for deletion target page XXX
CONTEXT:  while vacuuming index index_name of relation schema.table
```

**Guidance**

Rebuild the index or skip indexes using `INDEX_CLEANUP` on manual `VACUUM FREEZE`. For information about how to rebuild the index, see [If the object type is an index](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Invalid_pages).
+ **Using the CONCURRENTLY option** – Prior to PostgreSQL version 12, rebuilding an index required an exclusive table lock, restricting access to the table. With PostgreSQL version 12, and later versions, the CONCURRENTLY option allows for row-level locking, significantly improving the table's availability. Following is the command:

  ```
  REINDEX INDEX ix_name CONCURRENTLY;
  ```

  While CONCURRENTLY is less disruptive, it can be slower on busy tables. Consider building the index during low-traffic periods if possible. For more information, see [REINDEX](https://www.postgresql.org/docs/current/sql-reindex.html) in *PostgreSQL* documentation.
+ **Using the INDEX\$1CLEANUP FALSE option** – If the indexes are large and estimated to require a significant amount of time to finish, you can unblock autovacuum by executing a manual VACUUM FREEZE while excluding indexes. This functionality is available in PostgreSQL version 12 and later versions.

  Bypassing indexes will allow you to skip the vacuum process of the inconsistent index and mitigate the wraparound issue. However, this will not resolve the underlying invalid page problem. To fully address and resolve the invalid page issue, you will still need to rebuild the index.

## Exceptionally high transaction rate


In PostgreSQL, high transaction rates can significantly impact autovacuum's performance, leading to slower cleanup of dead tuples and increased risk of transaction ID wraparound. You can monitor the transaction rate by measuring the difference in `max(age(datfrozenxid))` between two time periods, typically per second. Additionally, you can use the following counter metrics from RDS Performance Insights to measure the transaction rate (the sum of xact\$1commit and xact\$1rollback) which is the total number of transactions.


|  Counter  |  Type  |  Unit  |  Metric  | 
| --- | --- | --- | --- | 
|  xact\$1commit  |  Transactions  |  Commits per second  |  db.Transactions.xact\$1commit  | 
|  xact\$1rollback  |  Transactions  |  Rollbacks per second  |  db.Transactions.xact\$1rollback  | 

A rapid increase indicates a high transaction load, which can overwhelm autovacuum, causing bloat, lock contention, and potential performance issues. This can negatively impact the autovacuum process in a couple of ways:
+ **Table Activity:** The specific table being vacuumed could be experiencing a high volume of transactions, causing delays.
+ **System Resources** The overall system might be overloaded, making it difficult for autovacuum to access the necessary resources to function efficiently.

Consider the following strategies for allowing autovacuum to operate more effectively and keep up with its tasks:

1. Reduce the transaction rate if possible. Consider to batch or group similar transactions where feasible.

1. Target frequently updated tables with manual `VACUUM FREEZE` operation nightly, weekly, or biweekly during off-peak hours. 

1. Consider scaling up your instance class to allocate more system resources to handle the high transaction volume and autovacuum.

# Resolving vacuum performance issues in RDS for PostgreSQL
Resolving vacuum performance issues

This section discusses factors that often contribute to slower vacuum performance and how to address these issues.

**Topics**
+ [

## Vacuum large indexes
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Large_indexes)
+ [

## Too many tables or databases to vacuum
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Multiple_tables)
+ [

## Aggressive vacuum (to prevent wraparound) is running
](#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Aggressive_vacuum)

## Vacuum large indexes


VACUUM operates through sequential phases: initialization, heap scanning, index and heap vacuuming, index cleanup, heap truncation, and final cleanup. During the heap scan, the process prunes pages, defragments and freezes them. After completing the heap scan, VACUUM cleans indexes, returns empty pages to the operating system, and performs final cleanup tasks like vacuuming the free space map and updating statistics.

Index vacuuming may require multiple passes when `maintenance_work_mem` (or `autovacuum_work_mem`) is insufficient to process the index. In PostgreSQL 16 and earlier, a 1 GB memory limit for storing dead tuple IDs often forced multiple passes on large indexes. PostgreSQL 17 introduces `TidStore`, which dynamically allocates memory instead of using a single-allocation array. This removes the 1 GB constraint, uses memory more efficiently, and reduces the need for multiple index scans per each index.

Large indexes may still require multiple passes in PostgreSQL 17 if available memory can't accommodate the entire index processing at once. Typically, larger indexes contain more dead tuples that require multiple passes.

**Detecting slow vacuum operations**

The `postgres_get_av_diag()` function can detect when vacuum operations are running slowly due to insufficient memory. For more information on this function, see [Installing autovacuum monitoring and diagnostic tools in RDS for PostgreSQL](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Installation.md).

The `postgres_get_av_diag()` function issues the following notices when the available memory is not enough to complete the index vacuuming in a single pass.

**`rds_tools` 1.8**

```
NOTICE: Your database is currently running aggressive vacuum to prevent wraparound and it might be slow.
```

```
NOTICE: The current setting of autovacuum_work_mem is "XXX" and might not be sufficient. Consider increasing the setting, and if necessary, scaling up the Amazon RDS instance class for more memory. 
        Additionally, review the possibility of manual vacuum with exclusion of indexes using (VACUUM (INDEX_CLEANUP FALSE, VERBOSE TRUE) table_name;).
```

**`rds_tools` 1.9**

```
NOTICE: Your database is currently running aggressive vacuum to prevent wraparound and it might be slow.
```

```
NOTICE: The current setting of autovacuum_work_mem is XX might not be sufficient. Consider increasing the setting to XXX, and if necessary, scaling up the RDS instance class for more 
        memory. The suggested value is an estimate based on the current number of dead tuples for the table being vacuumed, which might not fully reflect the latest state. Additionally, review the possibility of manual 
        vacuum with exclusion of indexes using (VACUUM (INDEX_CLEANUP FALSE, VERBOSE TRUE) table_name;). For more information, see 
        [Working with PostgreSQL autovacuum in the Amazon Amazon RDS User Guide](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.Autovacuum.html)
        .
```

**Note**  
The `postgres_get_av_diag()` function relies on `pg_stat_all_tables.n_dead_tup` for estimating the amount of memory required for index vacuuming.

When the `postgres_get_av_diag()` function identifies a slow vacuum operation that requires multiple index scans due to insufficient `autovacuum_work_mem`, it will generate the following message:

```
NOTICE: Your vacuum is performing multiple index scans due to insufficient autovacuum_work_mem:XXX for index vacuuming. 
        For more information, see [Working with PostgreSQL autovacuum in the Amazon Amazon RDS User Guide](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.Autovacuum.html).
```

**Guidance**

You can apply the following workarounds using manual `VACUUM FREEZE` to speed up freezing the table.

**Increase the memory for vacuuming**

As suggested by the `postgres_get_av_diag()` function, it's advisable to increase the `autovacuum_work_mem` parameter to address potential memory constraints at the instance level. While `autovacuum_work_mem` is a dynamic parameter, it's important to note that for the new memory setting to take effect, the autovacuum daemon needs to restart its workers. To accomplish this:

1. Confirm that the new setting is in place.

1. Terminate the processes currently running autovacuum.

This approach ensures that the adjusted memory allocation is applied to new autovacuum operations.

For more immediate results, consider manually performing a `VACUUM FREEZE` operation with an increased `maintenance_work_mem` setting within your session:

```
SET maintenance_work_mem TO '1GB';
VACUUM FREEZE VERBOSE table_name;
```

If you're using Amazon RDS and find that you need additional memory to support higher values for `maintenance_work_mem` or `autovacuum_work_mem`, consider upgrading to an instance class with more memory. This can provide the necessary resources to enhance both manual and automatic vacuum operations, leading to improved overall vacuum and database performance.

**Disable INDEX\$1CLEANUP**

Manual `VACUUM` in PostgreSQL version 12 and later allows skipping the index cleanup phase, while emergency autovacuum in PostgreSQL version 14 and later does this automatically based on the [https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-VACUUM-FAILSAFE-AGE) parameter.

**Warning**  
Skipping index cleanup can lead to index bloat and negatively impact query performance. To mitigate this, consider reindexing or vacuuming affected indexes during a maintenance window.

For additional guidance on handling large indexes, refer to the documentation on [Managing autovacuum with large indexes](Appendix.PostgreSQL.CommonDBATasks.Autovacuum.LargeIndexes.md).

**Parallel index vacuuming**

Starting with PostgreSQL 13, indexes can be vacuumed and cleaned in parallel by default using manual `VACUUM`, with one vacuum worker process assigned to each index. However, for PostgreSQL to determine if a vacuum operation qualifies for parallel execution, specific criteria must be met:
+ There must be at least two indexes.
+ The `max_parallel_maintenance_workers` parameter should be set to at least 2.
+ The index size must exceed the `min_parallel_index_scan_size` limit, which defaults to 512KB.

You can adjust the `max_parallel_maintenance_workers` setting based on the number of vCPUs available on your Amazon RDS instance and the number of indexes on the table to optimize vacuuming turnaround time.

For more information, see [Parallel vacuuming in Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL](https://aws.amazon.com/blogs/database/parallel-vacuuming-in-amazon-rds-for-postgresql-and-amazon-aurora-postgresql/).

## Too many tables or databases to vacuum


As mentioned in PostgreSQL's [The Autovacuum Daemon](https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM') documentation, the autovacuum daemon operates through multiple processes. This includes a persistent autovacuum launcher responsible for starting autovacuum worker processes for each database within the system. The launcher schedules these workers to initiate approximately every `autovacuum_naptime` seconds per database.

With 'N' databases, a new worker begins roughly every [`autovacuum_naptime`/N seconds]. However, the total number of concurrent workers is limited by the `autovacuum_max_workers` setting. If the number of databases or tables requiring vacuuming exceeds this limit, the next database or table will be processed as soon as a worker becomes available.

When many large tables or databases require vacuuming concurrently, all available autovacuum workers can become occupied for an extended duration, delaying maintenance on other tables and databases. In environments with high transaction rates, this bottleneck can quickly escalate and potentially lead to wraparound vacuum issues within your Amazon RDS instance.

When `postgres_get_av_diag()` detects a high number of tables or databases, it provides the following recommendation:

```
NOTICE: Your database is currently running aggressive vacuum to prevent wraparound and it might be slow.
```

```
NOTICE: The current setting of autovacuum_max_workers:3 might not be sufficient. Consider increasing the setting and, if necessary, consider scaling up the Amazon RDS instance class for more workers.
```

**Guidance**

**Increase autovacuum\$1max\$1workers**

To expedite the vacuuming, we recommend adjusting the `autovacuum_max_workers` parameter to allow more concurrent autovacuum workers. If performance bottlenecks persist, consider scaling up your Amazon RDS instance to a class with more vCPUs, which can further improve the parallel processing capabilities.

## Aggressive vacuum (to prevent wraparound) is running


The age of the database (MaximumUsedTransactionIDs) in PostgreSQL only decreases when an aggressive vacuum (to prevent wraparound) is successfully completed. Until this vacuum finishes, the age will continue to increase depending on the transaction rate.

The `postgres_get_av_diag()` function generates the following `NOTICE` when it detects an aggressive vacuum. However, it only triggers this output after the vacuum has been active for at least two minutes.

```
NOTICE: Your database is currently running aggressive vacuum to prevent wraparound, monitor autovacuum performance.
```

For more information about aggressive vacuum, see [When an aggressive vacuum is already running](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.NOTICE.md).

You can verify if an aggressive vacuum is in progress with the following query:

```
SELECT
    a.xact_start AS start_time,
    v.datname "database",
    a.query,
    a.wait_event,
    v.pid,
    v.phase,
    v.relid::regclass,
    pg_size_pretty(pg_relation_size(v.relid)) AS heap_size,
    (
        SELECT
            string_agg(pg_size_pretty(pg_relation_size(i.indexrelid)) || ':' || i.indexrelid::regclass || chr(10), ', ')
        FROM
            pg_index i
        WHERE
            i.indrelid = v.relid
    ) AS index_sizes,
    trunc(v.heap_blks_scanned * 100 / NULLIF(v.heap_blks_total, 0)) AS step1_scan_pct,
    v.index_vacuum_count || '/' || (
        SELECT
            count(*)
        FROM
            pg_index i
        WHERE
            i.indrelid = v.relid
    ) AS step2_vacuum_indexes,
    trunc(v.heap_blks_vacuumed * 100 / NULLIF(v.heap_blks_total, 0)) AS step3_vacuum_pct,
    age(CURRENT_TIMESTAMP, a.xact_start) AS total_time_spent_sofar
FROM
    pg_stat_activity a
    INNER JOIN pg_stat_progress_vacuum v ON v.pid = a.pid;
```

You can determine if it's an aggressive vacuum (to prevent wraparound) by checking the query column in the output. The phrase "to prevent wraparound" indicates that it is an aggressive vacuum.

```
query                  | autovacuum: VACUUM public.t3 (to prevent wraparound)
```

For example, suppose you have a blocker at transaction age 1 billion and a table requiring an aggressive vacuum to prevent wraparound at the same transaction age. Additionally, there's another blocker at transaction age 750 million. After clearing the blocker at transaction age 1 billion, the transaction age won't immediately drop to 750 million. It will remain high until the table needing the aggressive vacuum or any transaction with an age over 750 million is completed. During this period, the transaction age of your PostgreSQL cluster will continue to rise. Once the vacuum process is completed, the transaction age will drop to 750 million but will start increasing again until further vacuuming is finished. This cycle will continue as long as these conditions persist, until the transaction age eventually drops to the level configured for your Amazon RDS instance, specified by `autovacuum_freeze_max_age`.

# Explanation of the NOTICE messages in RDS for PostgreSQL
Explanation of the NOTICE messages

 The `postgres_get_av_diag()` function provides the following NOTICE messages:

**When the age has not reached the monitoring threshold yet**  
The monitoring threshold for `postgres_get_av_diag()` to identify blockers is 500 million transactions by default. If `postgres_get_av_diag()` generates the following NOTICE, it indicates that the transaction age has not yet reached this threshold.  

```
NOTICE: postgres_get_av_diag() checks for blockers that prevent aggressive vacuums only, it does so only after exceeding dvb_threshold which is 500,000,000 and age of this PostgreSQL cluster is currently at 2.
```

**Not connected to the database with the age of oldest transaction ID**  
The `postgres_get_av_diag()` function provides the most accurate output when connected to the database with the oldest transaction ID age. The database with the oldest transaction ID age reported by `postgres_get_av_diag()` will be different than “my\$1database” in your case. If you are not connected to the correct database, the following NOTICE is generated:  

```
NOTICE: You are not connected to the database with the age of oldest transaction ID. Connect to my_database database and run postgres_get_av_diag() for accurate reporting.
```
Connecting to the database with the oldest transaction age is important for the following reasons:  
+ **Identifying temporary table blockers:** Because the metadata for temporary tables is specific to each database, they are typically found in the database where they are created. However, if a temporary table happens to be the top blocker and resides in the database with the oldest transaction, this could be misleading. Connecting to the correct database ensures the accurate identification of the temporary table blocker.
+ **Diagnosing slow vacuums:** The index metadata and table count information are database-specific and necessary for diagnosing slow vacuum issues.

**Database with oldest transaction by age is on an rdsadmin or template0 database**  
In certain cases, the `rdsadmin` or `template0` databases may be identified as the database with the oldest transaction ID age. If this happens, `postgres_get_av_diag()` will issue the following NOTICE:  

```
NOTICE: The database with the age of oldest transaction ID is rdsadmin or template0, reach out to support if the reported blocker is in rdsadmin or template0.
```
Verify that the listed blocker is not originating from either of these two databases. If the blocker is reported to be present in either `rdsadmin` or `template0`, contact support as these databases are not user-accessible and require intervention.  
It is highly unlikely for either the `rdsadmin` or `template0` database to contain a top blocker.

**When an aggressive vacuum is already running**  
The `postgres_get_av_diag()` function is designed to report when an aggressive vacuum process is running, but it only triggers this output after the vacuum has been active for at least 1 minute. This intentional delay helps reduce the chances of false positives. By waiting, the function ensures that only effective, significant vacuums are reported, leading to more accurate and reliable monitoring of vacuum activity.  
The `postgres_get_av_diag()` function generates the following NOTICE when it detects one or more aggressive vacuums in progress.   

```
NOTICE: Your database is currently running aggressive vacuum to prevent wraparound, monitor autovacuum performance.
```
As indicated in the NOTICE, continue to monitor the performance of vacuum. For more information about aggressive vacuum see [Aggressive vacuum (to prevent wraparound) is running](Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Resolving_Performance.md#Appendix.PostgreSQL.CommonDBATasks.Autovacuum_Monitoring.Aggressive_vacuum)

**When autovacuum is off**  
The `postgres_get_av_diag()` function generates the following NOTICE if autovacuum is disabled on your database instance:  

```
NOTICE: Autovacuum is OFF, we strongly recommend to enable it, no restart is necessary.
```
Autovacuum is a critical feature of your RDS for PostgreSQL DB instance that ensures smooth database operation. It automatically removes old row versions, reclaims storage space, and prevents table bloat, helping to keep tables and indexes efficient for optimal performance. Additionally, it protects against transaction ID wraparound, which can halt transactions on your Amazon RDS instance. Disabling autovacuum can lead to long-term declines in database performance and stability. We suggest you to keep it on all the times. For more information, see [Understanding autovacuum in RDS for PostgreSQL environments](https://aws.amazon.com/blogs/database/understanding-autovacuum-in-amazon-rds-for-postgresql-environments/).  
Turning off autovacuum doesn't stop aggressive vacuums. These will still occur once your tables hit the `autovacuum_freeze_max_age` threshold. 

**The number of transactions remaining is critically low**  
The `postgres_get_av_diag()` function generates the following NOTICE when a wraparound vacuum is imminent. This NOTICE is issued when your Amazon RDS instance is 100 million transactions away from potentially rejecting new transactions.  

```
WARNING: Number of transactions remaining is critically low, resolve issues with autovacuum or perform manual VACUUM FREEZE before your instance stops accepting transactions.
```
Your immediate action is required to avoid database downtime. You should closely monitor your vacuuming operations and consider manually initiating a `VACUUM FREEZE` on the affected database to prevent transaction failures.