# Oracle and PostgreSQL performance tuning
<a name="chap-oracle-aurora-pg.tuning"></a>

This section provides pages related to Oracle and PostgreSQL performance tuning topics.

**Topics**
+ [Oracle database hints and PostgreSQL DB query planning](chap-oracle-aurora-pg.tuning.hints.md)
+ [Oracle and PostgreSQL run plans](chap-oracle-aurora-pg.tuning.plans.md)
+ [Oracle and PostgreSQL table statistics](chap-oracle-aurora-pg.tuning.statistics.md)

# Oracle database hints and PostgreSQL DB query planning
<a name="chap-oracle-aurora-pg.tuning.hints"></a>

With AWS DMS, you can optimize query performance by using Oracle database hints and PostgreSQL query planning techniques. Oracle database hints provide instructions to the optimizer on how to execute a SQL statement, while PostgreSQL query planning involves analyzing the execution plan to identify and address performance bottlenecks.


| Feature compatibility |  AWS SCT / AWS DMS automation level |  AWS SCT action code index | Key differences | 
| --- | --- | --- | --- | 
|   ![\[Two star feature compatibility\]](http://docs.aws.amazon.com/dms/latest/oracle-to-aurora-postgresql-migration-playbook/images/pb-compatibility-2.png)   |  N/A  |  N/A  |  Very limited set of hints in PostgreSQL. Index hints and optimizer hints as comments. Syntax differences.  | 

## Oracle usage
<a name="chap-oracle-aurora-pg.tuning.hints.ora"></a>

Oracle provides users with the ability to influence how the query optimizer behaves and the decisions made to generate query run plans. Controlling the behavior of the database optimizer is performed using database hints. They can be defined as a directive operation to the optimizer and alter the decisions of how run plans are generated.

Oracle supports over 60 different database hints, and each database hint can have 0 or more arguments. Database hints are divided into different categories such as optimizer hints, join order hints, and parallel execution hints.

**Note**  
Database hints are embedded directly into the SQL queries immediately following the `SELECT` keyword using the format `/* <DB_HINT> */`.

 **Examples** 

Force the Query Optimizer to use a specific index for data access.

```
SELECT /* INDEX(EMP, IDX_EMP_HIRE_DATE)*/ *
  FROM EMPLOYEES EMP
  WHERE HIRE_DATE >= '01-JAN-2010';

Run Plan
Plan hash value: 3035503638
Id  Operation                    Name           Rows  Bytes  Cost (%CPU)  Time
0   SELECT STATEMENT                            1     62     2 (0)        00:00:01
1   TABLE ACCESS BY INDEX ROWID  EMPLOYEES      1     62     2 (0)        00:00:01
2   INDEX RANGE SCAN             IDX_HIRE_DATE  1            1 (0)        00:00:01

Predicate Information (identified by operation id):
2 - access("HIRE_DATE">=TO_DATE(' 2010-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
```

For more information, see [Comments](https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Comments.html#GUID-D316D545-89E2-4D54-977F-FC97815CD62El) and [Influencing the Optimizer](https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/influencing-the-optimizer.html#GUID-8758EF88-1CC6-41BD-8581-246702414D1D) in the *Oracle documentation*.

## PostgreSQL usage
<a name="chap-oracle-aurora-pg.tuning.hints.pg"></a>

PostgreSQL doesn’t support database hints to influence the behavior of the query planner and we can’t influence how execution plans are generated from within SQL queries. Although database hints are not directly supported, session parameters (also known as Query Planning Parameters) can influence the behavior of the query optimizer at a session level.

 **Examples** 

Set the query planner to use indexes instead of full table scans (disable `SEQSCAN`).

```
SET ENABLE_SEQSCAN=FALSE;
```

Set the query planner’s estimated cost of a disk page fetch that is part of a series of sequential fetches (`SEQ_PAGE_COST`) and set the planner’s estimate of the cost of a non-sequentially-fetched disk page (`RANDOM_PAGE_COST`). Reducing the value of `RANDOM_PAGE_COST` relative to `SEQ_PAGE_COST` will cause the query planner to prefer index scans, while raising the value will make index scans more expensive.

```
SET SEQ_PAGE_COST to 4;
SET RANDOM_PAGE_COST to 1;
```

Turn on or turn off the query planner’s use of nested-loops when performing joins. While it is impossible to completely disable the usage of nested-loop joins, setting the `ENABLE_NESTLOOP` to an `OFF` value discourages the query planner from choosing nested-loop joins compared to alternative join methods.

```
SET ENABLE_NESTLOOP to FALSE;
```

For more information, see [Query Planning](https://www.postgresql.org/docs/13/runtime-config-query.html) in the *PostgreSQL documentation*.

# Oracle and PostgreSQL run plans
<a name="chap-oracle-aurora-pg.tuning.plans"></a>

With AWS DMS, you can analyze and optimize database query performance by examining Oracle and PostgreSQL run plans. A run plan is the sequence of operations that the database engine performs to execute a SQL statement.


| Feature compatibility |  AWS SCT / AWS DMS automation level |  AWS SCT action code index | Key differences | 
| --- | --- | --- | --- | 
|   ![\[Two star feature compatibility\]](http://docs.aws.amazon.com/dms/latest/oracle-to-aurora-postgresql-migration-playbook/images/pb-compatibility-2.png)   |  N/A  |  N/A  |  Syntax differences. Completely different optimizer with different operators and rules in PostgreSQL.  | 

## Oracle usage
<a name="chap-oracle-aurora-pg.tuning.plans.ora"></a>

Run plans represent the choices made by the query optimizer for accessing database data. The query optimizer generates run plans for `SELECT`, `INSERT`, `UPDATE` and `DELETE` statements. Users and database administrators can view run plans for specific queries and DML operations.

Run plans are especially useful for performance tuning of queries. For example, determining if new indexes should be created. Run plans can be affected by data volumes, data statistics, and instance parameters (global or session parameters).

Run plans are displayed as a structured tree with the following information:
+ Tables access by the SQL statement and the referenced order for each table.
+ Access method for each table in the statement (full table scan vs. index access).
+ Algorithms used for join operations between tables (hash vs. nested loop joins).
+ Operations performed on retrieved data as such as filtering, sorting, and aggregations.
+ Information about rows being processed (cardinality) and the cost for each operation.
+ Table partitions being accessed.
+ Information about parallel runs.

Oracle 19 introduces SQL Quarantine: now queries that consume resources excessively can be automatically quarantined and prevented from being executed. These queries run plans are also quarantined.

 **Examples** 

Review the potential run plan for a query using the `EXPLAIN PLAN` statement.

```
SET AUTOTRACE TRACEONLY EXPLAIN
SELECT EMPLOYEE_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEES
WHERE LAST_NAME='King' AND FIRST_NAME='Steven';

Run Plan
Plan hash value: 2077747057
Id  Operation                    Name         Rows  Bytes  Cost (%CPU)  Time
0   SELECT STATEMENT                          1     16     2 (0)        00:00:01
1   TABLE ACCESS BY INDEX ROWID  EMPLOYEES    1     16     2 (0)        00:00:01
2   INDEX RANGE SCAN             EMP_NAME_IX  1            1 (0)        00:00:01

Predicate Information (identified by operation id):
2 - access("LAST_NAME"='King' AND "FIRST_NAME"='Steven')
```

 `SET AUTOTRACE TRACEONLY EXPLAIN` instructs SQL\$1PLUS to show the run plan without actually running the query itself.

The `EMPLOYEES` table contains indexes for both the `LAST_NAME` and `FIRST_NAME` columns. Step 2 of the run plan indicates the optimizer is performing an `INDEX RANGE SCAN` in order to retrieve the filtered employee name.

View a different run plan displaying a `FULL TABLE SCAN`.

```
SET AUTOTRACE TRACEONLY EXPLAIN
SELECT EMPLOYEE_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEES
WHERE SALARY > 10000;

Run Plan
Plan hash value: 1445457117
Id  Operation          Name         Rows  Bytes  Cost (%CPU)  Time
0   SELECT STATEMENT                72    1368   3 (0)        00:00:01
1   TABLE ACCESS FULL  EMPLOYEES    72    1368   3 (0)        00:00:01

Predicate Information (identified by operation id):
1 - filter("SALARY">10000)
```

For more information, see [Explaining and Displaying Execution Plans](https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/generating-and-displaying-execution-plans.html#GUID-60E30B1C-342B-4D71-B154-C26623D6A3B1) in the *Oracle documentation*.

## PostgreSQL usage
<a name="chap-oracle-aurora-pg.tuning.plans.pg"></a>

The PostgreSQL equivalent to Oracle `EXPLAIN PLAN` is the `EXPLAIN` keyword. The `EXPLAIN` keyword is used to display the run plan for a supplied SQL statement.

Similar to Oracle, the query planner in PostgreSQL will generate the estimated run plan for actions such as: `SELECT`, `INSERT`, `UPDATE` and `DELETE`. It builds a structured tree of plan nodes representing the different actions taken (the sign `→` represents a root line in the PostgreSQL run plan).

In addition, the `EXPLAIN` statement provides statistical information regarding each action such as: cost, rows, time and loops.

When you use the `EXPLAIN` command as part of a SQL statement, the statement will not run, and the run plan will be an estimation. By using the `EXPLAIN ANALYZE` command, the statement will run in addition to displaying the run plan.

### PostgreSQL EXPLAIN synopsis
<a name="chap-oracle-aurora-pg.tuning.plans.pg.synopsis"></a>

```
EXPLAIN [ ( option value[, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement

where option and values can be one of:

  ANALYZE [ boolean ]
  VERBOSE [ boolean ]
  COSTS [ boolean ]
  BUFFERS [ boolean ]
  TIMING [ boolean ]
  SUMMARY [ boolean ] (since PostgreSQL 10)
  FORMAT { TEXT | XML | JSON | YAML }
```

By default, planning and running time are displayed when you use `EXPLAIN ANALYZE`, but not in other cases. A new `SUMMARY` option provides explicit control of this information. Use `SUMMARY` to include planning and run time metrics in your output.

PostgreSQL provides configurations options that will cancel SQL statements running longer than provided time limit. To use this option, you can set the `statement_timeout` instance-level parameter.

If the value is specified without units, it is taken as milliseconds. A value of zero (the default) disables the timeout.

Third-party connection pooler solutions such as `Pgbouncer` and `PgPool` build on that and allow more flexibility in controlling how long connection to DB can run, be in idle state and so on.

### Aurora PostgreSQL Query Plan Management
<a name="chap-oracle-aurora-pg.tuning.plans.pg.Management"></a>

The Aurora PostgreSQL Query Plan Management (QPM) feature solves the problem of plan instability by allowing database users to maintain stable, yet optimal, performance for a set of managed SQL statements. QPM primarily serves two main objectives:
+  **Plan stability**. QPM prevents plan regression and improves plan stability when any of the above changes occur in the system.
+  **Plan adaptability**. QPM automatically detects new minimum-cost plans and controls when new plans may be used and adapts to the changes.

The quality and consistency of query optimization have a major impact on the performance and stability of any relational database management system (RDBMS). Query optimizers create a query execution plan for a SQL statement at a specific point in time. After conditions change, the optimizer might pick a different plan that makes performance better or worse.

In some cases, a number of changes can all cause the query optimizer to choose a different plan and lead to performance regression. These changes include changes in statistics, constraints, environment settings, query parameter bindings, and software upgrades. Regression is a major concern for high-performance applications.

With query plan management, you can control execution plans for a set of statements that you want to manage.

You can do the following:
+ Improve plan stability by forcing the optimizer to choose from a small number of known, good plans.
+ Optimize plans centrally and then distribute the best plans globally.
+ Identify indexes that aren’t used and assess the impact of creating or dropping an index.
+ Automatically detect a new minimum-cost plan discovered by the optimizer.
+ Try new optimizer features with less risk, because you can choose to approve only the plan changes that improve performance.

 **Examples** 

View the run plan of a SQL statement using the `EXPLAIN` command.

```
EXPLAIN
  SELECT EMPLOYEE_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEES
  WHERE LAST_NAME='King' AND FIRST_NAME='Steven';

Index Scan using idx_emp_name on employees (cost=0.14..8.16 rows=1 width=18)
Index Cond: (((last_name)::text = 'King'::text) AND ((first_name)::text = 'Steven'::text))
(2 rows)
```

Run the same statement with the `ANALYZE` keyword.

```
EXPLAIN ANALYZE
  SELECT EMPLOYEE_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEES
  WHERE LAST_NAME='King' AND FIRST_NAME='Steven';

Seq Scan on employees (cost=0.00..3.60 rows=1 width=18) (actual time=0.012..0.024 rows=1 loops=1)
Filter: (((last_name)::text = 'King'::text) AND ((first_name)::text = 'Steven'::text))
Rows Removed by Filter: 106
Planning time: 0.073 ms
Execution time: 0.037 ms
(5 rows)
```

By adding the ANALYZE keyword and executing the statement, we get additional information in addition to the execution plan.

View a PostgreSQL run plan showing a `FULL TABLE SCAN`.

```
EXPLAIN ANALYZE
  SELECT EMPLOYEE_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEES
  WHERE SALARY > 10000;

Seq Scan on employees (cost=0.00..3.34 rows=15 width=18) (actual time=0.012..0.036 rows=15 loops=1)
Filter: (salary > '10000'::numeric)
Rows Removed by Filter: 92
Planning time: 0.069 ms
Execution time: 0.052 ms
(5 rows)
```

PostgreSQL can perform several scan types for processing and retrieving data from tables including sequential scans, index scans, and bitmap index scans. The sequential scan (`Seq Scan`) is PostgreSQL equivalent for Oracle `Table access full` (full table scan).

For more information, see [EXPLAIN](https://www.postgresql.org/docs/13/sql-explain.html) in the *PostgreSQL documentation*.

# Oracle and PostgreSQL table statistics
<a name="chap-oracle-aurora-pg.tuning.statistics"></a>

With AWS DMS, you can analyze table statistics for your Oracle and PostgreSQL databases to optimize query performance and storage utilization. Table statistics provide information about the data distribution and storage characteristics of database tables, including row counts, data sizes, and index usage.


| Feature compatibility |  AWS SCT / AWS DMS automation level |  AWS SCT action code index | Key differences | 
| --- | --- | --- | --- | 
|   ![\[Three star feature compatibility\]](http://docs.aws.amazon.com/dms/latest/oracle-to-aurora-postgresql-migration-playbook/images/pb-compatibility-3.png)   |  N/A  |  N/A  |  Syntax and option differences, similar functionality  | 

## Oracle usage
<a name="chap-oracle-aurora-pg.tuning.statistics.ora"></a>

Table statistics are one of the important aspects affecting SQL query performance. They enable the query optimizer to make informed assumptions when deciding how to generate the execution plan for each query. Oracle provides the DBMS\$1STATS package to manage and control the table statistics, which can be collected automatically or manually.

The following statistics are usually collected on database tables and indexes:
+ Number of table rows.
+ Number of table blocks.
+ Number of distinct values or nulls.
+ Data distribution histograms.

### Automatic optimizer statistics collection
<a name="chap-oracle-aurora-pg.tuning.statistics.ora.auto"></a>

By default, Oracle collects table and index statistics during predefined maintenance windows using the database scheduler and automated maintenance tasks. The automatic statistics collection mechanism uses Oracle’s data modification monitoring feature that tracks the approximate number of `INSERT`, `UPDATE`, and `DELETE` statements to determine which table statistics should be collected.

Oracle 19 now allows to gather real-time statistics on tables during regular `UPDATE`, `INSERT`, and `DELETE` operations, which ensures that statistics are always up-to-date and are not going stale.

Oracle 19 also introduces High-Frequency Automatic Optimizer Statistics Collection with an ability to set up automatic task that will collect statistics for stale objects.

### Manual optimizer statistics collection
<a name="chap-oracle-aurora-pg.tuning.statistics.ora.manual"></a>

When the automatic statistics collection is not suitable for a particular use case, the optimizer statistics collection can be performed manually at several levels.


| Statistics level | Description | 
| --- | --- | 
|  GATHER\$1INDEX\$1STATS  |  Index statistics.  | 
|  GATHER\$1TABLE\$1STATS  |  Table, column, and index statistics.  | 
|  GATHER\$1SCHEMA\$1STATS  |  Statistics for all objects in a schema.  | 
|  GATHER\$1DICTIONARY\$1STATS  |  Statistics for all dictionary objects.  | 
|  GATHER\$1DATABASE\$1STATS  |  Statistics for all objects in a database.  | 

 **Examples** 

Collect statistics at the table level from the `HR` schema and the `EMPLOYEES` table.

```
BEGIN
DBMS_STATS.GATHER_TABLE_STATS('HR','EMPLOYEES');
END;
/

PL/SQL procedure successfully completed.
```

Collect statistics at a specific column level from the `HR` schema, the `EMPLOYEES` table, and the `DEPARTMENT_ID` column.

```
BEGIN
DBMS_STATS.GATHER_TABLE_STATS('HR','EMPLOYEES',
METHOD_OPT=>'FOR COLUMNS department_id');
END;
/

PL/SQL procedure successfully completed.
```

For more information, see [Optimizer Statistics Concepts](https://docs.oracle.com/en/database/oracle/oracle-database/19/tgsql/optimizer-statistics-concepts.html#GUID-C0E74ACE-2706-48A1-97A2-33F52207166A) in the *Oracle documentation*.

## PostgreSQL usage
<a name="chap-oracle-aurora-pg.tuning.statistics.pg"></a>

Use the `ANALYZE` command to collect statistics about a database, a table or a specific table column. The PostgreSQL `ANALYZE` command collects table statistics which support generation of efficient query execution plans by the query planner.
+  **Histograms** — `ANALYZE` will collect statistics on table columns values and create a histogram of the approximate data distribution in each column.
+  **Pages and rows** — `ANALYZE` will collect statistics on the number of database pages and rows from which each table is comprised.
+  **Data sampling** — For large tables, the `ANALYZE` command will take random samples of values rather than examining each and every single row. This allows the `ANALYZE` command to scan very large tables in a relatively small amount of time.
+  **Statistic collection granularity** — Running the `ANALYZE` command without any parameter will instruct PostgreSQL to examine every table in the current schema. Supplying the table name or column name to the `ANALYZE`, will instruct the database to examine a specific table or table column.

### PostgreSQL automatic statistics collection
<a name="chap-oracle-aurora-pg.tuning.statistics.pg.auto"></a>

By default, PostgreSQL is configured with an autovacuum daemon, which automates the execution of statistics collection using the `ANALYZE` commands (in addition to automation of the `VACUUM` command). The autovacuum daemon scans for tables which show signs of large modifications in data to collect the current statistics. Autovacuum is controlled by several parameters.

Individual tables have several storage parameters which can trigger autovacuum process sooner or later. These parameters, such as `autovacuum_enabled`, `autovacuum_vacuum_threshold`, and others can be set or changed using `CREATE TABLE` or `ALTER TABLE` statements.

```
ALTER TABLE custom_autovaccum
  SET (autovacuum_enabled = true,
    autovacuum_vacuum_cost_delay = 10ms,
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_analyze_scale_factor = 0.005);
```

The preceding command enables `autovaccum` for the `custom_autovaccum` table and will specify the `autovacuum` process to sleep for 10 milliseconds each run.

It also specifies a 1% of the table size to be added to `autovacuum_vacuum_threshold` and 0.5% of the table size to be added to `autovacuum_analyze_threshold` when deciding whether to trigger a VACUUM.

For more information, see [Automatic Vacuuming](https://www.postgresql.org/docs/13/runtime-config-autovacuum.html) in the *PostgreSQL documentation*.

### PostgreSQL manual statistics collection
<a name="chap-oracle-aurora-pg.tuning.statistics.pg.manual"></a>

PostgreSQL allows collecting statistics on-demand using the `ANALYZE` command at a database level, table level or column level.
+  `ANALYZE` on indexes is not currently supported.
+  `ANALYZE` requires only a read-lock on the target table, so it can run in parallel with other activity on the table.
+ For large tables, `ANALYZE` takes a random sample of the table contents. Configured via the show `default_statistics_target` parameter. The default value is 100 entries. Raising the limit might allow more accurate planner estimates to be made at the price of consuming more space in the `pg_statistic` table.

 **Examples** 

Gather statistics for the entire database.

```
ANALYZE;
```

Gather statistics for a specific table. The `VERBOSE` keyword displays progress.

```
ANALYZE VERBOSE EMPLOYEES;
```

Gather statistics for a specific column.

```
ANALYZE EMPLOYEES (HIRE_DATE);
```

Specify the `default_statistics_target` parameter for an individual table column and reset it back to default.

```
ALTER TABLE EMPLOYEES ALTER COLUMN SALARY SET STATISTICS 150;

ALTER TABLE EMPLOYEES ALTER COLUMN SALARY SET STATISTICS -1;
```

Larger values increase the time needed to complete an `ANALYZE`, but improve the quality of the collected planner’s statistics which can potentially lead to better execution plans.

View the current (session / global) `default_statistics_target`, modify it to 150 and analyze the `EMPLOYEES` table.

```
SHOW default_statistics_target ;
SET default_statistics_target to 150;
ANALYZE EMPLOYEES;
```

View the last time statistics were collected for a table.

```
SELECT relname, last_analyze FROM pg_stat_all_tables;
```

## Summary
<a name="chap-oracle-aurora-pg.tuning.statistics.summary"></a>


| Feature | Oracle | PostgreSQL | 
| --- | --- | --- | 
|  Analyze a specific database table  |  <pre>BEGIN dbms_stats.gather_table_stats(<br />  ownname =>'hr',<br />  tabname => 'employees' , …<br />);<br />END;</pre>  |  <pre>ANALYZE EMPLOYEES;</pre>  | 
|  Analyze a database table while only sampling certain rows  |  Configure using percentage of table rows to sample. <pre>BEGIN dbms_stats.gather_tabke_stats(<br />  ownname=>'HR', …<br />  ESTIMATE_PERCENT=>100);<br />END;</pre>  |  Configure using the number of entries for the table. <pre>SET default_statistics_target to 150;<br />ANALYZE EMPLOYEES;</pre>  | 
|  Collect statistics for a schema  |  <pre>BEGIN<br />EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS(<br />  ownname => 'HR');<br />END</pre>  |  <pre>ANALYZE;</pre>  | 
|  View last time statistics were collected  |  <pre>select owner, table_name, last_analyzed;</pre>  |  <pre>select relname, last_analyze<br />  from pg_stat_all_tables;</pre>  | 

For more information, see [ANALYZE](https://www.postgresql.org/docs/13/sql-analyze.html) and [The Autovacuum Daemon](https://www.postgresql.org/docs/13/routine-vacuuming.html#AUTOVACUUM) in the *PostgreSQL documentation*.