Amazon Redshift will no longer support the creation of new Python UDFs starting Patch 198. Existing Python UDFs will continue to function until June 30, 2026. For more information, see the [ blog post ](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/). 

# WLM query monitoring rules
<a name="cm-c-wlm-query-monitoring-rules"></a>

In Amazon Redshift workload management (WLM), query monitoring rules define metrics-based performance boundaries for WLM queues and specify what action to take when a query goes beyond those boundaries. For example, for a queue dedicated to short running queries, you might create a rule that cancels queries that run for more than 60 seconds. To track poorly designed queries, you might have another rule that logs queries that contain nested loops. 

You define query monitoring rules as part of your workload management (WLM) configuration. You can define up to 25 rules for each queue, with a limit of 25 rules for all queues. Each rule includes up to three conditions, or predicates, and one action. A *predicate* consists of a metric, a comparison condition (=, <, or > ), and a value. If all of the predicates for any rule are met, that rule's action is triggered. Possible rule actions are log, hop, and abort, as discussed following. 

The rules in a given queue apply only to queries running in that queue. A rule is independent of other rules. 

WLM evaluates metrics every 10 seconds. Amazon Redshift applies query monitoring rules at the child query level when queries are automatically rewritten. If more than one rule is triggered during the same period, WLM chooses the rule with the most severe action. If the action for two rules has the same severity, WLM runs the rules in alphabetical order, based on the rule name. If the action is hop or abort, the action is logged and the query is evicted from the queue. If the action is log, the query continues to run in the queue. WLM initiates only one log action per query per rule. If the queue contains other rules, those rules remain in effect. If the action is hop and the query is routed to another queue, the rules for the new queue apply. For more information about query monitoring and tracking actions taken on specific queries, see the collection of samples at [Short query acceleration](wlm-short-query-acceleration.md).

When all of a rule's predicates are met, WLM writes a row to the [STL\_WLM\_RULE\_ACTION](r_STL_WLM_RULE_ACTION.md) system table. In addition, Amazon Redshift records query metrics for currently running queries to [STV\_QUERY\_METRICS](r_STV_QUERY_METRICS.md). Metrics for completed queries are stored in [STL\_QUERY\_METRICS](r_STL_QUERY_METRICS.md). 

**Note**  
For Amazon Redshift Serverless, you can configure query queues and monitoring rules using the `wlm_json_configuration` parameter. This allows you to create multiple queues with different user roles, query groups, and monitoring rules. For more information about configuring serverless query queues, see [Setting query queues](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-workgroup-query-queues.html) in the *Amazon Redshift Management Guide*.

## Defining a query monitoring rule
<a name="cm-c-wlm-defining-query-monitoring-rules"></a>

You create query monitoring rules as part of your WLM configuration, which you define as part of your cluster's parameter group definition.

You can create rules using the AWS Management Console or programmatically using JSON. 

**Note**  
If you choose to create rules programmatically, we strongly recommend using the console to generate the JSON that you include in the parameter group definition. For more information, see [Creating a query monitoring rule](https://docs.aws.amazon.com/redshift/latest/mgmt/parameter-group-modify-qmr-console.html) and [Configuring Parameter Values Using the AWS CLI](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-parameter-groups.html#configure-parameters-using-the-cli) in the *Amazon Redshift Management Guide*. 

To define a query monitoring rule, you specify the following elements:
+ A rule name – Rule names must be unique within the WLM configuration. Rule names can be up to 32 alphanumeric characters or underscores, and can't contain spaces or quotation marks. You can have up to 25 rules per queue, and the total limit for all queues is 25 rules.
+ One or more predicates – You can have up to three predicates per rule. If all the predicates for any rule are met, the associated action is triggered. A predicate is defined by a metric name, an operator ( =, <, or > ), and a value. An example is `query_cpu_time > 100000`. For a list of metrics and examples of values for different metrics, see [Query monitoring metrics for Amazon Redshift provisioned](#cm-c-wlm-query-monitoring-metrics) following in this section. 
+ An action – If more than one rule is triggered, WLM chooses the rule with the most severe action. Possible actions, in ascending order of severity, are:
  + Log – Record information about the query in the STL\_WLM\_RULE\_ACTION system table. Use the Log action when you want to only write a log record. WLM creates at most one log per query, per rule. Following a log action, other rules remain in force and WLM continues to monitor the query. 
  + Hop (only available with manual WLM) – Log the action and hop the query to the next matching queue. If there isn't another matching queue, the query is canceled. QMR hops only [CREATE TABLE AS](https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_TABLE_AS.html) (CTAS) statements and read-only queries, such as SELECT statements. For more information, see [WLM query queue hopping](wlm-queue-hopping.md). 
  + Abort – Log the action and cancel the query. QMR doesn't stop COPY statements and maintenance operations, such as ALTER, ANALYZE and VACUUM. 
  + Change priority (only available with automatic WLM) – Change the priority of a query. 

To limit the runtime of queries, we recommend creating a query monitoring rule instead of using WLM timeout. For example, you can set `max_execution_time` to 50,000 milliseconds as shown in the following JSON snippet.

```
"max_execution_time": 50000
```

But we recommend instead that you define an equivalent query monitoring rule. The following example demonstrates a query monitoring rule that sets `query_execution_time` to 50 seconds:

```
"rules": 
[
    {
        "rule_name": "rule_query_execution",
        "predicate": [
            {
                "metric_name": "query_execution_time",
                "operator": ">",
                "value": 50
            }
        ],
        "action": "abort"
    }            
]
```

For steps to create or modify a query monitoring rule, see [Creating a query monitoring rule](https://docs.aws.amazon.com/redshift/latest/mgmt/parameter-group-modify-qmr-console.html) and [Properties in the wlm\_json\_configuration Parameter](https://docs.aws.amazon.com/redshift/latest/mgmt/workload-mgmt-config.html#wlm-json-config-properties) in the *Amazon Redshift Management Guide*.

You can find more information about query monitoring rules in the following topics: 
+  [Query monitoring metrics for Amazon Redshift provisioned](#cm-c-wlm-query-monitoring-metrics) 
+  [Query monitoring rules templates](#cm-c-wlm-query-monitoring-templates) 
+  [Creating a query monitoring rule](https://docs.aws.amazon.com/redshift/latest/mgmt/parameter-group-modify-qmr-console.html) 
+  [Configuring Workload Management](https://docs.aws.amazon.com/redshift/latest/mgmt/workload-mgmt-config.html) 
+  [System tables and views for query monitoring rules](#cm-c-wlm-qmr-tables-and-views) 

## Query monitoring metrics for Amazon Redshift provisioned
<a name="cm-c-wlm-query-monitoring-metrics"></a>

The following table describes the metrics used in query monitoring rules. (These metrics are distinct from the metrics stored in the [STV\_QUERY\_METRICS](r_STV_QUERY_METRICS.md) and [STL\_QUERY\_METRICS](r_STL_QUERY_METRICS.md) system tables.) 

For a given metric, the performance threshold is tracked either at the query level or the segment level. For more information about segments and steps, see [Query planning and execution workflow](c-query-planning.md).

**Note**  
The [WLM timeout](cm-c-defining-query-queues.md#wlm-timeout) parameter is distinct from query monitoring rules.


| Metric | Name | Description | 
| --- | --- | --- | 
| Query CPU time |  query\_cpu\_time  | CPU time used by the query, in seconds. CPU time is distinct from Query execution time. Valid values are 0–999,999.  | 
| Blocks read |  query\_blocks\_read  | Number of 1 MB data blocks read by the query.Valid values are 0–1,048,575.  | 
| Scan row count |  scan\_row\_count  | The number of rows in a scan step. The row count is the total number of rows emitted before filtering rows marked for deletion (ghost rows) and before applying user-defined query filters.<br />Valid values are 0–999,999,999,999,999.  | 
| Query execution time |  query\_execution\_time  | Elapsed execution time for a query, in seconds. Execution time doesn't include time spent waiting in a queue.Valid values are 0–86,399. | 
| Query queue time |  query\_queue\_time  | Time spent waiting in a queue, in seconds. Valid values are 0–86,399.  | 
| CPU usage |  query\_cpu\_usage\_percent  | Percent of CPU capacity used by the query.Valid values are 0–6,399.  | 
| Memory to disk |  query\_temp\_blocks\_to\_disk  | Temporary disk space used to write intermediate results, in 1 MB blocks.Valid values are 0–319,815,679.  | 
| CPU skew |  cpu\_skew  | The ratio of maximum CPU usage for any slice to average CPU usage for all slices. This metric is defined at the segment level.Valid values are 0–99.  | 
| I/O skew |  io\_skew  | The ratio of maximum blocks read (I/O) for any slice to average blocks read for all slices. This metric is defined at the segment level.Valid values are 0–99.  | 
| Rows joined |  join\_row\_count  | The number of rows processed in a join step.Valid values are 0–999,999,999,999,999.  | 
| Nested loop join row count |  nested\_loop\_join\_row\_count  | The number or rows in a nested loop join.Valid values are 0–999,999,999,999,999.  | 
| Return row count |  return\_row\_count  | The number of rows returned by the query. Valid values are 0–999,999,999,999,999.  | 
| Segment execution time |  segment\_execution\_time  | Elapsed execution time for a single segment, in seconds. To avoid or reduce sampling errors, include segment\_execution\_time > 10 in your rules.Valid values are 0–86,388. | 
| Spectrum scan row count |  spectrum\_scan\_row\_count  | The number of rows of data in Amazon S3 scanned by an Amazon Redshift Spectrum query. Valid values are 0–999,999,999,999,999.  | 
| Spectrum scan size |  spectrum\_scan\_size\_mb  | The size of data in Amazon S3, in MB, scanned by an Amazon Redshift Spectrum query.Valid values are 0–999,999,999,999,999.  | 
| Query priority |  query\_priority  | The priority of the query. Valid values are `HIGHEST`, `HIGH`, `NORMAL`, `LOW`, and `LOWEST`. When comparing `query_priority` using greater than (>) and less than (<) operators, `HIGHEST` is greater than `HIGH`, `HIGH` is greater than `NORMAL`, and so on.  | 

**Note**  
The hop action is not supported with the `query_queue_time` predicate. That is, rules defined to hop when a `query_queue_time` predicate is met are ignored. 
Short segment execution times can result in sampling errors with some metrics, such as `io_skew` and `query_cpu_usage_percent`. To avoid or reduce sampling errors, include segment execution time in your rules. A good starting point is `segment_execution_time > 10`.

The [SVL\_QUERY\_METRICS](r_SVL_QUERY_METRICS.md) view shows the metrics for completed queries. The [SVL\_QUERY\_METRICS\_SUMMARY](r_SVL_QUERY_METRICS_SUMMARY.md) view shows the maximum values of metrics for completed queries. Use the values in these views as an aid to determine threshold values for defining query monitoring rules.

## Query monitoring metrics for Amazon Redshift Serverless
<a name="cm-c-wlm-query-monitoring-metrics-serverless"></a>

The following table describes the metrics used in query monitoring rules for Amazon Redshift Serverless. 


| Metric | WLM Predicate Name | Name | Description | 
| --- | --- | --- | --- | 
| Blocks read |  query\_blocks\_read  |  max\_query\_blocks\_read  | Number of 1 MB data blocks read by the query.Valid values are 0–1,048,575.  | 
| Scan row count |  scan\_row\_count  |  max\_scan\_row\_count  | The number of rows in a scan step. The row count is the total number of rows emitted before filtering rows marked for deletion (ghost rows) and before applying user-defined query filters.<br />Valid values are 0–999,999,999,999,999.  | 
| Query execution time |  query\_execution\_time  | max\_query\_execution\_time | Elapsed execution time for a query, in seconds. Execution time doesn't include time spent waiting in a queue. If a query exceeds the set execution time, Amazon Redshift Serverless stops the query.<br />Valid values are 0–86,399.  | 
| Query queue time |  query\_queue\_time  | max\_query\_queue\_time | Time spent waiting in a queue, in seconds. Valid values are 0–86,399.  | 
| Memory to disk |  query\_temp\_blocks\_to\_disk  |  max\_query\_temp\_blocks\_to\_disk  | Temporary disk space used to write intermediate results, in 1 MB blocks.Valid values are 0–319,815,679.  | 
| Rows joined |  join\_row\_count  |  max\_join\_row\_count  | The number of rows processed in a join step.Valid values are 0–999,999,999,999,999.  | 
| Nested loop join row count |  nested\_loop\_join\_row\_count  |  max\_nested\_loop\_join\_row\_count  | The number or rows in a nested loop join.Valid values are 0–999,999,999,999,999.  | 

**Note**  
The hop action is not supported with the `max_query_queue_time` predicate. That is, rules defined to hop when a `max_query_queue_time` predicate is met are ignored. 
Short segment execution times can result in sampling errors with some metrics, such as `max_io_skew` and `max_query_cpu_usage_percent`.

For Amazon Redshift Serverless, you can configure query queues and monitoring rules using the `wlm_json_configuration` parameter. This allows you to create multiple queues with different user roles, query groups, and monitoring rules using the metrics listed above. For more information about configuring serverless query queues, see [WLM JSON configuration structure](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-workgroup-query-queues.html#serverless-wlm-json-configuration) in the *Amazon Redshift Management Guide*.

## Query monitoring rules templates
<a name="cm-c-wlm-query-monitoring-templates"></a>

When you add a rule using the Amazon Redshift console, you can choose to create a rule from a predefined template. Amazon Redshift creates a new rule with a set of predicates and populates the predicates with default values. The default action is log. You can modify the predicates and action to meet your use case. 

The following table lists available templates. 


| Template Name | Predicates | Description | 
| --- | --- | --- | 
| Nested loop join |  nested\_loop\_join\_row\_count > 100  | A nested loop join might indicate an incomplete join predicate, which often results in a very large return set (a Cartesian product). Use a low row count to find a potentially runaway query early. | 
| Query returns a high number of rows |  return\_row\_count > 1000000  | If you dedicate a queue to simple, short running queries, you might include a rule that finds queries returning a high row count. The template uses a default of 1 million rows. For some systems, you might consider one million rows to be high, or in a larger system, a billion or more rows might be high. | 
| Join with a high number of rows |  join\_row\_count > 1000000000  | A join step that involves an unusually high number of rows might indicate a need for more restrictive filters. The template uses a default of 1 billion rows. For an ad hoc (one-time) queue that's intended for quick, simple queries, you might use a lower number.  | 
| High disk usage when writing intermediate results |  query\_temp\_blocks\_to\_disk > 100000  | When currently executing queries use more than the available system RAM, the query execution engine writes intermediate results to disk (spilled memory). Typically, this condition is the result of a rogue query, which usually is also the query that uses the most disk space. The acceptable threshold for disk usage varies based on the cluster node type and number of nodes. The template uses a default of 100,000 blocks, or 100 GB. For a small cluster, you might use a lower number.  | 
| Long running query with high I/O skew | segment\_execution\_time > 120 and io\_skew > 1.30  | I/O skew occurs when one node slice has a much higher I/O rate than the other slices. As a starting point, a skew of 1.30 (1.3 times average) is considered high. High I/O skew is not always a problem, but when combined with a long running query time, it might indicate a problem with the distribution style or sort key.  | 

## System tables and views for query monitoring rules
<a name="cm-c-wlm-qmr-tables-and-views"></a>

When all of a rule's predicates are met, WLM writes a row to the [STL\_WLM\_RULE\_ACTION](r_STL_WLM_RULE_ACTION.md) system table. This row contains details for the query that triggered the rule and the resulting action. 

In addition, Amazon Redshift records query metrics the following system tables and views.
+ The [STV\_QUERY\_METRICS](r_STV_QUERY_METRICS.md) table displays the metrics for currently running queries.
+ The [STL\_QUERY\_METRICS](r_STL_QUERY_METRICS.md) table records the metrics for completed queries. 
+ The [SVL\_QUERY\_METRICS](r_SVL_QUERY_METRICS.md) view shows the metrics for completed queries. 
+ The [SVL\_QUERY\_METRICS\_SUMMARY](r_SVL_QUERY_METRICS_SUMMARY.md) view shows the maximum values of metrics for completed queries.