

# DQDL rule type reference
<a name="dqdl-rule-types"></a>

This section provides a reference for each rule type that AWS Glue Data Quality supports.

**Note**  
DQDL doesn't currently support nested or list-type column data.
In the following table, bracketed values in metric names are replaced with the values provided in the rule arguments.
In addition to the arguments listed, rules typically require an expression.

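As a rough illustration of how the bracketed placeholders resolve, the following Python sketch substitutes rule arguments into a metric-name template taken from the table. The `resolve_metric_name` helper is hypothetical and is not part of any AWS SDK. It only mirrors the naming convention described in the note, and the column and alias names are made up for the example.

```python
import re


def resolve_metric_name(template: str, **arguments: str) -> str:
    """Replace each [Placeholder] in a metric template with a rule argument.

    Hypothetical helper for illustration only; not an AWS API.
    """
    def substitute(match: re.Match) -> str:
        # A placeholder may list several names, e.g. [Column1,Column2].
        names = match.group(1).split(",")
        # A missing argument raises KeyError instead of being skipped silently.
        return ",".join(arguments[name] for name in names)

    return re.sub(r"\[([\w,]+)\]", substitute, template)


# Single-column metric: the column name replaces [Column].
print(resolve_metric_name("Column.[Column].Completeness", Column="order_id"))
# -> Column.order_id.Completeness

# Dataset-level metric keyed by a reference dataset alias.
print(resolve_metric_name(
    "Dataset.[ReferenceDatasetAlias].RowCountMatch",
    ReferenceDatasetAlias="reference",
))
# -> Dataset.reference.RowCountMatch
```

Templates without brackets, such as `Dataset.*.RowCount`, pass through unchanged because they contain nothing to substitute.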

| Rule type | Description | Arguments | Reported metrics | Supported as rule? | Supported as analyzer? | Returns row-level results? | Dynamic rule support? | Generates observations? | Supports where clause syntax? | 
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| AggregateMatch | Checks if two datasets match by comparing summary metrics such as total sales amount. Useful for financial institutions that need to verify that all data is ingested from source systems. | One or more aggregations | When the first and second aggregation column names match: `Column.[Column].AggregateMatch` When the first and second aggregation column names differ: `Column.[Column1,Column2].AggregateMatch` | Yes | No | No | No | No | No | 
| AllStatistics | Standalone analyzer to gather multiple metrics for the provided column in a dataset. | A single column name |  For columns of all types: `Dataset.*.RowCount` `Column.[Column].Completeness` `Column.[Column].Uniqueness` Additional metrics for string-valued columns: `ColumnLength metrics` Additional metrics for numeric-valued columns: `ColumnValues metrics`  | No | Yes | No | No | No | No | 
| ColumnCorrelation | Checks how well two columns are correlated. | Exactly two column names | Multicolumn.[Column1,Column2].ColumnCorrelation | Yes | Yes | No | Yes | No | Yes | 
| ColumnCount | Checks if any columns are dropped. | None | `Dataset.*.ColumnCount` | Yes | Yes | No | Yes | Yes | No | 
| ColumnDataType | Checks if a column is compliant with a datatype. | Exactly one column name | Column.[Column].ColumnDataType.Compliance | Yes | No | No | Yes, in row-level threshold expression | No | Yes | 
| ColumnExists | Checks if columns exist in a dataset. This lets customers who build self-service data platforms ensure that certain columns are available. | Exactly one column name | N/A | Yes | No | No | No | No | No | 
| ColumnLength | Checks if length of data is consistent. | Exactly one column name |  `Column.[Column].MaximumLength` `Column.[Column].MinimumLength` Additional metric when row-level threshold provided: `Column.[Column].ColumnValues.Compliance`  | Yes | Yes | Yes, when row-level threshold provided | No | Yes. Only generates observations by analyzing Minimum and Maximum length | Yes | 
| ColumnNamesMatchPattern | Checks if column names match defined patterns. Useful for governance teams to enforce column name consistency. | A regex for column names | `Dataset.*.ColumnNamesPatternMatchRatio` | Yes | No | No | No | No | No | 
| ColumnValues | Checks if data is consistent per defined values. This rule supports regular expressions. | Exactly one column name |  `Column.[Column].Maximum` `Column.[Column].Minimum` Additional metric when row-level threshold provided: `Column.[Column].ColumnValues.Compliance`  | Yes | Yes | Yes, when row-level threshold provided | No | Yes. Only generates observations by analyzing Minimum and Maximum values | Yes | 
| Completeness | Checks for any blank or NULLs in data. | Exactly one column name | `Column.[Column].Completeness` | Yes | Yes | Yes | Yes | Yes | Yes | 
| CustomSql | Lets you implement almost any type of data quality check in SQL. |  A SQL statement (Optional) A row-level threshold  |  `Dataset.*.CustomSQL` Additional metric when row-level threshold provided: `Dataset.*.CustomSQL.Compliance`  | Yes | No | Yes, when row-level threshold provided | Yes | No | No | 
| DataFreshness | Checks if data is fresh. | Exactly one column name | Column.[Column].DataFreshness.Compliance | Yes | No | Yes | No | No | Yes | 
| DatasetMatch | Compares two datasets and identifies if they are in sync. |  Name of a reference dataset A column mapping (Optional) Columns to check for matches  | Dataset.[ReferenceDatasetAlias].DatasetMatch | Yes | No | Yes | Yes | No | No | 
| DistinctValuesCount | Checks for duplicate values. | Exactly one column name | Column.[Column].DistinctValuesCount | Yes | Yes | Yes | Yes | Yes | Yes | 
| DetectAnomalies | Checks for anomalies in another rule type's reported metrics. | A rule type | Metric(s) reported by the rule type argument | Yes | No | No | No | No | No | 
| Entropy | Checks for entropy of the data. | Exactly one column name | Column.[Column].Entropy | Yes | Yes | No | Yes | No | Yes | 
| IsComplete | Checks if 100% of the data is complete. | Exactly one column name | Column.[Column].Completeness | Yes | No | Yes | No | No | Yes | 
| IsPrimaryKey | Checks if a column is a primary key (not NULL and unique). | Exactly one column name |  For single column: `Column.[Column].Uniqueness` For multiple columns: `Multicolumn.[CommaDelimitedColumns].Uniqueness`  | Yes | No | Yes | No | No | Yes | 
| IsUnique | Checks if 100% of the data is unique. | Exactly one column name | Column.[Column].Uniqueness | Yes | No | Yes | No | No | Yes | 
| Mean | Checks if the mean matches the set threshold. | Exactly one column name | Column.[Column].Mean | Yes | Yes | Yes | Yes | No | Yes | 
| ReferentialIntegrity | Checks if two datasets have referential integrity. |  One or more column names from dataset One or more column names from reference dataset  | Column.[ReferenceDatasetAlias].ReferentialIntegrity | Yes | No | Yes | Yes | No | No | 
| RowCount | Checks if record counts match a threshold. | None | `Dataset.*.RowCount` | Yes | Yes | No | Yes | Yes | Yes | 
| RowCountMatch | Checks if record counts between two datasets match. | Reference dataset alias | Dataset.[ReferenceDatasetAlias].RowCountMatch | Yes | No | No | Yes | No | No | 
| StandardDeviation | Checks if standard deviation matches the threshold. | Exactly one column name | Column.[Column].StandardDeviation | Yes | Yes | Yes | Yes | No | Yes | 
| SchemaMatch | Checks if schema between two datasets match. | Reference dataset alias | Dataset.[ReferenceDatasetAlias].SchemaMatch | Yes | No | No | Yes | No | No | 
| Sum | Checks if sum matches a set threshold. | Exactly one column name | Column.[Column].Sum | Yes | Yes | No | Yes | No | Yes | 
| Uniqueness | Checks if uniqueness of dataset matches threshold. | Exactly one column name | Column.[Column].Uniqueness | Yes | Yes | Yes | Yes | No | Yes | 
| UniqueValueRatio | Checks if the unique value ratio matches a threshold. | Exactly one column name | Column.[Column].UniqueValueRatio | Yes | Yes | Yes | Yes | No | Yes | 
| FileFreshness | Checks if files in Amazon S3 are fresh. | File or Folder path and a threshold. |  `Dataset.*.FileFreshness.Compliance` `Dataset.*.FileCount`  | Yes | No | No | No | No | No | 
| FileMatch | Checks if the contents of a file match a checksum or another file. This rule uses checksums to validate whether two files are the same. | Source file or folder path and target file or folder path. | No statistics are generated. | Yes | No | No | No | No | No | 
| FileSize | Checks if the size of a file matches with a specified condition. | File or folder path and threshold. | `Dataset.*.FileSize.Compliance` `Dataset.*.FileCount` `Dataset.*.MaximumFileSize` `Dataset.*.MinimumFileSize`  | Yes | No | No | No | No | No | 
| FileUniqueness | Checks if files are unique using checksums. | File or folder path and threshold. | `Dataset.*.FileUniquenessRatio` `Dataset.*.FileCount`  | Yes | No | No | No | No | No | 

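The "Supported as rule?" and "Supported as analyzer?" columns determine which section of a ruleset a rule type can appear in. The following Python sketch assembles a small DQDL ruleset and rejects entries that the table disallows for a given section. The `SUPPORTED_IN` lookup and the `build_section` helper are hypothetical, the lookup copies only a few rows of the table, and the column names and thresholds are placeholders.

```python
# Subset of the table above: the sections in which each rule type may appear.
# Hypothetical lookup for illustration; extend it from the table as needed.
SUPPORTED_IN = {
    "AllStatistics": {"Analyzers"},
    "Completeness": {"Rules", "Analyzers"},
    "IsComplete": {"Rules"},
    "RowCount": {"Rules", "Analyzers"},
    "Uniqueness": {"Rules", "Analyzers"},
}


def build_section(section: str, entries: list[str]) -> str:
    """Render one DQDL section, rejecting rule types not allowed there.

    Rule types missing from SUPPORTED_IN are rejected as well.
    """
    for entry in entries:
        rule_type = entry.split()[0]  # the rule type is the first token
        if section not in SUPPORTED_IN.get(rule_type, set()):
            raise ValueError(f"{rule_type} is not supported in {section}")
    body = ",\n".join(f"    {entry}" for entry in entries)
    return f"{section} = [\n{body}\n]"


ruleset = "\n".join([
    build_section("Rules", [
        "RowCount > 0",
        'IsComplete "order_id"',
        'Completeness "customer_name" > 0.95',
    ]),
    build_section("Analyzers", [
        "RowCount",
        'AllStatistics "order_id"',
    ]),
])
print(ruleset)
```
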
**Topics**
+ [AggregateMatch](dqdl-rule-types-AggregateMatch.md)
+ [ColumnCorrelation](dqdl-rule-types-ColumnCorrelation.md)
+ [ColumnCount](dqdl-rule-types-ColumnCount.md)
+ [ColumnDataType](dqdl-rule-types-ColumnDataType.md)
+ [ColumnExists](dqdl-rule-types-ColumnExists.md)
+ [ColumnLength](dqdl-rule-types-ColumnLength.md)
+ [ColumnNamesMatchPattern](dqdl-rule-types-ColumnNamesMatchPattern.md)
+ [ColumnValues](dqdl-rule-types-ColumnValues.md)
+ [Completeness](dqdl-rule-types-Completeness.md)
+ [CustomSQL](dqdl-rule-types-CustomSql.md)
+ [DataFreshness](dqdl-rule-types-DataFreshness.md)
+ [DatasetMatch](dqdl-rule-types-DatasetMatch.md)
+ [DistinctValuesCount](dqdl-rule-types-DistinctValuesCount.md)
+ [Entropy](dqdl-rule-types-Entropy.md)
+ [IsComplete](dqdl-rule-types-IsComplete.md)
+ [IsPrimaryKey](dqdl-rule-types-IsPrimaryKey.md)
+ [IsUnique](dqdl-rule-types-IsUnique.md)
+ [Mean](dqdl-rule-types-Mean.md)
+ [ReferentialIntegrity](dqdl-rule-types-ReferentialIntegrity.md)
+ [RowCount](dqdl-rule-types-RowCount.md)
+ [RowCountMatch](dqdl-rule-types-RowCountMatch.md)
+ [StandardDeviation](dqdl-rule-types-StandardDeviation.md)
+ [Sum](dqdl-rule-types-Sum.md)
+ [SchemaMatch](dqdl-rule-types-SchemaMatch.md)
+ [Uniqueness](dqdl-rule-types-Uniqueness.md)
+ [UniqueValueRatio](dqdl-rule-types-UniqueValueRatio.md)
+ [DetectAnomalies](dqdl-rule-types-DetectAnomalies.md)
+ [FileFreshness](dqdl-rule-types-FileFreshness.md)
+ [FileMatch](dqdl-rule-types-FileMatch.md)
+ [FileUniqueness](dqdl-rule-types-FileUniqueness.md)
+ [FileSize](dqdl-rule-types-FileSize.md)