

# Catalog-level table optimizers
<a name="catalog-level-optimizers"></a>

With a one-time catalog configuration, you can set up automatic optimizers such as compaction, snapshot retention, and orphan file deletion for all new and updated Apache Iceberg tables in the AWS Glue Data Catalog. Catalog-level optimizer configurations allow you to apply consistent optimizer settings across all tables within a catalog, eliminating the need to configure optimizers individually for each table.

Data lake administrators can configure the table optimizers by selecting the default catalog in the Lake Formation console and enabling optimizers using the `Table optimization` option. When you create new tables or update existing tables in the Data Catalog, the Data Catalog automatically runs the table optimizations to reduce operational burden.

If you have configured optimization at the table level, or if you have previously deleted the table optimization settings for a table, those table-specific settings take precedence over the default catalog settings for table optimization. If a configuration parameter is not defined at either the table level or the catalog level, the Iceberg table property value is applied. This fallback applies to the snapshot retention and orphan file deletion optimizers.
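The precedence order described above can be sketched as a simple lookup chain. This is a minimal illustration, not the service's implementation: the function name `resolve_parameter` is hypothetical, the settings are modeled as flat dictionaries, and the Iceberg table properties are keyed by the same parameter names only to keep the example short.

```python
from typing import Mapping, Optional, Tuple


def resolve_parameter(
    name: str,
    table_settings: Mapping[str, str],
    catalog_settings: Mapping[str, str],
    iceberg_properties: Mapping[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (source, value) for the first layer that defines the parameter.

    Lookup order: table-level settings, then catalog-level settings,
    then Iceberg table properties. Returns (None, None) if no layer
    defines the parameter.
    """
    layers = (
        ("table", table_settings),
        ("catalog", catalog_settings),
        ("iceberg", iceberg_properties),
    )
    for source, settings in layers:
        if name in settings:
            return source, settings[name]
    return None, None


# Hypothetical example values.
table = {"snapshotRetentionPeriodInDays": "7"}
catalog = {"snapshotRetentionPeriodInDays": "10", "numberOfSnapshotsToRetain": "5"}
iceberg = {"orphanFileRetentionPeriodInDays": "3"}
source, value = resolve_parameter("numberOfSnapshotsToRetain", table, catalog, iceberg)
```

In this example, `numberOfSnapshotsToRetain` is not set on the table, so the lookup falls through to the catalog-level value.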

When enabling catalog-level optimizers, consider the following:
+ If you configure optimization settings when you create a catalog and later disable the optimizations through an `UpdateCatalog` request, the change cascades to the tables in the catalog that inherit the catalog-level settings.
+ Tables that already have their own optimizer configuration keep their specific settings and are not affected when you disable optimizers at the catalog level. Tables without their own optimizer configuration inherit the disabled state from the catalog.
+ Because the snapshot retention and orphan file deletion optimizers can run on a schedule, each update adds a random delay to the start of the schedule. As a result, the optimizers start at slightly different times, which spreads out the load and reduces the likelihood of exceeding service limits.
+ Catalog-level optimizer settings are not automatically inherited by tables when AWS Glue Data Catalog encryption is enabled. If your catalog has metadata encryption enabled, you must configure table optimizers individually for each table. To use catalog-level optimizer inheritance, metadata encryption must be disabled on the catalog.
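The effect of disabling optimizers at the catalog level can be sketched as follows. This is an illustrative model only: the `Table` class and the `disable_at_catalog_level` function are hypothetical names, and the sketch covers just the inheritance rule from the list above, not encryption or scheduling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Table:
    name: str
    has_own_config: bool  # optimizers were configured on the table itself
    enabled: bool         # current optimizer state for the table


def disable_at_catalog_level(tables: List[Table]) -> List[Table]:
    """Return the table states after optimizers are disabled on the catalog.

    Tables with their own optimizer configuration are left unchanged.
    Tables that rely on the catalog defaults inherit the disabled state.
    """
    result = []
    for table in tables:
        if table.has_own_config:
            result.append(table)
        else:
            result.append(Table(table.name, has_own_config=False, enabled=False))
    return result


# Hypothetical tables: one configured directly, one inheriting from the catalog.
before = [
    Table("orders", has_own_config=True, enabled=True),
    Table("events", has_own_config=False, enabled=True),
]
after = disable_at_catalog_level(before)
```

Here `orders` stays enabled because it has its own configuration, while `events` becomes disabled.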

**Topics**
+ [Enabling catalog-level automatic table optimization](enable-auto-table-optimizers.md)
+ [Viewing catalog-level optimizations](view-catalog-optimizations.md)
+ [Disabling catalog-level table optimization](disable-auto-table-optimizers.md)

# Enabling catalog-level automatic table optimization
<a name="enable-auto-table-optimizers"></a>

You can enable automatic table optimization for all new Apache Iceberg tables in the Data Catalog. After you create a table, you can also update its table optimization settings manually.

To update the Data Catalog settings and enable catalog-level table optimizations, the IAM role that you use must have the `glue:UpdateCatalog` permission on the root catalog. You can use the `GetCatalog` API operation to verify the catalog properties.
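As a rough sketch, the catalog properties can be read back with the AWS SDK for Python (Boto3). The helper names below are hypothetical, and the response layout (`Catalog` → `CatalogProperties` → `IcebergOptimizationProperties`) is assumed to mirror the request structure used with `UpdateCatalog`; confirm it against the API reference before relying on it.

```python
from typing import Any, Dict


def iceberg_optimization_settings(get_catalog_response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the Iceberg optimizer settings from a GetCatalog response.

    Returns an empty dict if the catalog has no optimizer settings.
    The nesting used here is an assumption about the response shape.
    """
    catalog = get_catalog_response.get("Catalog", {})
    properties = catalog.get("CatalogProperties", {})
    return properties.get("IcebergOptimizationProperties", {})


def fetch_optimization_settings(catalog_id: str) -> Dict[str, Any]:
    """Call GetCatalog and return the optimizer settings (requires AWS credentials)."""
    import boto3  # imported lazily so the helper above works without the SDK

    glue = boto3.client("glue")
    return iceberg_optimization_settings(glue.get_catalog(CatalogId=catalog_id))


# Hypothetical response fragment, used here instead of a live API call.
sample_response = {
    "Catalog": {
        "CatalogProperties": {
            "IcebergOptimizationProperties": {"Compaction": {"enabled": "true"}}
        }
    }
}
settings = iceberg_optimization_settings(sample_response)
```

Calling `fetch_optimization_settings("111122223333")` would perform the live lookup for that account's catalog.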

For Lake Formation managed tables, the IAM role that you select during catalog optimization configuration requires the Lake Formation `ALTER`, `DESCRIBE`, `INSERT`, and `DELETE` permissions on any new or updated tables.
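One way to grant these permissions is the Lake Formation `GrantPermissions` operation. The sketch below builds the request for all tables in a single database by using a table wildcard; the function name `optimizer_grant_request` and the database name are hypothetical, and whether a wildcard grant fits your setup is an assumption to check against your own permission model.

```python
from typing import Any, Dict


def optimizer_grant_request(role_arn: str, database_name: str) -> Dict[str, Any]:
    """Build GrantPermissions arguments for the optimizer role.

    The table wildcard targets every table in the database, including
    tables created later.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "Table": {"DatabaseName": database_name, "TableWildcard": {}}
        },
        "Permissions": ["ALTER", "DESCRIBE", "INSERT", "DELETE"],
    }


def grant_optimizer_permissions(role_arn: str, database_name: str) -> None:
    """Send the grant to Lake Formation (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    lakeformation = boto3.client("lakeformation")
    lakeformation.grant_permissions(**optimizer_grant_request(role_arn, database_name))


# Hypothetical role and database names.
request = optimizer_grant_request(
    "arn:aws:iam::111122223333:role/optimizer-role-name", "example_db"
)
```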

## To enable catalog-level optimizers (console)
<a name="enable-catalog-optimizers-console"></a>

1. Open the Lake Formation console at [https://console.aws.amazon.com/lakeformation/](https://console.aws.amazon.com/lakeformation/).

1. In the navigation pane, choose **Data Catalog**.

1. Select the **Catalogs** tab.

1. Choose the account-level catalog.

1. Choose the **Table optimizations** tab, and then choose **Edit**. Alternatively, choose **Edit optimizations** from the **Actions** menu.  
![\[The screenshot shows the edit option to enable optimizations at the catalog-level.\]](http://docs.aws.amazon.com/glue/latest/dg/images/catalog-edit-optimizations.png)

1. On the **Table optimization** page, configure the following options:  
![\[The screenshot shows the optimization options at the catalog-level.\]](http://docs.aws.amazon.com/glue/latest/dg/images/catalog-optimization-options.png)

   1. Configure **Compaction** settings:
      + Enable/disable compaction.
      + Choose the IAM role that has the necessary permissions to run the optimizers.

        For more information on the permission requirements for the IAM role, see [Table optimization prerequisites](optimization-prerequisites.md).

   1. Configure **Snapshot retention** settings:
      + Enable/disable retention.
      + Set snapshot retention period in days - default is 5 days.
      + Set number of snapshots to retain - default is 1 snapshot.
      + Enable/disable cleaning of expired files.

   1. Configure **Orphan file deletion** settings:
      + Enable/disable orphan file deletion.
      + Set orphan file retention period in days - default is 3 days.

1. Choose **Save**.

## To enable catalog-level optimizers (AWS CLI)
<a name="catalog-auto-optimizers-cli"></a>

Use the following CLI command to update an existing catalog with optimizer settings:

**Example Update catalog with optimizer settings**  

```
aws glue update-catalog \
  --cli-input-json \
  '{
    "CatalogId": "111122223333",
    "CatalogInput": {
        "CatalogProperties": {
            "CustomProperties": {
                "ColumnStatistics.Enabled": "false",
                "ColumnStatistics.RoleArn": "arn:aws:iam::111122223333:role/service-role/stats-role-name"
            },
            "IcebergOptimizationProperties": {
                "RoleArn": "arn:aws:iam::111122223333:role/optimizer-role-name",
                "Compaction": {
                    "enabled": "true"
                },
                "Retention": {
                    "enabled": "true",
                    "snapshotRetentionPeriodInDays": "10",
                    "numberOfSnapshotsToRetain": "5",
                    "cleanExpiredFiles": "true"
                },
                "OrphanFileDeletion": {
                    "enabled": "true",
                    "orphanFileRetentionPeriodInDays": "3"
                }
            }
        }
    }
}'
```
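The same request can be sketched with the AWS SDK for Python (Boto3). The function names below are hypothetical, and the values mirror the CLI example above. How `UpdateCatalog` treats properties that are omitted from the input (for example, existing `CustomProperties`) is not covered here, so check that behavior before running this against a catalog with existing settings.

```python
from typing import Any, Dict


def build_catalog_input(
    role_arn: str,
    snapshot_retention_days: int = 10,
    snapshots_to_retain: int = 5,
    orphan_file_retention_days: int = 3,
) -> Dict[str, Any]:
    """Build a CatalogInput that enables all three Iceberg optimizers.

    Values are passed as strings, matching the CLI example.
    """
    return {
        "CatalogProperties": {
            "IcebergOptimizationProperties": {
                "RoleArn": role_arn,
                "Compaction": {"enabled": "true"},
                "Retention": {
                    "enabled": "true",
                    "snapshotRetentionPeriodInDays": str(snapshot_retention_days),
                    "numberOfSnapshotsToRetain": str(snapshots_to_retain),
                    "cleanExpiredFiles": "true",
                },
                "OrphanFileDeletion": {
                    "enabled": "true",
                    "orphanFileRetentionPeriodInDays": str(orphan_file_retention_days),
                },
            }
        }
    }


def enable_catalog_optimizers(catalog_id: str, role_arn: str) -> None:
    """Send the UpdateCatalog request (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    glue = boto3.client("glue")
    glue.update_catalog(CatalogId=catalog_id, CatalogInput=build_catalog_input(role_arn))


catalog_input = build_catalog_input("arn:aws:iam::111122223333:role/optimizer-role-name")
```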

If you encounter issues with catalog-level optimizers, check the following:
+ Ensure that the IAM role has the correct permissions, as outlined in [Table optimization prerequisites](optimization-prerequisites.md).
+ Check CloudWatch logs for any error messages related to optimizer operations.

   For more information, see [View available metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/viewing_metrics_with_cloudwatch.html) in the *Amazon CloudWatch User Guide*. 
+ Verify that the catalog settings were successfully applied by checking the catalog configuration.
+ For table access failures, check the CloudWatch logs and EventBridge notifications for detailed error information.

# Viewing catalog-level optimizations
<a name="view-catalog-optimizations"></a>

When catalog-level table optimization is enabled, an equivalent table-level setting is created whenever an Apache Iceberg table is created or updated through the `CreateTable` or `UpdateTable` API operations, whether the call comes from the AWS Management Console, an SDK, or an AWS Glue crawler.

After you create or update a table, you can check the table details to confirm that the optimization was applied. Under `Table optimization`, the `Configuration source` property is set to `Catalog`.

![\[An Apache Iceberg table to which the catalog-level optimization configuration has been applied.\]](http://docs.aws.amazon.com/glue/latest/dg/images/catalog-optimization-enabled.png)
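The same check can be sketched programmatically with the `GetTableOptimizer` operation. The helper names are hypothetical, and the `configurationSource` field name and its value are assumptions about how the console's `Configuration source` property appears in the API response; verify both against the API reference.

```python
from typing import Any, Dict, Optional


def configuration_source(get_table_optimizer_response: Dict[str, Any]) -> Optional[str]:
    """Return the configuration source reported for a table optimizer.

    Assumes the response carries a `configurationSource` field under
    `TableOptimizer`; returns None if the field is absent.
    """
    optimizer = get_table_optimizer_response.get("TableOptimizer", {})
    return optimizer.get("configurationSource")


def fetch_configuration_source(
    catalog_id: str, database_name: str, table_name: str, optimizer_type: str
) -> Optional[str]:
    """Call GetTableOptimizer and return its configuration source (requires AWS credentials)."""
    import boto3  # imported lazily so the helper above works without the SDK

    glue = boto3.client("glue")
    response = glue.get_table_optimizer(
        CatalogId=catalog_id,
        DatabaseName=database_name,
        TableName=table_name,
        Type=optimizer_type,  # for example, "compaction"
    )
    return configuration_source(response)


# Hypothetical response fragment, used here instead of a live API call.
sample = {"TableOptimizer": {"type": "compaction", "configurationSource": "catalog"}}
source = configuration_source(sample)
```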


# Disabling catalog-level table optimization
<a name="disable-auto-table-optimizers"></a>

You can disable table optimization for new tables by using the AWS Lake Formation console or the `glue:UpdateCatalog` API.

**To disable the table optimizations at the catalog level**

1. Open the Lake Formation console at [https://console.aws.amazon.com/lakeformation/](https://console.aws.amazon.com/lakeformation/).

1. On the left navigation bar, choose **Catalogs**.

1. On the **Catalog summary** page, choose **Edit** under **Table optimizations**.

1. On the **Edit optimization** page, clear the **Optimization options**.

1. Choose **Save**.
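For the API route, a request that disables the optimizers might look like the following sketch. It assumes that setting `enabled` to `"false"` for each optimizer, in the same structure as the enable example, is how the disabled state is expressed, and that the role ARN is still supplied. Both points are assumptions, and the function names are hypothetical.

```python
from typing import Any, Dict


def build_disable_input(role_arn: str) -> Dict[str, Any]:
    """Build a CatalogInput that turns off all three Iceberg optimizers.

    Mirrors the structure of the enable request, with each optimizer's
    `enabled` flag set to the string "false".
    """
    return {
        "CatalogProperties": {
            "IcebergOptimizationProperties": {
                "RoleArn": role_arn,
                "Compaction": {"enabled": "false"},
                "Retention": {"enabled": "false"},
                "OrphanFileDeletion": {"enabled": "false"},
            }
        }
    }


def disable_catalog_optimizers(catalog_id: str, role_arn: str) -> None:
    """Send the UpdateCatalog request (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    glue = boto3.client("glue")
    glue.update_catalog(CatalogId=catalog_id, CatalogInput=build_disable_input(role_arn))


disable_input = build_disable_input("arn:aws:iam::111122223333:role/optimizer-role-name")
```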