

# DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse
<a name="amazon-sagemaker-lakehouse-for-DynamoDB"></a>

DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse eliminates the need to build custom data movement pipelines by automatically replicating DynamoDB data to Amazon SageMaker Lakehouse. This no-code integration helps customers run analytics workloads on their DynamoDB data using Amazon SageMaker Lakehouse without consuming any DynamoDB table capacity. The integration automatically exports data from your table and keeps the target fresh, typically within 15 to 30 minutes.

**Topics**
+ [DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse](amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl.md)

# DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse
<a name="amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl"></a>

Setting up an integration between a DynamoDB table and Amazon SageMaker Lakehouse requires some prerequisites. You must configure the IAM roles that AWS Glue uses to access data from the source and write to the target, and you must configure the KMS keys that encrypt the data in the intermediate or target location.

**Topics**
+ [Prerequisites before creating a DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse](#amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-prereqs)
+ [Creating a DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse](amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-getting-started.md)
+ [Viewing CloudWatch metrics for integration](#amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-cloudwatch-metrics)

## Prerequisites before creating a DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse
<a name="amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-prereqs"></a>

To configure a zero-ETL integration with a DynamoDB source, you need to set up a resource-based policy that allows AWS Glue to access and export data from the DynamoDB table. The policy should include the `ExportTableToPointInTime`, `DescribeTable`, and `DescribeExport` permissions, with conditions that restrict access to a specific AWS account and Region. For more information, see [Configuring an Amazon DynamoDB source](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-sources.html#zero-etl-config-source-dynamodb).
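As an illustration of the policy described above, the following Python sketch builds a policy document and attaches it to a table with the boto3 `put_resource_policy` call. The function names, the statement ID, the `"Resource": "*"` element, and the exact condition keys are assumptions made for this example; confirm the policy shape against the linked AWS Glue page before you use it.

```python
import json


def build_export_policy(account_id: str, region: str) -> dict:
    """Build a resource-based policy that lets AWS Glue export the table.

    The condition block limits access to integrations owned by one
    account in one Region (an assumed shape, not an official template).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGlueZeroEtlExport",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": [
                    "dynamodb:ExportTableToPointInTime",
                    "dynamodb:DescribeTable",
                    "dynamodb:DescribeExport",
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "StringLike": {
                        "aws:SourceArn": (
                            f"arn:aws:glue:{region}:{account_id}:integration:*"
                        )
                    },
                },
            }
        ],
    }


def attach_policy(table_arn: str, policy: dict) -> None:
    """Attach the policy to the table. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("dynamodb").put_resource_policy(
        ResourceArn=table_arn, Policy=json.dumps(policy)
    )
```

Keeping the builder separate from the API call makes it possible to review or unit test the policy document before anything is applied to the table.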

Point-in-time recovery (PITR) must also be enabled for the table. You can apply the policy by using AWS CLI commands, and you can restrict access further by specifying the full integration ARN in the policy instead of a wildcard. For more information, see [Prerequisites for setting up a zero-ETL integration](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-prerequisites.html).
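The PITR requirement can be checked and satisfied programmatically. The following is a minimal sketch that uses the boto3 `describe_continuous_backups` and `update_continuous_backups` calls; the helper names are hypothetical and chosen only for this example.

```python
def pitr_request(table_name: str) -> dict:
    """Build the arguments for update_continuous_backups."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


def pitr_is_enabled(description: dict) -> bool:
    """Read the PITR status from a describe_continuous_backups response."""
    pitr = description.get("ContinuousBackupsDescription", {}).get(
        "PointInTimeRecoveryDescription", {}
    )
    return pitr.get("PointInTimeRecoveryStatus") == "ENABLED"


def ensure_pitr(table_name: str) -> None:
    """Enable PITR only if it is not already on. Needs boto3 and credentials."""
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("dynamodb")
    current = client.describe_continuous_backups(TableName=table_name)
    if not pitr_is_enabled(current):
        client.update_continuous_backups(**pitr_request(table_name))
```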

# Creating a DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse
<a name="amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-getting-started"></a>

After you complete the integration prerequisites, you can create, modify, or delete the zero-ETL integration by following the guidance in this section.

## Creating an integration
<a name="amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-getting-started-creating"></a>

**To create an integration**

1. Sign in to the AWS Management Console and open the Amazon DynamoDB console at [https://console.aws.amazon.com/dynamodbv2](https://console.aws.amazon.com/dynamodbv2).

1. In the navigation pane, choose **Integrations**. 

1. Select **Create zero-ETL integration with Amazon SageMaker Lakehouse**, and then choose **Next**.

1. Follow the steps in [Creating an integration](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-common-integration-tasks.html#zero-etl-creating).

For related tasks, see the following:
+ To modify an integration, see [Modifying an integration](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-common-integration-tasks.html#zero-etl-modifying).
+ To delete an integration, see [Deleting an integration](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-common-integration-tasks.html#zero-etl-deleting).
+ To set up a cross-account integration, see [Setting up cross-account integration](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-prerequisites.html#zero-etl-setup-cross-account-integration).
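The console procedure can also be scripted. The following sketch assumes that you use the AWS Glue `create_integration` API through boto3; this page documents only the console flow, so treat the helper names and the choice of parameters as assumptions. The target ARN is supplied by the caller because its format depends on how your Lakehouse target is set up.

```python
from typing import Optional


def integration_request(
    name: str,
    source_table_arn: str,
    target_arn: str,
    kms_key_id: Optional[str] = None,
) -> dict:
    """Build the arguments for the AWS Glue create_integration call."""
    request = {
        "IntegrationName": name,
        "SourceArn": source_table_arn,  # ARN of the DynamoDB table
        "TargetArn": target_arn,        # ARN of the target, supplied by caller
    }
    if kms_key_id:
        # Optional customer managed key for encrypting integration data.
        request["KmsKeyId"] = kms_key_id
    return request


def create_integration(request: dict) -> str:
    """Create the integration and return its ARN. Needs boto3 and credentials."""
    import boto3  # imported here so the builder above works without boto3

    response = boto3.client("glue").create_integration(**request)
    return response["IntegrationArn"]
```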

## Enabling compaction on target Amazon S3 tables
<a name="amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-enabling-compaction"></a>

You can enable compaction to improve query performance in Amazon Athena.

First, complete the prerequisite setup for compaction resources, including configuring the necessary IAM role. For detailed IAM role configuration steps, see [Optimizing tables for compaction](https://docs.aws.amazon.com/lake-formation/latest/dg/data-compaction.html) in the Lake Formation documentation.

To enable compaction on the AWS Glue table created during integration, follow the Lake Formation compaction enabling process. This will help optimize your table's performance and query efficiency.
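If you prefer to script this step, the following sketch assumes that you enable compaction with the AWS Glue `create_table_optimizer` API through boto3. The helper names are hypothetical, and the catalog ID, database name, table name, and role ARN are placeholders for the values created during your integration and compaction setup.

```python
def compaction_request(
    catalog_id: str, database: str, table: str, role_arn: str
) -> dict:
    """Build the arguments for the AWS Glue create_table_optimizer call."""
    return {
        "CatalogId": catalog_id,   # usually the AWS account ID
        "DatabaseName": database,
        "TableName": table,
        "Type": "compaction",
        "TableOptimizerConfiguration": {
            "roleArn": role_arn,   # IAM role from the compaction prerequisites
            "enabled": True,
        },
    }


def enable_compaction(request: dict) -> None:
    """Turn on compaction for the table. Needs boto3 and credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("glue").create_table_optimizer(**request)
```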

## Viewing CloudWatch metrics for integration
<a name="amazon-sagemaker-lakehouse-for-DynamoDB-zero-etl-cloudwatch-metrics"></a>

After an integration is created, you can view the CloudWatch metrics and EventBridge notifications that are generated in your account for each AWS Glue job. For more information, see [Monitoring an integration](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-monitoring.html).