

# Third-party business data catalog integrations

Amazon SageMaker Unified Studio supports metadata synchronization with third-party business data catalog platforms. These integrations keep catalog metadata aligned between Amazon SageMaker Catalog and partner platforms. Teams get a consistent view of their data and AI assets regardless of which tool they use day to day.

With these integrations, you can synchronize key metadata elements such as projects, assets, descriptions, and glossary terms, including their hierarchies. This keeps glossary terms, asset descriptions, and ownership information aligned across platforms without manual reconciliation.

Amazon SageMaker Unified Studio currently integrates with the following third-party catalog platforms:
+ **Atlan** – Bidirectional metadata synchronization between Amazon SageMaker Catalog and Atlan.
+ **Collibra** – Bidirectional metadata synchronization and access request workflow integration between Amazon SageMaker Catalog and Collibra.
+ **Alation** – Metadata extraction from Amazon SageMaker Catalog into Alation.

**Topics**
+ [Atlan integration](atlan-integration.md)
+ [Collibra integration](collibra-integration.md)
+ [Alation integration](alation-integration.md)

# Atlan integration


The integration between Amazon SageMaker Catalog and Atlan enables bidirectional metadata synchronization across both platforms. Atlan is a data workspace that helps business users, analysts, and engineers collaborate on data projects. This integration connects teams working in Atlan with technical teams working in Amazon SageMaker Unified Studio for analytics and machine learning. For detailed setup instructions, see [Unifying governance and metadata across Amazon SageMaker Unified Studio and Atlan](https://aws.amazon.com/blogs/big-data/unifying-governance-and-metadata-across-amazon-sagemaker-unified-studio-and-atlan/).

## Capabilities


The Atlan integration supports the following capabilities:
+ On-demand and scheduled bidirectional metadata synchronization.
+ Synchronization of glossary terms and descriptions, including parent-child relationships.
+ Ingestion of projects, published and subscribed assets, domains, data products, metadata forms, and column descriptions from Amazon SageMaker Catalog into Atlan.
+ Automatic association of glossary terms with related data assets.
+ Real-time reverse sync of metadata updates from Atlan back to Amazon SageMaker Catalog.
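
Preserving parent-child relationships during glossary synchronization generally means a parent term must exist in the target before its children are created. The following is a minimal sketch of that ordering step, not the connector's actual implementation. The function name `order_terms_parent_first` and the input shape (a mapping from term ID to parent ID) are assumptions made for illustration.

```python
from collections import deque


def order_terms_parent_first(terms):
    """Return term IDs ordered so that every parent precedes its children.

    `terms` maps a term ID to its parent ID (None for a top-level term).
    This input shape is hypothetical and chosen only for the example.
    """
    children = {}
    roots = []
    for term_id, parent_id in terms.items():
        if parent_id is None:
            roots.append(term_id)
        elif parent_id not in terms:
            raise ValueError(f"unknown parent {parent_id!r} for term {term_id!r}")
        else:
            children.setdefault(parent_id, []).append(term_id)

    # Breadth-first walk from the top-level terms; sorting keeps output stable.
    ordered = []
    queue = deque(sorted(roots))
    while queue:
        term_id = queue.popleft()
        ordered.append(term_id)
        queue.extend(sorted(children.get(term_id, [])))

    # Terms that were never reached can only be part of a cycle.
    if len(ordered) != len(terms):
        raise ValueError("cycle detected in glossary hierarchy")
    return ordered
```

A sync job could then create terms in the returned order, so that each child can reference a parent that already exists on the other platform.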

## How it works


The integration uses AWS Identity and Access Management (IAM) roles to establish a secure connection between your AWS account and Atlan. You deploy an AWS CloudFormation template that creates the required IAM role and policies. This role follows the principle of least privilege, granting Atlan access only to the resources required for cataloging and governance.
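
To make the role-based connection more concrete, the following sketch builds the kind of trust policy that a cross-account IAM role commonly uses. It is an illustration only: the actual CloudFormation template defines the real policy, and the use of an external ID, the function name `build_trust_policy`, and the example account ID are all assumptions.

```python
import json


def build_trust_policy(partner_account_id, external_id):
    """Build an IAM trust policy document for a cross-account role.

    Both arguments are hypothetical placeholders. The external ID condition
    is a common pattern for third-party access, assumed here for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only principals in the partner's account may assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{partner_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must also present the agreed external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    print(json.dumps(build_trust_policy("111122223333", "example-external-id"), indent=2))
```

Least privilege is then enforced by the permissions policy attached to the role, which should list only the catalog actions that the connector needs.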

After you configure the connection, the Atlan connector calls Amazon SageMaker Unified Studio APIs to ingest assets and metadata. The connector transforms ingested assets into Atlan's metadata model, making them discoverable and governable inside Atlan. When users update metadata in Atlan, the real-time reverse sync pipeline detects changes and pushes updates back to Amazon SageMaker Catalog.

You set up this integration by configuring a connection to Amazon SageMaker Unified Studio from within Atlan.

# Collibra integration


The integration between Amazon SageMaker Catalog and Collibra provides bidirectional metadata synchronization and access governance across both platforms. Collibra is a data intelligence platform that helps organizations centralize governance workflows, define business glossaries, and enforce policies across data assets. This integration is available as an open-source solution on [GitHub](https://github.com/aws-samples/amazon-datazone-examples/tree/main/blogs/unifying_metadata_governance_across_amazon_sagemaker_catalog_and_collibra), co-developed by AWS and Collibra. For detailed setup instructions, see [Unifying metadata governance across Amazon SageMaker and Collibra](https://aws.amazon.com/blogs/big-data/unifying-metadata-governance-across-amazon-sagemaker-and-collibra/).

## Capabilities


### Metadata synchronization


The Collibra integration synchronizes the following metadata between Amazon SageMaker Catalog and Collibra:
+ Bidirectional synchronization of glossary terms and descriptions.
+ Preservation of glossary structure, including parent-child relationships.
+ Association of terms with data assets such as datasets, tables, and columns.
+ Synchronization of classifications, data categories, and tags.
+ Alignment of technical descriptions for datasets and columns.

Core metadata elements synchronize every 5 minutes. Subscription requests that originate in Amazon SageMaker Catalog synchronize to Collibra instantly.
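
One way to picture a periodic sync cycle is as a comparison between the two catalogs that yields the terms to create and the terms to update. The sketch below shows that comparison for one direction only. It is a simplified assumption about how such a cycle could work, not the open-source solution's code, and the name `plan_term_sync` and the name-to-description mapping are hypothetical.

```python
def plan_term_sync(source, target):
    """Compare two glossaries and return (creates, updates) for the target.

    `source` and `target` map a glossary term name to its description.
    This shape is assumed purely for illustration.
    """
    # Terms that exist in the source but not yet in the target.
    creates = {name: desc for name, desc in source.items() if name not in target}
    # Terms present on both sides whose descriptions have drifted apart.
    updates = {
        name: desc
        for name, desc in source.items()
        if name in target and target[name] != desc
    }
    return creates, updates


if __name__ == "__main__":
    source = {"Churn": "Customers lost per period", "ARR": "Annual recurring revenue"}
    target = {"Churn": "Lost customers"}
    creates, updates = plan_term_sync(source, target)
    print("create:", creates)
    print("update:", updates)
```

A bidirectional sync would run a comparison like this in both directions and needs a rule for resolving conflicting edits, which this sketch leaves out.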

### Access request workflows


The Collibra integration extends Collibra's access governance workflows to assets cataloged in Amazon SageMaker Catalog. Users can discover and request access to datasets from within Collibra or Amazon SageMaker Unified Studio using familiar approval processes.

Key capabilities of the access request workflow include:
+ Access request initiation from either Collibra or Amazon SageMaker Unified Studio.
+ Centralized review and approval managed within Collibra by designated business stewards.
+ Automatic access provisioning through the Amazon SageMaker Catalog grant mechanism.
+ Status tracking of subscription requests across both platforms.
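
The workflow above can be read as a small state machine: a request is opened, a steward decides on it, and an approved request results in provisioned access. The following sketch models that flow so that the same status can be tracked on both platforms. The status names and the `advance` helper are hypothetical and do not reflect the exact values used by either product.

```python
# Hypothetical statuses and the transitions allowed between them.
ALLOWED_TRANSITIONS = {
    "PENDING": {"APPROVED", "REJECTED"},
    "APPROVED": {"GRANTED"},
    "REJECTED": set(),
    "GRANTED": set(),
}


def advance(current, new):
    """Return `new` if moving from `current` to `new` is allowed, else raise."""
    if current not in ALLOWED_TRANSITIONS:
        raise ValueError(f"unknown status: {current!r}")
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current!r} to {new!r}")
    return new


if __name__ == "__main__":
    status = "PENDING"
    status = advance(status, "APPROVED")  # steward approves the request
    status = advance(status, "GRANTED")   # access is provisioned
    print(status)
```

Rejecting invalid transitions in one place helps prevent the two platforms from showing contradictory statuses for the same request.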

## How it works


The integration uses the APIs of both Amazon SageMaker and Collibra Data Governance Center. You deploy an AWS CloudFormation template that provisions the required AWS resources, including IAM roles and AWS Lambda functions. On the Collibra side, you configure operating model changes, import workflows, and assign business stewards to assets.

The solution is available as an open-source project on [GitHub](https://github.com/aws-samples/amazon-datazone-examples/tree/main/blogs/unifying_metadata_governance_across_amazon_sagemaker_catalog_and_collibra).

# Alation integration


The integration between Amazon SageMaker Catalog and Alation brings catalog metadata from Amazon SageMaker Catalog into Alation. Alation is a data intelligence platform that helps organizations make data discoverable, governed, and actionable. This integration creates a unified metadata experience where technical teams working in Amazon SageMaker Unified Studio and business teams working in Alation collaborate on top of the same metadata. For detailed setup instructions, see [Build a trusted foundation for data and AI using Alation and Amazon SageMaker Unified Studio](https://aws.amazon.com/blogs/big-data/build-a-trusted-foundation-for-data-and-ai-using-alation-and-amazon-sagemaker-unified-studio/).

## Capabilities


The current phase of the Alation integration extracts metadata from Amazon SageMaker Catalog into Alation. The integration synchronizes the following metadata:
+ Domains, projects, and asset names.
+ Descriptions, owners, and glossary terms.
+ Custom metadata fields (metadata forms).
+ Provenance metadata, including the originating service, the actor who made the change, and the timestamp.

You can run metadata extractions on demand or schedule them to run automatically. The system performs an initial bulk extraction and then keeps data current through incremental updates.
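
The bulk-then-incremental pattern is often implemented with a watermark: the first run has no watermark and extracts everything, and later runs extract only records changed since the last run. The sketch below illustrates the idea under that assumption. The connector's actual mechanism is not documented here, and the record shape and the name `extract_incremental` are hypothetical.

```python
from datetime import datetime, timezone


def extract_incremental(records, watermark):
    """Return (changed_records, new_watermark).

    `records` is a list of dicts with an `updated_at` datetime (assumed shape).
    A watermark of None means "initial bulk extraction": take everything.
    """
    changed = [
        record
        for record in records
        if watermark is None or record["updated_at"] > watermark
    ]
    # Advance the watermark to the newest change seen; keep it if nothing changed.
    new_watermark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_watermark


if __name__ == "__main__":
    records = [
        {"name": "orders", "updated_at": datetime(2025, 1, 1, tzinfo=timezone.utc)},
        {"name": "customers", "updated_at": datetime(2025, 1, 5, tzinfo=timezone.utc)},
    ]
    bulk, mark = extract_incremental(records, None)   # initial bulk run
    delta, mark = extract_incremental(records, mark)  # nothing changed since
    print(len(bulk), len(delta), mark.isoformat())
```

Storing the watermark between runs is what lets a scheduled extraction stay cheap after the first full pass.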

## How it works


The integration authenticates through AWS Identity and Access Management (IAM). You can use either an IAM role (recommended) or an IAM user with access keys. The connector uses scoped IAM permissions that follow least-privilege principles. Communication uses encrypted APIs, and only metadata is synchronized. Your data files and artifacts remain in their original AWS locations.

You set up this integration by installing the SageMaker enhanced connector in Alation and configuring a data source connection.