Overview

This Guidance demonstrates how to use Amazon SageMaker Unified Studio to create a unified development experience for building, deploying, executing, and monitoring end-to-end workflows across AWS data, analytics, and AI/ML services. By showcasing the capabilities of SageMaker Unified Studio, the Guidance helps you streamline your data operations, from ingestion to product deployment. It also illustrates how this integrated approach can enhance efficiency, reduce complexity, and provide comprehensive control over diverse AWS services, ultimately simplifying the management of a complex data workflow.

How it works

Overview

This architecture diagram shows how Amazon SageMaker provides a unified, collaborative experience for ML and data engineers, data stewards, and generative AI developers to accelerate data applications, from exploration to production.

Download the architecture diagram Overview

Step 1

A Data Scientist creates Project A – DEV in Amazon SageMaker Unified Studio. This action publishes a CreateProject event captured by AWS CloudTrail and bridged to a custom Amazon EventBridge bus in the AI Shared Services account. Using the specified Git connection, a Git repository is created.

Step 2

The CreateProject event is delivered to the Shared Services Amazon EventBridge bus, which serves as the integration hub for automation workflows across accounts.
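
The rule that forwards this event can be sketched as an EventBridge event pattern. This is a minimal sketch, not the Guidance's actual rule: the `source` and `eventSource` values are assumptions for SageMaker Unified Studio projects, and `matches` is a hypothetical local helper that imitates only exact-value matching so the pattern can be checked without AWS access.

```python
from typing import Any

# Assumption: CloudTrail delivers management API calls to EventBridge with the
# detail-type "AWS API Call via CloudTrail". The source and eventSource values
# are assumed for SageMaker Unified Studio projects; verify them against a
# real event from your account.
CREATE_PROJECT_PATTERN: dict[str, Any] = {
    "source": ["aws.datazone"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["datazone.amazonaws.com"],
        "eventName": ["CreateProject"],
    },
}


def matches(pattern: dict[str, Any], event: dict[str, Any]) -> bool:
    """Local approximation of EventBridge matching (literal values only).

    EventBridge supports richer operators such as prefix and anything-but;
    this helper handles only lists of literal values and nested objects.
    """
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Illustrative event, trimmed to the fields the pattern inspects.
sample_event = {
    "source": "aws.datazone",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "datazone.amazonaws.com",
        "eventName": "CreateProject",
    },
}

print(matches(CREATE_PROJECT_PATTERN, sample_event))
```

Keeping the pattern as plain data makes it easy to reuse the same dictionary when the rule is created through infrastructure code.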

Step 3a

Amazon EventBridge triggers the Project Git Setup AWS Lambda function, which creates the dedicated project Git repository to host build and deployment assets.
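
As an illustration of what such a function might do first, the sketch below derives a repository name from the project name carried in the event. Everything here is hypothetical: the `smus-` prefix, the `handler` signature's use of `requestParameters.name`, and the naming rule are assumptions, and the call to the Git provider is intentionally left out.

```python
import re


def repo_name_for_project(project_name: str) -> str:
    """Turn a project display name into a lowercase, hyphenated repo name."""
    slug = re.sub(r"[^a-z0-9]+", "-", project_name.lower()).strip("-")
    if not slug:
        raise ValueError("project name has no usable characters")
    # The "smus-" prefix is an illustrative convention, not part of the Guidance.
    return f"smus-{slug}"


def handler(event: dict, context: object = None) -> dict:
    """Hypothetical Lambda entry point for the Project Git Setup function.

    Assumes the project name is available at detail.requestParameters.name;
    CloudTrail may redact or rename this field in practice.
    """
    project_name = event["detail"]["requestParameters"]["name"]
    repository = repo_name_for_project(project_name)
    # A real function would call the Git provider's API here to create
    # the repository; that call is omitted from this sketch.
    return {"projectName": project_name, "repository": repository}


result = handler({"detail": {"requestParameters": {"name": "Project A – DEV"}}})
print(result["repository"])
```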

Step 3b

An AWS Step Functions project Git setup workflow provisions the corresponding use case templates (such as Classical Regression, LLM fine-tuning, and RAG). These templates populate the repository's build and deploy folders with standardized pipeline code. Build and deploy assets can also be kept in separate repositories, with the mappings defined in cicd-config.json.
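
The schema of cicd-config.json is not described here, so the following is only a guess at how such a mapping could be read. The `projects`, `build`, `deploy`, `repository`, and `path` keys, the repository names, and the `resolve` helper are all hypothetical. The sketch shows both layouts mentioned above: separate repositories, and one repository with build and deploy folders.

```python
import json

# Hypothetical cicd-config.json content; the real schema may differ.
CICD_CONFIG_TEXT = """
{
  "projects": {
    "project-a": {
      "build": {"repository": "org/project-a-build", "path": "."},
      "deploy": {"repository": "org/project-a-deploy", "path": "."}
    },
    "project-b": {
      "build": {"repository": "org/project-b", "path": "build"},
      "deploy": {"repository": "org/project-b", "path": "deploy"}
    }
  }
}
"""


def resolve(config: dict, project: str, stage: str) -> tuple[str, str]:
    """Return (repository, folder) holding the build or deploy assets."""
    if stage not in ("build", "deploy"):
        raise ValueError(f"unknown stage: {stage}")
    entry = config["projects"][project][stage]
    return entry["repository"], entry["path"]


config = json.loads(CICD_CONFIG_TEXT)
print(resolve(config, "project-a", "build"))
print(resolve(config, "project-b", "deploy"))
```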

Step 4

The Amazon SageMaker Catalog in the Governance Account serves as a centralized data registry, enabling secure discovery of, and access to, approved data assets across accounts. It provides enterprise-wide management of data and model assets, with integrated controls for data access and sharing across environments. The Data Scientist subscribes to the required datasets (for example, AWS Glue tables) through the catalog. Once the data producer approves the request, the assets become available in the project catalog for use in experiments. The Amazon SageMaker Catalog integrates with AWS Glue tables for structured data and supports Amazon Simple Storage Service (Amazon S3) object collections for unstructured data.

Step 5

The Data Scientist customizes the project's build pipeline code, for example by adding MLflow experiment tracking, and commits the changes to the Git repository.

Step 6

A CI/CD pipeline is automatically triggered based on path filter rules when changes are detected in the repository's build folder.
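
The effect of a path filter can be sketched in a few lines. This is an approximation under stated assumptions: the `build/**` rule is an assumed filter, and Python's `fnmatchcase` is used as a stand-in for the CI system's own glob matching, which may differ in detail.

```python
from fnmatch import fnmatchcase

# Assumed filter rule: any file under the repository's build folder.
BUILD_PATH_FILTERS = ["build/**"]


def triggers_build(changed_files: list[str],
                   filters: list[str] = BUILD_PATH_FILTERS) -> bool:
    """Return True if at least one changed file matches a path filter.

    fnmatchcase lets "*" match "/" as well, so "build/**" covers files at
    any depth below build/. Real CI glob rules can behave differently.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in filters
    )


print(triggers_build(["build/pipeline.py", "README.md"]))
print(triggers_build(["deploy/config.json"]))
```

A commit that touches only the deploy folder or documentation therefore leaves the build pipeline idle.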

Step 7

The CI/CD pipeline runs the Amazon SageMaker AI pipeline in the Project A DEV account to build, train, and evaluate the model. Metrics are tracked in MLflow, and upon successful evaluation, the model is automatically registered in the Amazon SageMaker Model Registry within DEV.

Step 8

The Data Scientist approves the model in the Dev Model Registry (stage = Dev, status = Approved). This approval emits an Amazon EventBridge event in the Shared Services account, where an AWS Lambda function copies the model artifacts and metadata into the Central Model Registry for broader visibility and governance.
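
A sketch of the filtering such a Lambda function might apply is shown below. It checks the SageMaker model package state change event for an approved status and prepares part of a registration request for the central registry. Recording the stage as a `CustomerMetadataProperties` entry, the `central_registration_request` helper, and the group name are assumptions; the actual copy of artifacts and the remaining `create_model_package` arguments are omitted.

```python
from typing import Optional


def central_registration_request(event: dict, central_group: str) -> Optional[dict]:
    """Build partial create_model_package arguments for an approved model.

    Returns None for events that are not an approved model package change.
    Storing the stage as metadata is an assumption of this sketch.
    """
    if event.get("source") != "aws.sagemaker":
        return None
    if event.get("detail-type") != "SageMaker Model Package State Change":
        return None
    detail = event.get("detail", {})
    if detail.get("ModelApprovalStatus") != "Approved":
        return None
    return {
        "ModelPackageGroupName": central_group,
        "CustomerMetadataProperties": {
            "stage": "Dev",
            "sourceModelPackageArn": detail["ModelPackageArn"],
        },
    }


# Illustrative event with placeholder account ID and names.
approval_event = {
    "source": "aws.sagemaker",
    "detail-type": "SageMaker Model Package State Change",
    "detail": {
        "ModelApprovalStatus": "Approved",
        "ModelPackageArn": "arn:aws:sagemaker:us-east-1:111122223333:model-package/project-a/1",
    },
}

print(central_registration_request(approval_event, "central-project-a"))
```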

Step 9

An AI Engineer validates and defines the deployment parameters (instance type, scaling, and endpoint configuration) by updating the project's deploy repo.
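
Deployment parameters of this kind can be checked before they are committed. The sketch below is illustrative only: the `DeployParams` fields and the validation rules are hypothetical and do not reflect the actual file format in the deploy repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployParams:
    """Hypothetical deployment parameters for one endpoint."""
    endpoint_name: str
    instance_type: str
    min_instances: int
    max_instances: int

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the values look sane."""
        problems = []
        if not self.endpoint_name:
            problems.append("endpoint_name must not be empty")
        if not self.instance_type.startswith("ml."):
            problems.append("instance_type should look like ml.m5.large")
        if self.min_instances < 1:
            problems.append("min_instances must be at least 1")
        if self.max_instances < self.min_instances:
            problems.append("max_instances must be >= min_instances")
        return problems


params = DeployParams("project-a-test", "ml.m5.large", min_instances=1, max_instances=4)
print(params.validate())
```

Returning a list of problems rather than raising on the first one lets a CI job report every issue in a single run.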

Step 10

The AI Engineer promotes the model in the Central Model Registry to stage = Test with status = Approved, preparing it for test deployment.

Step 11

This Test approval triggers the CI/CD pipeline for test deployment, initiating the automated build and deployment process.

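Step 12

The CI/CD pipeline deploys the model endpoint and, optionally, the Amazon SageMaker AI pipeline into the Project A TEST account. Integration tests are performed, and the results are recorded back into the Central Model Registry.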

Step 13

The Governance Officer reviews the integration test results. If the model meets the compliance and performance requirements, it is approved in the Central Registry (stage = Prod, status = Approved) for production deployment.

Step 14

The production approval triggers the Prod CI/CD pipeline, which builds the required artifacts and deploys the model into the Project A PROD account.
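
The stage and status gates in the steps above can be summarized as a small routing function. This is a sketch: the pipeline names are hypothetical labels, and only the (stage, status) combinations described in this walkthrough are modeled.

```python
from typing import Optional

# Hypothetical pipeline labels keyed by the registry stage that starts them.
PIPELINE_FOR_STAGE = {
    "Test": "test-deploy",
    "Prod": "prod-deploy",
}


def pipeline_to_trigger(stage: str, status: str) -> Optional[str]:
    """Return the deployment pipeline to start, or None if nothing should run.

    A Dev approval copies the model into the Central Model Registry and does
    not start a deployment pipeline, so "Dev" has no entry in the mapping.
    """
    if status != "Approved":
        return None
    return PIPELINE_FOR_STAGE.get(stage)


print(pipeline_to_trigger("Test", "Approved"))
print(pipeline_to_trigger("Prod", "Approved"))
```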

Step 15

The model endpoint and, optionally, the Amazon SageMaker AI pipeline are deployed in the production environment, with A/B testing capability provided through Amazon SageMaker AI endpoints. Model performance is monitored, and results are logged into the Central Amazon SageMaker AI Model Registry, enabling continuous monitoring and governance through Amazon SageMaker AI's monitoring capabilities.
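
A/B testing on a SageMaker AI endpoint is commonly expressed as two production variants with different traffic weights. The sketch below only builds the `ProductionVariants` list that an endpoint configuration accepts; the model names, variant names, instance type, and the `ab_variants` helper are placeholders, and no AWS call is made.

```python
def ab_variants(model_a: str, model_b: str, share_b: float,
                instance_type: str = "ml.m5.large") -> list[dict]:
    """Build two production variants that split traffic by weight.

    share_b is the fraction of traffic sent to the challenger model.
    """
    if not 0.0 < share_b < 1.0:
        raise ValueError("share_b must be between 0 and 1 (exclusive)")

    def variant(name: str, model: str, weight: float) -> dict:
        return {
            "VariantName": name,
            "ModelName": model,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": weight,
        }

    return [
        variant("VariantA", model_a, 1.0 - share_b),
        variant("VariantB", model_b, share_b),
    ]


variants = ab_variants("forecast-v1", "forecast-v2", share_b=0.25)
for v in variants:
    print(v["VariantName"], v["InitialVariantWeight"])
```

The resulting list could be passed as the `ProductionVariants` argument when an endpoint configuration is created.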

Step 16

OIDC JWT provides secure, short-lived credentials for GitHub Actions workflows without storing secrets. Cross-account access is managed through AWS Identity and Access Management (AWS IAM) roles with least-privilege permissions. The Amazon SageMaker Catalog implements fine-grained access controls to ensure data scientists can only access authorized datasets. Amazon EventBridge rules in each account are configured to forward specific events to the central Project Event Bus using resource-based policies, enabling the automated workflow across organizational boundaries.
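
The OIDC setup for GitHub Actions typically rests on an IAM role trust policy like the one sketched below. The account ID and repository name are placeholders, `github_oidc_trust_policy` is a hypothetical helper, and the policy should be read as a starting point under those assumptions rather than the Guidance's exact configuration.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id: str, repo: str,
                             ref_pattern: str = "*") -> dict:
    """Trust policy letting workflows from one repository assume a role.

    repo is "owner/name"; ref_pattern narrows the allowed subject claim,
    for example "ref:refs/heads/main".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com"},
                    "StringLike": {f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:{ref_pattern}"},
                },
            }
        ],
    }


policy = github_oidc_trust_policy("111122223333", "example-org/project-a-deploy")
print(json.dumps(policy, indent=2))
```

Restricting the subject claim to a single repository, and ideally a single branch, keeps the role aligned with the least-privilege goal described above.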

Step 17

The architecture implements continuous monitoring and governance across the multi-account setup through integrated services. Model performance is monitored through Amazon SageMaker endpoints, with results continuously logged into the Central Model Registry. The solution uses Amazon CloudWatch, provisioned through an AWS CloudFormation stack, for logging across the Dev, Test, and Prod environments, enabling end-to-end visibility of model performance throughout the automated deployment pipeline.

Generative AI Lakehouse

This architecture diagram shows how Amazon SageMaker Unified Studio enables a collaborative data engineering and analytics experience for sales forecasting using a Lakehouse architecture, web-based studio with generative AI, and orchestration tools in a unified portal.

Download the architecture diagram Generative AI Lakehouse

Step 4

The Data Scientist subscribes to the required datasets (for example, AWS Glue tables) through Amazon SageMaker Catalog. Once the data producer approves the request, the assets become available in the project catalog for use in experiments.

Step 5

The Data Scientist customizes the project's build pipeline code, for example by adding MLflow experiment tracking, and commits the changes to the Git repository.

Step 6

A CI/CD pipeline is automatically triggered based on path filter rules when changes are detected in the repository's build folder.

Step 8

The Data Scientist approves the model in the Dev Model Registry (stage = Dev, status = Approved). This approval emits an Amazon EventBridge event in the Shared Services account, where an AWS Lambda function copies the model artifacts and metadata into the Central Model Registry for broader visibility and governance.

Collaborative Model Deployment

This architecture diagram shows how Amazon SageMaker empowers ML engineers to collaboratively develop, evaluate, and deploy sales forecasting models using Amazon SageMaker, SageMaker JumpStart, and SageMaker Workflows within a unified portal.

Download the architecture diagram Collaborative Model Deployment

Step 4

The Amazon SageMaker Catalog in the Governance Account connects to both the LoB Test Account and LoB Prod Account. This integration enables access to required datasets through the Project Catalog in both test and production Unified Studio Project A environments.

Step 12

The CI/CD pipeline deploys the model endpoint and, optionally, the Amazon SageMaker AI pipeline into the Project A TEST account. Integration tests are performed, and the results are recorded back into the Central Model Registry. The Amazon SageMaker AI pipeline in the TEST environment executes pipeline experiments, model evaluation, and model registration, using the Amazon SageMaker AI inference endpoint.

Step 15

The model endpoint and, optionally, the SageMaker pipeline are deployed in the production environment with A/B testing capability. Model performance is monitored, and results are logged into the Central Model Registry, enabling continuous monitoring and governance. The Amazon SageMaker AI pipeline in the PROD environment manages pipeline experiments, model evaluation, and model registration, interacting with the Amazon SageMaker AI inference endpoint.

ML Model Development and Registration Pipeline (Part A)

This architecture diagram illustrates the Dev pipeline for a multi-account AIOps framework.

Download the architecture diagram ML Model Development and Registration Pipeline (Part A)

Step 1

The Administrator configures the Amazon SageMaker Unified Studio environment by setting up domains, AWS infrastructure, authentication, GitHub connections, and project template repositories. This provides the foundation for consistent, governed ML project creation.

Step 2

The Use Case Template Repository is prepared with standard templates that define configurations for model build and deployment, ensuring every new project begins with an approved baseline.

Step 3

When a Data Scientist creates a new project in Amazon SageMaker Unified Studio, a Create Project event is emitted. This event is captured by Amazon EventBridge and triggers an AWS Lambda function to automate setup.

Step 4

The AWS Step Functions Project Setup workflow provisions dedicated repositories for model build and model deploy. These repositories are configured and prepopulated with seed code and CI/CD workflows, including GitHub Actions secrets.

Step 5

The project repositories are linked to the Shared Services CI/CD system, enabling automated build and deployment pipelines to be centrally managed and consistently applied.

Step 6

During development, the Amazon SageMaker AI Pipeline runs within the project, orchestrating steps for data preprocessing, feature engineering, training, evaluation, and registration. Outputs are stored in the project's Amazon S3 bucket.

Step 7

All experiments are tracked using Amazon SageMaker AI MLflow integration, which logs metrics, artifacts, and experiment details for full traceability.

Step 8

When training and evaluation succeed, the model is registered in the Amazon SageMaker AI Model Registry, awaiting further validation and approval.

Step 9

The registered model undergoes review. Once approved, it is marked ready for deployment, ensuring only validated models progress to the next phase.

Step 10

An Amazon EventBridge event is emitted upon approval, which invokes a Deploy Lambda function to initiate the deployment process.

Step 11

The Project Model Deploy repository runs its GitHub Actions workflow to fetch the approved model, validate configurations, and provision or update the Amazon SageMaker AI Endpoint.

Step 12

The model is deployed as an Amazon SageMaker AI Endpoint, completing the automated journey from development to production. The endpoint is live and ready to serve inference requests.

ML Model Deployment Pipeline (Part B and C)

This architecture diagram illustrates the Test and Prod pipelines for a multi-account AIOps framework.

Download the architecture diagram ML Model Deployment Pipeline (Part B and C)

Step 5

The project repositories are linked to the Shared Services CI/CD system, which manages the build and deployment pipelines through GitHub Actions.

Step 6

During development, the Amazon SageMaker AI Pipeline runs within the project, executing a series of connected steps: Data Pre-processing, Model Training, Model Evaluation, and Model Registration. These steps are initiated from the Studio Notebook and integrate with the project's Amazon S3 bucket.

Step 7

All experiments are tracked using Amazon SageMaker AI MLflow integration through the MLflow Tracking Server, which logs metrics, artifacts, and experiment details for full traceability. This connects directly with the Amazon SageMaker AI Pipeline workflow.

Step 8

When training and evaluation succeed, the model is registered in the Amazon SageMaker AI Model Registry through the Model Registration step in the pipeline, awaiting further validation and approval.

Step 9

The registered model undergoes review in the Amazon SageMaker AI Model Registry. Once approved, it is marked ready for deployment, ensuring only validated models progress to the next phase.

Step 10

An Amazon EventBridge event is emitted upon approval, which invokes a Deploy AWS Lambda function to trigger the deployment process through the GitHub Actions workflow.

Step 11

The Project Model Deploy repository runs its GitHub Actions workflow, which follows these steps: Git checkout, AWS credentials setup (OIDC JWT), prerequisite preparation, and Amazon SageMaker Unified Studio (SMUS) pipeline updates to handle the approved model.

Step 12

The model is deployed as an Amazon SageMaker AI Endpoint through the final step of the deployment pipeline, completing the automated journey from development to production. The endpoint becomes live and ready to serve inference requests.

Automated Project Provisioning (Technical Implementation)

This architecture demonstrates an automated AIOps workflow within Amazon SageMaker Unified Studio, orchestrating the AI lifecycle from project initiation to model deployment through integrated CI/CD pipelines.

Download the architecture diagram Automated Project Provisioning (Technical Implementation)

Step 2

The CreateProject event is delivered to the Shared Services Amazon EventBridge bus, which serves as the integration hub for automation workflows across accounts.

Step 3a

Amazon EventBridge triggers the Project Git Setup AWS Lambda function, which creates the dedicated project Git repository to host build and deployment assets.

Step 5

The Data Scientist customizes the project's build pipeline code, for example by adding MLflow experiment tracking, and commits the changes to the Git repository.

Step 6

A CI/CD pipeline is automatically triggered based on path filter rules when changes are detected in the repository's build folder.

Step 9

An AI Engineer validates and defines the deployment parameters (instance type, scaling, and endpoint configuration) by updating the project's deploy repo.

Step 10

The AI Engineer promotes the model in the Central Model Registry to stage = Test with status = Approved, preparing it for test deployment.

Step 11

This Test approval triggers the CI/CD pipeline for test deployment, initiating the automated build and deployment process.

Step 14

The production approval triggers the Prod CI/CD pipeline, which builds the required artifacts and deploys the model into the Project A PROD account.

Model Approval and Deployment Pipeline (Technical Implementation)

Download the architecture diagram Model Approval and Deployment Pipeline (Technical Implementation)

ML Pipeline Execution and Model Deployment (Technical Implementation)

Download the architecture diagram ML Pipeline Execution and Model Deployment (Technical Implementation)

Deploy with confidence

Everything you need to launch this Guidance in your account is right here.

Let's make it happen

Ready to deploy? Review the sample code on GitHub for detailed deployment instructions to deploy as-is or customize to fit your needs. Go to sample code:

SageMaker Unified Studio Project AI Ops with SageMaker Unified Studio Project

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

SageMaker Unified Studio integrates team collaboration, Git, analytics services, and AI/ML services to provide a unified data development experience. This creates a centralized operational control plane for collaborating on and executing end-to-end data ingestion, preparation, and deployment of data products. By enabling collaboration and offering a unified developer experience, SageMaker Unified Studio helps you design for operations, allowing full automation of data service integration and deployment.

Read the Operational Excellence whitepaper

Security

SageMaker Unified Studio delivers a single sign-on (SSO) experience through deployed web domains that can be federated to identity providers (IdPs) such as IAM Identity Center. You can implement access control policies for users and groups, so that projects, data, and models are accessible with least-privilege permissions. By using SageMaker Unified Studio domains with a federated IdP, you can create logical separation of control, defining permission guardrails for your organization. This enables lifecycle-based access management through continuous monitoring and fine-tuning of access controls.

Read the Security whitepaper

Reliability

SageMaker Unified Studio unifies data ingestion, storage, and analytics services, including Amazon S3 and Amazon Redshift, to establish a reliable control plane for your data operations. You can use these underlying services and tools to create fault tolerance at the service level through a unified web experience. The SageMaker Unified Studio interface simplifies the orchestration of data and analytics services, allowing easier monitoring and control of data workloads. This reduces the complexity of coordinating and governing individual services, making it more straightforward to detect failures and recover within a single web interface.

Read the Reliability whitepaper

Performance Efficiency

Amazon Q Developer uses generative AI to provide code recommendations, reducing the complexity and effort of development. SageMaker offers access to pre-trained models and simplifies the process of training, validating, and deploying models for your specific use cases. By using these tools, you can accelerate development and implement code recommendations and model deployment without having to manage complex underlying AI/ML technologies.

Read the Performance Efficiency whitepaper

Cost Optimization

SageMaker Unified Studio assists in selecting the right resources for your data workloads by unifying the end-to-end development process. It enables quick deployment and decommissioning of data and analytics services, helping control the costs associated with data product development. By reducing the complexity of development and deployment, SageMaker Unified Studio helps you manage services more effectively. This leads to reduced data transfer costs, improved workload performance analysis, and dynamic resource allocation.

Read the Cost Optimization whitepaper

Sustainability

The managed services underlying SageMaker Unified Studio offer on-demand scaling in addition to data access and lifecycle control. This easier access and control of your data facilitates continuous monitoring of usage, helping reduce the impact of data operations and create more efficient workloads. As a result, you can better predict and control usage, scaling demand without overprovisioning resources for future needs.

Read the Sustainability whitepaper

Read usage guidelines