

# Accelerate the Deployment of Secure and Compliant Modern Data Architectures for Advanced Analytics and AI
<a name="solution-overview"></a>

Publication date: *March 2026 (Version 1.5.0). For updates, refer to the [CHANGELOG.md](https://github.com/aws/modern-data-architecture-accelerator/blob/main/CHANGELOG.md) file in the MDAA GitHub repository.*

The Modern Data Architecture Accelerator (MDAA) on AWS helps customers rapidly deploy and manage sophisticated data platform architectures on AWS. This solution provides a flexible framework that can adapt to most common analytics platform architectures, including basic Data Lakes and Data Warehouses, Lake House architectures, complex Data Mesh implementations, and generative AI development environments. The solution helps you establish a modern data foundation with built-in security, governance, and operational capabilities. Through a simplified configuration approach, you can:
+ Deploy generative AI development environments with integrated AWS services
+ Configure and manage AI/ML workloads including generative AI solutions
+ Deploy data environments across multiple domains and AWS accounts
+ Implement AWS reference architectures like Modern Data Architecture (Lake House)
+ Configure and manage data mesh nodes for distributed data platforms
+ Manage data governance controls and security services
+ Define and deploy purpose-built analytics services for specific use cases
+ Deploy machine learning workload environments
+ Customize deployments through infrastructure as code using the AWS CDK
+ Configure data ingestion patterns, storage layers, and processing capabilities

MDAA is provided as an open-source solution that can be deployed from locally cloned source code or from published NPM packages. You pay only for the AWS services you enable to set up your data platform and operate your workloads.

 **Key Benefits** 
+ Accelerated Time to Value: Deploy a production-ready modern data platform in weeks instead of months
+ Built-in Data Governance: Implement data security, privacy, and compliance controls from day one
+ Standardized Architecture: Ensure consistent data handling patterns and practices across the organization
+ Analytics-Ready Infrastructure: Pre-configured analytics services and data processing pipelines, including integrated components for developing and deploying generative AI solutions
+ Cost Optimization: Built-in cost management and data lifecycle optimization features

The intended audience for this solution is data platform engineers, data architects, analytics teams, and cloud operations professionals.

This implementation guide describes architectural considerations and configuration steps for deploying MDAA. It includes links to AWS CloudFormation templates, synthesized from the AWS CDK, that launch and configure the AWS services required to deploy this solution using AWS best practices for security and availability.

Use this navigation table to quickly find answers to these questions:


| If you want to . . . | Read . . . | 
| --- | --- | 
|  Know the cost for running this solution. The cost will depend on the modules and other custom features you want to deploy.  |   [Cost](cost.md)   | 
|  Know which AWS Regions support this solution.  |   [Supported AWS Regions](plan-your-deployment.md#regional-deployments)   | 
|  Access the source code.  |   [GitHub repository](https://github.com/aws/modern-data-architecture-accelerator)   | 
|  Try a hands-on workshop to get started with MDAA.  |   [MDAA Workshop](https://catalog.us-east-1.prod.workshops.aws/workshops/6e7289c7-5662-494d-8b56-b8706412c3a6/en-US)   | 

# Use cases
<a name="use-cases"></a>
+ Centralized Data Platforms - Deploy and manage centralized data lakes and warehouses from a single account
+ Distributed Data Mesh - Create autonomous data mesh nodes for individual business units while maintaining unified governance
+ Hub and Spoke Architecture - Implement hybrid models with centralized enterprise data assets and distributed business unit autonomy
+ Analytics and ML Platforms - Build platforms supporting analytics, data science, and AI/ML workloads
+ Generative AI Application Development - Deploy secure agentic applications using Bedrock AgentCore Runtime with minimal configuration
+ Custom Data Architectures - Adapt and extend the framework to implement custom data platform architectures

# Concepts and definitions
<a name="concepts-and-definitions"></a>

 **Analytics Data Lake** 

A centralized repository that allows you to store structured and unstructured data at any scale. It enables you to break down data silos and combine different types of analytics to gain insights and guide better business decisions.

 **Data Mesh** 

A decentralized socio-technical approach to data management and analytics that treats data as a product and applies domain-oriented, self-serve design to distribute data ownership and architecture.

 **Data Product** 

A reusable dataset with clear ownership, documentation, and service-level objectives that can be easily discovered and consumed by authorized users across the organization.

 **Federated Access Control** 

A security mechanism that enables centralized management of user identities and access permissions across multiple systems and domains while maintaining consistent security policies.

 **Data Governance** 

The overall management of data availability, usability, integrity, and security in an enterprise system. It includes policies, procedures, and standards that ensure data is managed consistently and used appropriately.

 **Data Quality** 

The measure of data’s condition and its fitness to serve its intended purpose in a given context. This includes accuracy, completeness, consistency, timeliness, and validity of the data.

 **Self-Service Analytics** 

A form of business intelligence where users can access and analyze data without requiring assistance from IT or data specialists, enabling faster decision-making and reducing bottlenecks.

 **Data Catalog** 

A centralized metadata repository that helps organizations discover, understand, and manage their data assets. It provides a searchable inventory of data assets across the data platform.

 **Data Pipeline** 

A series of automated steps that extract data from various sources, transform it according to business rules, and load it into target systems for analysis and reporting.
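To make the extract-transform-load sequence concrete, here is a deliberately minimal Python sketch (the data, rules, and in-memory "target" are hypothetical; real pipelines add scheduling, retries, and monitoring on top of these stages):

```python
# Minimal ETL sketch: each stage is a plain function, chained in order.
# All records, rules, and targets here are illustrative placeholders.
def extract(source: list[dict]) -> list[dict]:
    return [row for row in source if row]           # pull non-empty raw records

def transform(rows: list[dict]) -> list[dict]:
    return [{**r, "amount": round(r["amount"], 2)}  # apply a business rule
            for r in rows]

def load(rows: list[dict], target: list) -> list:
    target.extend(rows)                             # write to the target store
    return target

warehouse: list[dict] = []
raw = [{"id": 1, "amount": 10.555}, {}, {"id": 2, "amount": 3.0}]
load(transform(extract(raw)), warehouse)
```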

 **Data Domain** 

A logical grouping of related data assets and processes managed by a specific business unit or team, typically aligned with organisational functions or business capabilities.

**Note**  
For a general reference of AWS terms, see the [AWS Glossary](https://docs.aws.amazon.com/general/latest/gr/glos-chap.html).

# Solution Structure
<a name="solution-structure"></a>

MDAA is a comprehensive solution built using a modular approach. Think of it as a sophisticated building kit for creating secure and scalable data infrastructure on AWS. Just as a building needs a foundation, walls, and utilities, MDAA provides all the necessary components to build your data/AI platform.

## The Module Concept
<a name="the-module-concept"></a>

Think of modules as specialized building blocks. For example:
+ If you want to deploy raw and transformation buckets for your data lake, there’s a datalake module that creates encrypted S3 buckets with proper access controls, sets up fine-grained lifecycle policies for cost optimization, and configures bucket policies and cross-account access if needed
+ If you need to query data using Amazon Athena, there’s a module that sets up Athena workgroups with resource controls, configures query result locations, connects them with your data lake, and establishes the necessary IAM permissions for query execution
+ If you want to add AWS Lake Formation settings to your tables, there’s a module that configures Lake Formation permissions and security settings, sets up database- and table-level access controls, and more
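As a rough illustration of what a datalake module automates, the following Python sketch builds the kind of default-encryption and lifecycle payloads such a module would apply to a bucket. The dict shapes follow the S3 `PutBucketEncryption` and `PutBucketLifecycleConfiguration` APIs; the KMS key ARN, prefix, and transition days are hypothetical, and this is not MDAA's actual implementation:

```python
# Illustrative only: payload shapes an MDAA-style datalake module might
# generate. Key ARN, prefix, and day counts are hypothetical values.

def default_encryption(kms_key_arn: str) -> dict:
    """SSE-KMS default encryption, shaped like PutBucketEncryption input."""
    return {
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": kms_key_arn,
            },
            "BucketKeyEnabled": True,
        }]
    }

def lifecycle_rules(prefix: str, ia_days: int = 90, glacier_days: int = 365) -> dict:
    """Tiered lifecycle policy, shaped like PutBucketLifecycleConfiguration input."""
    return {
        "Rules": [{
            "ID": f"tier-{prefix.strip('/')}",
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Transitions": [
                {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                {"Days": glacier_days, "StorageClass": "GLACIER"},
            ],
        }]
    }

enc = default_encryption("arn:aws:kms:us-east-1:111122223333:key/example")
lc = lifecycle_rules("raw/")
```

In practice the module would apply such configurations through the AWS CDK rather than building raw API payloads by hand.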

## How Modules Work Together
<a name="how-modules-work-together"></a>

Consider this practical scenario: You want to build a secure data lake for financial data.
+ Start with the roles module to create the necessary IAM roles and policies
+ Add the datalake module to create encrypted storage
+ Add the Glue module to catalog your data
+ Implement the Lake Formation module for compliance controls
+ Configure the Athena module so analysts can query the data
+ Add audit modules for security monitoring
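The ordering above matters because later modules consume resources created by earlier ones. The dependency chain can be sketched as a small Python model (module names and their dependencies here are illustrative, not MDAA's actual configuration format):

```python
# Hypothetical dependency graph for the financial data lake scenario.
# Each module lists the modules whose outputs it consumes.
DEPENDENCIES = {
    "roles": [],
    "datalake": ["roles"],
    "glue-catalog": ["datalake"],
    "lakeformation": ["glue-catalog"],
    "athena": ["datalake", "lakeformation"],
    "audit": ["datalake"],
}

def deploy_order(deps: dict[str, list[str]]) -> list[str]:
    """Topologically sort modules so dependencies deploy first."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(mod: str) -> None:
        if mod in seen:
            return
        seen.add(mod)
        for dep in deps[mod]:
            visit(dep)      # deploy prerequisites before this module
        order.append(mod)

    for mod in deps:
        visit(mod)
    return order

print(deploy_order(DEPENDENCIES))
```

Running the sketch yields an order in which `roles` deploys first and `athena` only after both the datalake and Lake Formation modules are in place, mirroring the scenario above.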

## MDAA Starter Packages
<a name="mdaa-starter-packages"></a>

### Overview
<a name="overview"></a>

Modern Data Architecture Accelerator (MDAA) provides a comprehensive set of pre-configured starter packages, each designed to accelerate your journey in building enterprise-grade secure and compliant data platforms on AWS. These packages eliminate the complexity of starting from scratch by providing production-ready configurations, security controls, and infrastructure templates.

### Available Starter Packages
<a name="available-starter-packages"></a>

#### 1. Basic Data Lake Package
<a name="1-basic-data-lake-package"></a>


| Purpose | Basic data lake foundation | 
| --- | --- | 
|  Key Features  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 
|  Best For  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 

#### 2. AI/ML Platform Package
<a name="2-aiml-platform-package"></a>


| Purpose | Enterprise-grade machine learning infrastructure | 
| --- | --- | 
|  Key Features  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 
|  Best For  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 

#### 3. GenAI Accelerator Starter Package
<a name="3-genai-accelerator-starter-package"></a>


| Purpose | Generative AI development and deployment platform | 
| --- | --- | 
|  Key Features  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 
|  Best For  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 

#### 4. Governed Lakehouse Package
<a name="4-governed-lakehouse-package"></a>


| Purpose | Enterprise lakehouse with comprehensive data governance using DataZone | 
| --- | --- | 
|  Key Features  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 
|  Best For  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/solutions/latest/modern-data-architecture-accelerator/solution-structure.html)  | 

### Package Benefits
<a name="package-benefits"></a>

#### Time to Market
<a name="time-to-market"></a>
+ Reduce implementation time by 60-70%
+ Avoid common architectural pitfalls
+ Start with proven configurations

#### Cost Optimization
<a name="cost-optimization"></a>
+ Pre-configured resource optimization
+ Built-in cost control measures
+ Efficient resource utilization patterns

#### Security & Compliance
<a name="security-compliance"></a>
+ Security controls aligned with AWS best practices
+ Built-in compliance frameworks
+ Automated security monitoring

#### Scalability
<a name="scalability"></a>
+ Designed for growth
+ Flexible architecture
+ Easy module addition/removal

### Best Practices
<a name="best-practices"></a>

#### Security
<a name="security"></a>
+ Enable all recommended security features
+ Implement proper encryption
+ Regular security assessments
+ Continuous monitoring

#### Operations
<a name="operations"></a>
+ Follow GitOps practices
+ Implement proper tagging
+ Regular backup testing
+ Disaster recovery planning

#### Cost Management
<a name="cost-management"></a>
+ Enable cost allocation tags
+ Set up budget alerts
+ Regular cost reviews
+ Resource optimization
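As an illustration of the budget-alert practice, the sketch below builds a request payload shaped like the AWS Budgets `CreateBudget` API input (the budget name, limit, and email address are hypothetical; an 80% threshold is an assumed example value):

```python
# Hypothetical values throughout; the dict shape mirrors the AWS Budgets
# CreateBudget request (Budget plus NotificationsWithSubscribers).
def monthly_cost_budget(name: str, limit_usd: str, email: str) -> dict:
    """Build a monthly cost budget that emails when actual spend passes 80%."""
    return {
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": limit_usd, "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [{
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,            # alert at 80% of the limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }],
    }

payload = monthly_cost_budget("mdaa-platform", "500", "team@example.com")
```

A payload like this would be passed to the Budgets `CreateBudget` operation (for example via an SDK call) alongside the target account ID.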

### Support and Maintenance
<a name="support-and-maintenance"></a>

#### Regular Updates
<a name="regular-updates"></a>
+ Security patches
+ Feature updates
+ Performance improvements
+ Best practice updates

**Note**  
All packages are regularly updated to incorporate the latest AWS features and security best practices.