Guidance for Industrial Data Fabric on AWS

Overview

This Guidance demonstrates how to overcome challenges in collecting and analyzing industrial data by implementing an Industrial Data Fabric framework that integrates edge devices with AWS cloud services. The framework enables secure data streaming from on-premises equipment to the cloud through AWS IoT SiteWise and partner solutions, while supporting local processing with AWS Outposts for manufacturing-specialized applications. Organizations can access analytics capabilities through Amazon SageMaker Unified Studio, where teams can collaborate on projects, catalog data with AWS Glue, and generate insights using services like Amazon Athena and Amazon EMR. You can accelerate your industrial digital transformation by connecting operational technology (OT) and information technology (IT) systems and by gaining actionable insights that improve efficiency and drive innovation across your manufacturing operations.

Benefits

Accelerate factory-to-cloud integration

Connect industrial equipment to AWS cloud services using purpose-built edge solutions and partner integrations. Reduce implementation time from months to weeks while maintaining secure data transmission from your manufacturing floor to analytics services.

Unify manufacturing data analysis

Consolidate data from disparate factory systems into a cohesive analytics environment with Amazon SageMaker Unified Studio. Enable cross-functional teams to collaborate on a single data copy while maintaining governance and security controls.

Transform operations with AI insights

Deploy machine learning models at the edge for real-time defect detection and process optimization using AWS IoT Greengrass. Enhance decision-making across manufacturing, engineering, and supply chain functions with integrated AI/ML capabilities.

How it works

Overview

This architecture diagram illustrates how to effectively support Smart Manufacturing use cases on AWS. It shows the key components and their interactions, providing an overview of the architecture's structure and functionality.

Step 1
Identify information related to industrial activities from on-premises equipment.
Step 2
Collect real-time data from edge devices and transmit data streams securely to AWS IoT SiteWise in the cloud. Partners such as Litmus, Domatica's EasyEdge, Siemens, and Belden can accelerate integration with AWS IoT SiteWise Edge.
Step 3
Connect edge devices through the Shop Floor Connectivity Framework using industrial protocols. Stream data securely to AWS Cloud services. Deploy cloud-developed machine learning models at the edge using AWS IoT Greengrass for defect detection and anomaly inference.
Step 4
Use Siemens Industrial Edge, a centrally managed solution, to connect data from assets and IT systems to AWS IoT SiteWise, AWS IoT Core, and Amazon Simple Storage Service (Amazon S3). Deploy ML models and industrial applications.
Step 5
Connect edge devices and IT systems using AWS partner solutions like HighByte. Contextualize and stream data to AWS storage and analytics services including AWS IoT SiteWise, Amazon S3, Amazon S3 Tables, Amazon Timestream for InfluxDB, Amazon Redshift, Amazon RDS, AWS IoT Core, Amazon Kinesis Data Streams, Amazon Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK).
Step 6
Connect on-premises applications to Amazon S3 through AWS Storage Gateway using NFS and SMB file shares.
Step 7
Extend AWS infrastructure and services to your premises with AWS Outposts, a fully managed service. Run manufacturing-specialized applications on AWS services locally at the plant and integrate with AWS cloud infrastructure.
Step 8
Ingest diverse data through AWS services. Stream real-time data via AWS IoT SiteWise, AWS IoT Core, Amazon Kinesis, and Amazon MSK. Transfer structured data using AWS Database Migration Service (AWS DMS) and AWS Glue. Process unstructured and semi-structured data with Amazon S3. Extract, create, and update ERP data with Amazon AppFlow.
Step 9
Access AWS Analytics and AI/ML services through Amazon SageMaker Unified Studio. Find and query data and AI assets across the organization. Collaborate on projects to build analytics and AI artifacts. Share data, models, and generative AI applications securely. Amazon SageMaker Unified Studio is integrated with Amazon Q Developer, which provides AI-powered assistance for code development, data analysis, and ML workflows.
Step 10
Process OT and IT data through AWS analytics services via the Amazon SageMaker Unified Studio portal. Catalog data with the AWS Glue Data Catalog, transform it with AWS Glue Visual ETL, and analyze it using Amazon Athena. Process streaming data with Amazon Managed Service for Apache Flink and generate insights using Amazon EMR.
Step 11
Unify data across Amazon S3 data lakes, including S3 Tables, and Amazon Redshift data warehouses with Amazon SageMaker Lakehouse. Build analytics and AI/ML applications on a single data copy using Apache Iceberg-compatible tools and engines.
Step 12
Organize assets, users, and projects within Amazon SageMaker Unified Studio domains. Create multiple domains to match enterprise structure. Collaborate in projects to manage data assets, analyze data, develop ML models, and build generative AI applications.
Step 13
Enrich technical catalog metadata with business context using Amazon SageMaker Catalog. Discover and access approved data and models through generative AI semantic search. Monitor data quality, track lineage, and enforce access policies in Amazon SageMaker Unified Studio.
Step 14
Build, train, and deploy machine learning models and generative AI capabilities using Amazon SageMaker AI and Amazon Bedrock. Leverage agentic AI to improve manufacturing, optimize the supply chain, and provide digital twin agents for engineering and design, all of which contribute to sustainability.
Step 15
Integrate with cloud-hosted ERP, Supply Chain, Maintenance, and WMS/TMS manufacturing solutions, including Model Context Protocol (MCP) servers for industrial knowledge. Exchange data with enterprise platforms like Snowflake and Databricks to enhance manufacturing analytics.
Step 16
Visualize data from Amazon Redshift, or from Amazon S3 via Amazon Athena, with Amazon Managed Grafana. Build dashboards using Amazon QuickSight and Amazon Athena.
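As an illustration of the streaming path in Steps 5 and 8, the following minimal Python sketch shows how one contextualized reading might be written to Amazon Kinesis Data Streams with the boto3 `put_record` call. The stream name, the site, asset, and tag values, and the JSON payload layout are hypothetical and are not part of this Guidance. boto3 is imported only when a record is actually sent, so the request builder can be checked without AWS credentials.

```python
import json
from datetime import datetime, timezone


def build_kinesis_record(stream_name, site, asset, tag, value, ts=None):
    """Build keyword arguments for the Kinesis Data Streams PutRecord call.

    The payload fields (site, asset, tag, value, timestamp) are a
    hypothetical layout chosen for this sketch.
    """
    ts = ts or datetime.now(timezone.utc)
    payload = {
        "site": site,
        "asset": asset,
        "tag": tag,
        "value": value,
        "timestamp": ts.isoformat(),
    }
    return {
        "StreamName": stream_name,
        "Data": json.dumps(payload).encode("utf-8"),
        # Records with the same partition key land on the same shard,
        # which keeps readings from one asset in order.
        "PartitionKey": f"{site}/{asset}",
    }


def send_record(record, client=None):
    """Send one record; requires boto3 and AWS credentials when client is None."""
    if client is None:
        import boto3  # imported lazily so the builder works without boto3
        client = boto3.client("kinesis")
    return client.put_record(**record)
```

Keying the partition on site and asset is one possible design choice. A different key, such as the production line, may suit workloads where ordering across assets matters more than ordering within one asset.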
Edge Services

This architecture diagram illustrates how to effectively collect data from the factory and send it to AWS services in the cloud.

Step 1
Identify information related to industrial activities from on-premises equipment.
Step 2
Deploy and run cloud-developed machine learning models at the edge through AWS IoT Greengrass for defect detection and anomaly inference.
Step 3
Collect real-time data from edge devices and transmit data streams securely to AWS IoT SiteWise in the cloud. Partners such as Litmus, Domatica's EasyEdge, Siemens Industrial Edge, and Belden CloudRail can accelerate integration with AWS IoT SiteWise Edge.
Step 4
Connect edge devices using industrial protocols through the AWS Shop Floor Connectivity Framework and partner solutions like HighByte. Contextualize data and stream it securely to AWS IoT SiteWise, Amazon S3, AWS IoT Core, Amazon Kinesis, and Amazon MSK.
Step 5
Connect your on-premises applications to Amazon S3 through AWS Storage Gateway using NFS and SMB file shares.
Step 6
Extend AWS infrastructure to plant premises with AWS Outposts. Run manufacturing applications locally using AWS services and integrate with AWS cloud infrastructure.
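To make the AWS IoT SiteWise ingestion in Step 3 more concrete, here is a minimal Python sketch that shapes readings into entries for the `BatchPutAssetPropertyValue` API and sends them with boto3. The property alias and the helper names are hypothetical. The batch size of 10 entries per request is an assumption based on the documented API limit, so check the current quotas before relying on it.

```python
import time


def build_sitewise_entries(readings, now=None):
    """Turn (property_alias, value) pairs into BatchPutAssetPropertyValue entries.

    All readings share one timestamp in this sketch; real collectors
    would normally carry a timestamp per reading.
    """
    seconds = int(now if now is not None else time.time())
    entries = []
    for index, (alias, value) in enumerate(readings):
        entries.append({
            "entryId": f"entry-{index}",  # must be unique within a request
            "propertyAlias": alias,
            "propertyValues": [{
                "value": {"doubleValue": float(value)},
                "timestamp": {"timeInSeconds": seconds, "offsetInNanos": 0},
                "quality": "GOOD",
            }],
        })
    return entries


def chunk(items, size=10):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def send_to_sitewise(entries, client=None):
    """Send entries in batches; requires boto3 and credentials when client is None."""
    if client is None:
        import boto3  # imported lazily so the builders work without boto3
        client = boto3.client("iotsitewise")
    return [
        client.batch_put_asset_property_value(entries=batch)
        for batch in chunk(entries)
    ]
```

For example, `build_sitewise_entries([("/plant-1/press-07/temperature", 71.5)])` yields one entry that can be passed to `send_to_sitewise`. In practice, AWS IoT SiteWise Edge or a partner gateway typically performs this ingestion, so a direct API call like this is mainly useful for testing or custom collectors.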
Cloud Services

This architecture diagram illustrates how data collected from the factory can be used with AWS cloud services.

Step 1
Process diverse data types through AWS services. Stream real-time data through AWS IoT SiteWise, AWS IoT Core, Amazon Kinesis, and Amazon MSK. Transfer structured data using AWS DMS and AWS Glue. Process unstructured data with Amazon S3. Extract and update ERP data using Amazon AppFlow.
Step 2
Access AWS Analytics and AI/ML services through Amazon SageMaker Unified Studio. Find and query data and AI assets organization-wide. Collaborate on projects to build analytics and AI artifacts. Share data, models, and generative AI applications securely. Amazon SageMaker Unified Studio is integrated with Amazon Q Developer, which provides AI-powered assistance for code development, data analysis, and ML workflows.
Step 3
Process OT and IT data through AWS analytics services via Amazon SageMaker Unified Studio. Catalog data using the AWS Glue Data Catalog, transform it with AWS Glue Visual ETL, and analyze it with Amazon Athena. Process streams with Amazon Managed Service for Apache Flink and generate insights using Amazon EMR.
Step 4
Unify data across Amazon S3 data lakes, including S3 Tables, and Amazon Redshift data warehouses with Amazon SageMaker Lakehouse. Build powerful analytics and AI/ML applications on a single copy of data using all Apache Iceberg-compatible tools and engines.
Step 5
Leverage Amazon Bedrock and agentic AI to improve manufacturing and optimize supply chain. Use Amazon Bedrock AgentCore, a comprehensive set of enterprise-grade services that help securely deploy and operate AI agents at scale, with components like AgentCore Runtime for low-latency serverless environments.
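The Amazon Athena analysis described in Step 3 could be started programmatically along the following lines. This is a sketch under stated assumptions: the database, the table, its `asset` and `value` columns, and the Amazon S3 output location are all hypothetical. Only the boto3 `start_query_execution` call and its parameters come from the AWS SDK.

```python
def build_athena_request(database, table, output_location, limit=100):
    """Build keyword arguments for the Athena StartQueryExecution call.

    The table is assumed (hypothetically) to have `asset` and `value` columns.
    """
    # Allow only simple identifiers so the table name cannot inject SQL.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    query = (
        "SELECT asset, avg(value) AS avg_value "
        f'FROM "{table}" '
        "GROUP BY asset ORDER BY avg_value DESC "
        f"LIMIT {int(limit)}"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(request, client=None):
    """Start the query and return its execution ID; needs boto3 and credentials."""
    if client is None:
        import boto3  # imported lazily so the builder works without boto3
        client = boto3.client("athena")
    return client.start_query_execution(**request)["QueryExecutionId"]
```

Athena runs queries asynchronously, so a caller would normally poll `get_query_execution` with the returned ID until the query finishes before reading results.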
Cloud Services with Partners and Agentic AI

This architecture diagram illustrates how data collected from the factory can be used with AWS cloud services, partner platforms, and agentic AI.

Step 1
Access functionality and tools from AWS Analytics and AI/ML services through Amazon SageMaker Unified Studio's single development environment. Query data and AI assets organization-wide. Collaborate on projects to build and share analytics and AI artifacts, including data, models, and generative AI applications, in a secure environment.
Step 2
Leverage Amazon Bedrock and agentic AI to improve manufacturing and optimize supply chain. Use Amazon Bedrock AgentCore, a comprehensive set of enterprise-grade services that help securely deploy and operate AI agents at scale, with components like AgentCore Runtime for low-latency serverless environments. Access production documentation and policies using Retrieval Augmented Generation (RAG) through managed Amazon Bedrock Knowledge Bases. Connect agents to Model Context Protocol (MCP) servers so that they can reach the systems where data lives.
Step 3
Integrate with cloud-hosted manufacturing solutions, including MCP servers. Exchange data with enterprise platforms using Iceberg REST Catalog. Build ETL pipelines with Amazon SageMaker Zero-ETL for Amazon S3, Amazon S3 Tables, and Amazon Redshift. Query data using Amazon Athena.
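The RAG access to production documentation mentioned in Step 2 can be sketched with the boto3 `retrieve_and_generate` call of the `bedrock-agent-runtime` client. The knowledge base ID, the model ARN, and the question are placeholders, not values from this Guidance. Treat this as a minimal illustration rather than a reference implementation.

```python
def build_rag_request(question, knowledge_base_id, model_arn):
    """Build keyword arguments for the Bedrock RetrieveAndGenerate call."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,  # placeholder ID
                "modelArn": model_arn,                 # placeholder ARN
            },
        },
    }


def ask_knowledge_base(request, client=None):
    """Return the generated answer text; needs boto3 and credentials."""
    if client is None:
        import boto3  # imported lazily so the builder works without boto3
        client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(**request)
    return response["output"]["text"]
```

The response also carries a `citations` list, which an application could surface so operators can trace an answer back to the source policy or work instruction.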