

# Data to Value


Enterprises need data-driven intelligence that delivers measurable business outcomes. Running SAP on AWS provides a scalable, secure, and flexible foundation to transform raw data into actionable value. The SAP and AWS Joint Reference Architecture (JRA) provides a framework for connecting data sources, harmonizing SAP and non-SAP data, and enabling AI- and analytics-driven innovation through [SAP Business Data Cloud (SAP BDC)](https://www.sap.com/products/data-cloud.html) and [Amazon SageMaker](https://aws.amazon.com/sagemaker/).

This guide outlines two key joint reference architectures that exemplify how organizations can leverage SAP and AWS services to maximize the value of their enterprise data through AI-powered insights, while maintaining flexibility, scalability, and cost efficiency.

**Topics**
+ [Integrating data in SAP BDC with AWS data sources](rise-jra-datatovalue-bdc-aws.md)
+ [AI Innovation with FedML-AWS and SageMaker](rise-jra-datatovalue-fedml-aws.md)

# Integrating data in SAP BDC with AWS data sources


Non-SAP data from AWS data sources can be harmonized with SAP data through the SAP Datasphere data fabric architecture within SAP BDC. The integration architecture supports multiple AWS services, each with a specific mode of integration based on live data access or replication:

![\[SAP BDC with Managed Services\]](http://docs.aws.amazon.com/sap/latest/general/images/rise-jra-datatovalue-01.png)


 **A. Integration with Amazon Athena** 

Mode of Integration: Federating data live into SAP Datasphere

Amazon Athena is an interactive query service that makes it easy to query and analyze data in Amazon S3 using standard SQL. Non-SAP data from Athena can be federated live into remote tables in SAP Datasphere and augmented with SAP data for real-time analytics in [SAP Analytics Cloud](https://www.sap.com/products/data-cloud/cloud-analytics.html).

Here are the steps to integrate Athena with SAP Datasphere:

1. Prepare the source with non-SAP and third-party data

2. Configure Athena

3. Configure the necessary IAM user and authorizations

4. Set up the SAP Datasphere connection to Athena

5. Build models in SAP Datasphere

This enables live data federation without replicating data, which reduces cost, delivers fast insights, and maintains enterprise-grade security. For detailed step-by-step instructions, see [Federating Queries from SAP Datasphere to Amazon S3 via Amazon Athena](https://github.com/SAP-samples/sap-bdc-explore-hyperscaler-data/blob/main/AWS/athena-integration.md).
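The IAM authorization step above can be sketched as a minimal policy that grants the federation user query access to Athena and read access to the underlying S3 data. This is an illustrative sketch, not the documented policy: the bucket names are placeholders, and the broad `"Resource": "*"` scoping on the Athena and Glue statements should be narrowed to your own workgroup and catalog resources in production.

```python
import json

def athena_federation_policy(data_bucket: str, results_bucket: str) -> dict:
    """Build a minimal illustrative IAM policy for federating Athena queries."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow running queries and fetching their results
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                ],
                "Resource": "*",
            },
            {
                # Read the source data and write query results to S3
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket", "s3:PutObject"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{data_bucket}/*",
                    f"arn:aws:s3:::{results_bucket}",
                    f"arn:aws:s3:::{results_bucket}/*",
                ],
            },
            {
                # Athena resolves table metadata through the Glue Data Catalog
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
                "Resource": "*",
            },
        ],
    }

policy = athena_federation_policy("my-non-sap-data", "my-athena-results")
print(json.dumps(policy, indent=2))
```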

 **B. Integration with Amazon Redshift** 

Mode of Integration: Federating data live into SAP Datasphere

Amazon Redshift is a fully managed, petabyte-scale data warehouse service optimized for analytical workloads. Through the SAP Datasphere data federation architecture, Redshift data can be augmented with SAP data to build unified data models and analytics in SAP Analytics Cloud. [Smart Data Integration (SDI)](https://help.sap.com/docs/HANA_SMART_DATA_INTEGRATION/bf2f0282053648f8a1ef873e65ded81a/323ff4c3c12040bab8f1222a901dd95d.html) connects SAP Datasphere with Redshift via the [Camel JDBC Adapter](https://help.sap.com/docs/HANA_SMART_DATA_INTEGRATION/7952ef28a6914997abc01745fef1b607/598cdd48941a41128751892fe68393f4.html?locale=en-US), enabling the creation of virtual tables and real-time or snapshot replication.

Here are the steps to integrate Redshift with SAP Datasphere:

1. Create On-Premise Agent in SAP Datasphere

2. Set Up Redshift Access

3. Configure SAP SDI DP Agent

4. Register Camel JDBC Adapter in SAP Datasphere

5. Upload Third-Party Drivers in SAP Datasphere

6. Create Local Connection to Redshift in SAP Datasphere

7. Import Remote Tables from Redshift

This setup enables live federated queries from SAP Datasphere to Redshift without replicating the data. Benefits include real-time access to Redshift data, query pushdown for performance optimization, and no data duplication in SAP Datasphere. For detailed step-by-step instructions, see [Data Federation between SAP Datasphere and Amazon Redshift](https://github.com/SAP-samples/sap-bdc-explore-hyperscaler-data/blob/main/AWS/redshift-integration.md).
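When registering the Camel JDBC Adapter connection, SAP Datasphere needs a JDBC URL for the Redshift cluster. The sketch below shows the expected URL format; the cluster endpoint and database name are placeholders, and 5439 is the default Redshift port.

```python
# Illustrative helper for the JDBC URL used by the Camel JDBC Adapter
# connection to Amazon Redshift. Host and database are placeholders.
def redshift_jdbc_url(host: str, database: str, port: int = 5439) -> str:
    return f"jdbc:redshift://{host}:{port}/{database}"

url = redshift_jdbc_url(
    "examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com", "dev"
)
print(url)
# jdbc:redshift://examplecluster.abc123xyz789.us-east-1.redshift.amazonaws.com:5439/dev
```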

 **C. Integration with Amazon S3** 

Modes of Integration: Replicating data with Replication Flows; importing data into SAP Datasphere using Data Flows

Amazon S3 is an object storage service that is highly scalable, durable, available, and secure. Non-SAP data from S3 buckets can be imported into SAP Datasphere through the Data Flow feature for use with applications such as Financial Planning or business analytics in SAP Analytics Cloud.

Here are the steps to integrate Amazon S3 with SAP Datasphere:

1. Prepare source data in an S3 bucket

2. Configure necessary IAM user and authorizations

3. Create S3 Connection in SAP Datasphere

4. Create a Data Flow

This process allows SAP Datasphere to connect to S3, access non-SAP data, and use that data in combination with internal SAP datasets via Data Flows. For detailed step-by-step instructions, see [Data integration between SAP Datasphere and Amazon S3](https://github.com/SAP-samples/sap-bdc-explore-hyperscaler-data/blob/main/AWS/s3-integration.md).
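The first step above, preparing source data in an S3 bucket, can be sketched with the Python standard library: generate a small CSV dataset locally, then upload the file to S3 (for example with the AWS CLI). The bucket name, file name, and columns are illustrative placeholders.

```python
import csv
import io

# Sketch: build a small non-SAP dataset as CSV for upload to S3.
# After writing the file to disk, it could be uploaded with the AWS CLI, e.g.:
#   aws s3 cp sales.csv s3://my-non-sap-data/sales/sales.csv
rows = [
    {"order_id": "1001", "region": "EMEA", "amount": "250.00"},
    {"order_id": "1002", "region": "APAC", "amount": "410.50"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["order_id", "region", "amount"])
writer.writeheader()   # column header row expected by the Data Flow source
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```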

You can find out more in the SAP Architecture Center under [Integration with AWS data sources](https://architecture.learning.sap.com/docs/ref-arch/a07a316077/1).

# AI Innovation with FedML-AWS and SageMaker


In today’s data-driven enterprises, machine learning models are only as powerful as the data they can access. However, business-critical data often resides within SAP systems like SAP BDC, while advanced model development typically takes place in cloud-native platforms like Amazon SageMaker.

FedML-AWS for Amazon SageMaker bridges this gap by providing a secure, efficient, and unified framework for federated model training and deployment across SAP and AWS ecosystems. By eliminating data duplication and enabling real-time access to SAP data, FedML-AWS helps accelerate AI initiatives, ensure data governance, and reduce operational complexity, all while leveraging the scalability and performance of AWS and the business context of SAP. With minimal setup, FedML-AWS enables data discovery, model training, and deployment across both SAP and AWS environments to extract value from data.

![\[FedML and Amazon SageMaker\]](http://docs.aws.amazon.com/sap/latest/general/images/rise-jra-datatovalue-02.png)


FedML, a Python library, is imported directly into Amazon SageMaker notebook instances. When most training data resides in AWS but critical SAP data with business semantics is also needed for training, FedML securely connects to SAP Datasphere (part of SAP BDC) via Python/SQLDBC connectivity, enabling federated access to the SAP business data required for model training in SageMaker.
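As an illustration of that connection setup, the FedML samples configure the SAP Datasphere database-user credentials in a JSON file that the library reads before opening the SQLDBC connection. The key names below follow those samples and every value is a placeholder; confirm the exact format against the linked FedML-AWS repository.

```json
{
  "address": "<datasphere-host>.hanacloud.ondemand.com",
  "port": "443",
  "user": "<database-user>",
  "password": "<password>",
  "schema": "<space-schema>"
}
```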

For more technical details on the methods that read training data from SAP Datasphere (part of SAP BDC) and train machine learning models on Amazon SageMaker, visit [FedML-AWS](https://github.com/SAP-samples/datasphere-fedml/tree/main/AWS). You can find out more in the SAP Architecture Center under [Integration with FedML-AWS for Amazon SageMaker](https://architecture.learning.sap.com/docs/ref-arch/8e1a5fbce3/1).

By combining the strengths of SAP Business Data Cloud (BDC) and AWS services, organizations can unlock the full potential of their enterprise data, from operational systems to advanced AI and analytics. Whether harmonizing datasets across Amazon S3, Redshift, and Athena or enabling federated model training with FedML-AWS and Amazon SageMaker, these architectures provide a scalable and secure foundation for innovation. Together, SAP and AWS empower businesses to move from data silos to data-driven intelligence, accelerating time to insight, optimizing decision-making, and driving measurable business value across the enterprise.