

# Data integration
<a name="rise-data-integration"></a>

RISE with SAP Extensibility for Data Integration with AWS is a technical framework that enables data flow between SAP systems, AWS services, and third-party solutions. This integration architecture provides standardized APIs, connectors, and protocols to establish secure communication channels, addressing the critical need for seamless enterprise data integration in modern cloud environments.

The RISE with SAP Extensibility for Data Integration outlines two primary data handling and integration mechanisms: data replication and data federation.

**Topics**
+ [Data Replication](rise-data-replication.md)
+ [Replicating data using AWS Services](rise-data-replication-awsmanaged.md)
+ [Replicating data using SAP services](rise-data-replication-sap.md)
+ [Replicating data using Partner Solutions](rise-data-replication-partner.md)
+ [Data Federation using AWS Services](rise-data-federation.md)

# Data Replication
<a name="rise-data-replication"></a>

Data replication from SAP is a crucial step in making the data usable for reporting, analysis, and integration with other systems. The following reference architecture shows how this can be done on AWS.

![\[Overall Data replication\]](http://docs.aws.amazon.com/sap/latest/general/images/rise-data-replication.png)


# Replicating data using AWS Services
<a name="rise-data-replication-awsmanaged"></a>

![\[Data replication using Managed Services\]](http://docs.aws.amazon.com/sap/latest/general/images/rise-data-replication-aws-services.png)


 **AWS Glue** 

 [AWS Glue](https://aws.amazon.com/glue/) is a serverless data integration service that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources. With AWS Glue, you can discover and connect to SAP using OData and manage your data in a centralized data catalog. You can visually create, run, and monitor extract, transform, and load (ETL) pipelines to load SAP data into your data lakes and data warehouses.

The [Connecting to SAP OData using Glue](https://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-sap-odata.html) user guide offers comprehensive instructions for setting up Glue ETL jobs, configuring SAP OData connections, and reading data from SAP, including handling incremental transfers.
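As a rough illustration of what a timestamp-based incremental read means at the OData level, the following sketch builds an OData V2 request URL that returns only records changed after a given point in time. It is independent of how the AWS Glue connector implements incremental transfers. The host, the service path, the entity set `Orders`, and the field `LastChangeDateTime` are hypothetical, and the field is assumed to be of type `Edm.DateTimeOffset`.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode


def build_incremental_odata_url(base_url, entity_set, changed_field, since,
                                page_size=1000, skip=0):
    """Build an OData V2 URL that selects rows changed after `since`.

    Assumption: `changed_field` is an Edm.DateTimeOffset property on the
    entity set. Both names are placeholders, not real SAP service names.
    """
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware")
    # OData V2 literal form: datetimeoffset'YYYY-MM-DDTHH:MM:SSZ'
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        "$filter": f"{changed_field} gt datetimeoffset'{stamp}'",
        "$top": str(page_size),   # page size
        "$skip": str(skip),       # offset for client-side paging
        "$format": "json",
    }
    # Keep "$" and "'" readable; percent-encode spaces and colons.
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{base_url.rstrip('/')}/{entity_set}?{query}"


if __name__ == "__main__":
    print(build_incremental_odata_url(
        "https://sap.example.com/sap/opu/odata/sap/ZDEMO_SRV",  # hypothetical
        "Orders",
        "LastChangeDateTime",
        datetime(2024, 1, 1, tzinfo=timezone.utc),
    ))
```

Persisting the highest timestamp seen in each run and passing it as `since` on the next run is one common way to carry state between incremental loads.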

 [AWS Glue Zero-ETL](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-using.html) is a set of fully managed integrations by AWS that minimizes the need to build ETL data pipelines for common ingestion and replication use cases. It makes data available in Amazon SageMaker Lakehouse and Amazon Redshift from multiple operational, transactional, and application sources. Using the SAP OData connector, you can create full data replication jobs from SAP, with fully managed replication (inserts, updates, and deletions) as well as schema evolution.

 AWS Glue and AWS Glue Zero-ETL serve distinct roles in data integration, and each offers advantages for different use cases. AWS Glue excels in complex ETL operations, data discovery, preparation, and extraction, particularly for specialized scenarios like SAP ODP-based replication. AWS Glue Zero-ETL is designed as a more streamlined, no-code solution for fully managed data replication scenarios.

 AWS Glue requires more hands-on management, including code deployment and maintenance, but offers greater flexibility and control over data transformation processes. AWS Glue performance is enhanced by its serverless, scale-out Apache Spark environment, which allows you to allocate Data Processing Units (DPUs) for scalable compute. This allows parallel processing and event-driven execution.
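To make the DPU allocation concrete, here is a minimal sketch that assembles the parameters for starting a Glue job run with a chosen worker type and worker count, and computes the resulting DPU capacity. The job name `sap-odata-extract` is hypothetical. The DPU-per-worker mapping reflects the standard `G.*` worker types. The actual `boto3` call is isolated in a separate function so the sizing logic can be checked without AWS credentials.

```python
# DPUs provided by one worker of each standard Glue worker type.
DPU_PER_WORKER = {"G.1X": 1, "G.2X": 2, "G.4X": 4, "G.8X": 8}


def total_dpus(worker_type, number_of_workers):
    """Return the DPU capacity for a worker type and worker count."""
    return DPU_PER_WORKER[worker_type] * number_of_workers


def build_job_run_request(job_name, worker_type, number_of_workers,
                          arguments=None):
    """Assemble keyword arguments for the Glue StartJobRun API."""
    if worker_type not in DPU_PER_WORKER:
        raise ValueError(f"unsupported worker type: {worker_type}")
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be positive")
    request = {
        "JobName": job_name,
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }
    if arguments:
        request["Arguments"] = dict(arguments)
    return request


def start_run(request):
    """Start the job run; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    glue = boto3.client("glue")
    return glue.start_job_run(**request)["JobRunId"]


if __name__ == "__main__":
    # "sap-odata-extract" is a placeholder job name.
    req = build_job_run_request("sap-odata-extract", "G.2X", 10)
    print(req, "->", total_dpus("G.2X", 10), "DPUs")
```

Increasing `NumberOfWorkers` scales the job out across more Spark executors, while a larger worker type gives each executor more memory and compute. Which of the two helps more depends on the shape of the SAP extract.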

# Replicating data using SAP services
<a name="rise-data-replication-sap"></a>

![\[Data replication using SAP Services\]](http://docs.aws.amazon.com/sap/latest/general/images/rise-data-replication-sap-services.png)


 **SAP BDC / Datasphere** 

 [SAP Datasphere](https://www.sap.com/products/data-cloud/datasphere.html) offers various connection types, such as SAP ABAP Connections, SAP ECC Connections, and SAP S/4HANA Cloud Connections, supporting the RFC and ODP protocols. Refer to the [SAP BDC / Datasphere documentation](https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/eb85e157ab654152bd68a8714036e463.html) to choose the most appropriate connectivity to replicate SAP data. Using premium outbound integration with the [Amazon Simple Storage Service (Amazon S3) connection](https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/a7b660a0a4ef4a4fbee57b44f5b2147d.html), configure an SAP Datasphere replication flow to ingest data into Amazon S3.

 **SAP Data Services** 

 [SAP Data Services](https://www.sap.com/products/technology-platform/data-services.html) offers various connections to replicate data from SAP ECC. Refer to the [SAP Data Services documentation](https://help.sap.com/docs/SAP_DATA_SERVICES) to choose the most appropriate connectivity. SAP Data Services offers an [Amazon Redshift datastore](https://help.sap.com/docs/SAP_DATA_SERVICES/af6d8e979d0f40c49175007e486257f0/731d7026ae3b4fef9ebadfbe23ffff12.html) and an [Amazon S3 datastore](https://help.sap.com/docs/SAP_DATA_SERVICES/af6d8e979d0f40c49175007e486257f0/e1ed075446344b5ca098e2382cfca78d.html) to ingest data into AWS. It also offers options for the [Amazon S3 file location protocol](https://help.sap.com/docs/SAP_DATA_SERVICES/af6d8e979d0f40c49175007e486257f0/a611106693ea422eb0b04705298516b7.html), such as encryption type, compression type, batch size, number of threads, and Amazon S3 storage class.

# Replicating data using Partner Solutions
<a name="rise-data-replication-partner"></a>

 AWS Partner Solutions offer ready-to-deploy solutions with enhanced features, such as pre-built connectors, specialized data pipelines, and advanced optimization techniques, that reduce complexity and improve the speed of deployment.

To find and deploy a solution that fits your specific needs, you can explore the [AWS Partner Solutions Finder](https://partners.amazonaws.com/search/partners) or browse through the [AWS Marketplace](https://aws.amazon.com/marketplace), where you can search for and quickly deploy partner solutions tailored to your unique SAP use case.

 **Further Resources** 

The [Guidance for SAP Data Integration and Management on AWS](https://aws.amazon.com/solutions/guidance/sap-data-integration-and-management-on-aws/) provides the essential data foundation to build data and analytics solutions. It shows how to integrate data between SAP ERP source systems and AWS in real time or in batch mode, with change data capture, using AWS services, SAP products, and AWS Partner Solutions. It includes an overview reference architecture showing how to ingest data from SAP systems into AWS, in addition to five detailed architectural patterns that complement SAP-supported mechanisms (such as OData, ODP, SLT, and BTP) using the AWS services highlighted above, SAP products, and AWS Partner Solutions.

# Data Federation using AWS Services
<a name="rise-data-federation"></a>

Data federation is a data management strategy that enables real-time analytics and a single source of truth without data duplication or expensive pipelines.

When a business requirement calls for consolidated data across transactional, analytics, and machine learning workloads, it is preferable to access the data at the source rather than replicate it. This avoids latency, inconsistency, and extra storage cost.

In the context of SAP and AWS services, data federation allows organizations to access, combine, and analyze data from both SAP systems and AWS cloud services seamlessly.

![\[Data Federation\]](http://docs.aws.amazon.com/sap/latest/general/images/rise-data-federation.png)


 **Amazon Athena** 

 [Amazon Athena](https://aws.amazon.com/athena/) is a serverless, scalable, and flexible interactive query service from AWS that lets you analyze data directly in Amazon S3. Data stored in Amazon S3 from multiple sources can be further transformed into tables and views using Amazon Athena and queried to present meaningful information in a structured way.

Data in Athena can be accessed from SAP Datasphere through [data federation](https://discovery-center.cloud.sap/missiondetail/3401/3441/) from SAP Datasphere connections. Users can also access SAP Datasphere tables and views from Athena by [querying SAP HANA](https://aws.amazon.com/blogs/big-data/query-sap-hana-using-athena-federated-query-and-join-with-data-in-your-amazon-s3-data-lake/) using an [Athena Federated Query](https://docs.aws.amazon.com/athena/latest/ug/connect-to-a-data-source.html).

Data can also be federated to the SAP HANA Cloud by configuring Athena as a remote source using the [Smart Data Access – Athena adapter](https://community.sap.com/t5/technology-blogs-by-sap/federating-queries-in-hana-cloud-from-amazon-athena-using-athena-api/ba-p/13476091). The [Athena Federated Query connection](https://aws.amazon.com/blogs/big-data/query-sap-hana-using-athena-federated-query-and-join-with-data-in-your-amazon-s3-data-lake/) can also be used to read data from a stand-alone SAP HANA Cloud environment.
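The following sketch illustrates the federated pattern described above: a single Athena SQL statement that joins a table exposed through an SAP HANA data source connector with a table in the Amazon S3 data lake. It assumes a connector has already been registered in Athena as a catalog. The catalog name `sap_hana`, the schema and table names, the join key, the workgroup, and the output location are all hypothetical. Identifier casing may need adjusting for a real SAP HANA source.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _quoted(name):
    """Double-quote a SQL identifier after a conservative validity check."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'


def build_join_query(hana_catalog, hana_schema, hana_table,
                     lake_database, lake_table, join_key, limit=100):
    """Join a federated SAP HANA table with a data lake table in one query."""
    sap = ".".join(_quoted(p) for p in (hana_catalog, hana_schema, hana_table))
    # "awsdatacatalog" is Athena's default catalog for AWS Glue Data Catalog tables.
    lake = ".".join(_quoted(p) for p in ("awsdatacatalog", lake_database, lake_table))
    key = _quoted(join_key)
    return (
        f"SELECT s.*, l.* FROM {sap} AS s "
        f"JOIN {lake} AS l ON s.{key} = l.{key} "
        f"LIMIT {int(limit)}"
    )


def run_query(sql, workgroup, output_location):
    """Submit the query to Athena; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_join_query works without it

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        WorkGroup=workgroup,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # All names below are placeholders for illustration.
    print(build_join_query("sap_hana", "SALES", "ORDERS",
                           "lake_db", "web_events", "order_id"))
```

Because the SAP HANA rows are read at query time, no copy of the SAP data is persisted in Amazon S3. The trade-off is that query latency depends on the source system and the connector.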

 **Amazon Redshift** 

 [Amazon Redshift](https://aws.amazon.com/redshift/) is a fully managed, petabyte-scale data warehouse service from AWS. Customers build their data warehouses and data models on it for analytics and reporting.

 [Data federation](https://discovery-center.cloud.sap/missiondetail/3406/3446/) from Amazon Redshift into SAP Datasphere is possible with SAP HANA Smart Data Integration (SDI) or the SAP Data Provisioning Agent. Amazon Redshift data can also be federated through the Athena Federated Query data source connector.

 **Further Resources** 

The [Guidance for Data Federation](https://aws.amazon.com/solutions/guidance/data-federation-between-sap-and-aws/) between SAP and AWS outlines the process of federating data between SAP and AWS cloud analytics services, enabling you to establish a data mesh architecture. By federating data between SAP and AWS, you can transform and visualize your data in a scalable, secure, and cost-effective way that helps inform your decision-making.