

# Architecture guidance for availability and reliability of SAP on AWS
<a name="architecture-guidance-of-sap-on-aws"></a>

August 2021

This guide is part of a content series that provides detailed information about hosting, configuring, and using SAP technologies in the Amazon Web Services (AWS) Cloud. For more information, see [SAP on AWS Technical Documentation](https://aws.amazon.com/sap/docs/).

## Overview
<a name="arch-guide-overview"></a>

This guide provides a set of architecture guidelines, strategies, and decisions for deploying SAP NetWeaver-based systems with a highly available and reliable configuration on AWS.

This guide covers:
+ Introduction to SAP high availability and reliability
+ Architecture guidelines and decision considerations
+ Architecture patterns and recommended usage

This guide is intended for users who have previous experience designing high availability and disaster recovery (HADR) architectures for SAP.

This guide does not cover the business requirements that determine the need for HADR, or the implementation details for a specific partner or customer solution.

## Prerequisites
<a name="arch-guide-prerequisites"></a>

### Specialized knowledge
<a name="arch-guide-specialized-knowledge"></a>

Before following the configuration instructions in this guide, we recommend familiarizing yourself with the following AWS services. (If you are new to AWS, see [Getting Started with AWS](https://aws.amazon.com/getting-started).)
+  [Amazon EC2](https://aws.amazon.com/ec2) 
+  [Amazon EBS](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html) 
+  [Amazon VPC](https://aws.amazon.com/vpc) 
+  [Amazon EFS](https://aws.amazon.com/efs) 
+  [Amazon S3](https://aws.amazon.com/s3) 

### Recommended reading
<a name="arch-guide-recommended-reading"></a>

Before reading this document, we recommend understanding key concepts and best practices from these guides:
+  [SAP on AWS Overview and Planning](https://docs.aws.amazon.com/sap/latest/general/sap-on-aws-overview.html) 
+  [Getting Started with Architecting SAP on the AWS Cloud](https://aws.amazon.com/blogs/awsforsap/getting-started-with-architecting-sap-on-the-aws-cloud) 

# Introduction
<a name="arch-guide-introduction"></a>

For decades, SAP customers protected SAP workloads on premise with two common patterns: high availability and disaster recovery. The advent of cloud computing provided an opportunity to rethink HADR capabilities for SAP, using modern architectures and technologies.

Let’s recap the SAP system design and single points of failure that are part of the SAP n-tier architecture.

## SAP NetWeaver architecture single points of failure  
<a name="arch-guide-sap-netweaver-architecture-single-points-of-failure"></a>

 **Figure 1: SAP single points of failure** 

![\[SAP single points of failure\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-1.png)


Figure 1 shows the typical SAP NetWeaver architecture, which includes the following single points of failure:
+ SAP Central Services (message server and enqueue processes)
+ SAP Application Server
+ NFS (shared storage)
+ Database
+ SAP Web Dispatcher

For SAP Central Services and the database, protection can be added by deploying additional hosts. For example, an additional host running the SAP replicated enqueue server can protect against the loss of application-level locks (enqueue locks), and an additional host running a secondary database instance can protect against data loss.

However, the inherent design of these single points of failure limits the ability to easily take advantage of cloud native features to provide high availability and reliability.

Amazon Elastic File System (Amazon EFS) is a highly available and durable managed NFS service that runs actively across multiple physical locations (AWS Availability Zones). This service can help protect one of the SAP single points of failure.

## High availability and disaster recovery
<a name="arch-guide-high-availability-and-disaster-recovery"></a>

High availability (HA) is the attribute of a system to provide service during defined periods, at acceptable or agreed-upon levels, and to mask unplanned outages from end users. This is often achieved with clustered servers that provide automated failure detection and recovery, combined with highly resilient hardware, robust testing, and problem and change management.

Disaster recovery (DR) protects against unplanned major outages, such as site disasters, through reliable and predictable recovery on different hardware or in a different physical location. The loss of data due to corruption or malware is considered a logical disaster event. It is normally resolved with a separate solution, such as recovery from the latest backup or storage snapshot. Logical DR does not necessarily imply a failover to another facility.

From the perspective of documented and measurable data points, HADR requirements are often defined in terms of the following:
+  **Percentage uptime** is the percentage of uptime in a given period (monthly or annual).
+  **Mean time to recovery (MTTR)** is the average time required to recover from failure.
+  **Return to service (RTS)** is the time it takes to bring the system back to service for the users.
+  **Recovery time objective (RTO)** is the maximum acceptable length of time that a system or service can be down: the time a solution takes to recover and for the service to be available again.
+  **Recovery point objective (RPO)** is how much data a business is willing to lose, expressed in time. It’s the maximum time between a failure and the recovery point.
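The relationship between a percentage-uptime target and the other metrics is simple arithmetic. As an illustrative sketch (not from this guide), an uptime percentage can be converted into a downtime budget, which can then be compared against an RTO target or a measured MTTR:

```python
# Illustrative sketch: translate a percentage-uptime target into a
# downtime budget for a given period (default: a 30-day month).
def downtime_budget_minutes(uptime_percent, period_hours=30 * 24):
    """Maximum allowed downtime in minutes over the period."""
    return (1 - uptime_percent / 100) * period_hours * 60

monthly = downtime_budget_minutes(99.99)           # about 4.3 minutes per month
annual = downtime_budget_minutes(99.99, 365 * 24)  # about 52.6 minutes per year
print(round(monthly, 1), round(annual, 1))
```

For example, a 99.99% monthly uptime target leaves a budget of roughly 4.3 minutes of downtime per month, so a solution whose MTTR exceeds that cannot meet the target after even a single failure.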

 **Figure 2: Recovery from a disruptive event** 

![\[Recovery from a disruptive event\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-2.png)



## On premises vs. cloud deployment patterns
<a name="arch-guide-on-premises-vs.-cloud-deployment-patterns"></a>

Traditionally, customers with high availability requirements would deploy their primary compute capabilities in a single data center or hosting facility, often in two separate rooms or data center halls with disparate cooling and power, and high-speed network connectivity. Some customers would run two hosting facilities in close proximity, with a separation of compute capabilities, yet close enough to not be impacted by network latency.

To meet disaster recovery requirements (the preceding scenarios carry an elevated risk of unforeseen location failure), many customers would extend their architecture to include a secondary location where a copy of their data resided, with additional idle compute capacity. The distance between the primary and secondary locations often created the need for asynchronous transfer of data, which impacted the recovery point objective. This was the standard and generally accepted architecture pattern for high availability and disaster recovery for many industries and companies running SAP.

 **Figure 3: On-premises disaster recovery** 

![\[On-premises disaster recovery\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-3.png)


Figure 3 shows an example of an approach that customers often take on premises. In **Location 1**, the customer has two hosting facilities, often separate rooms or halls in the same data center, where they deploy a high availability architecture for the SAP single points of failure. **Location 2** is the disaster recovery location in which the SAP systems are recovered in the event of a significant failure of both hosting facilities in **Location 1**.

Customers migrating their SAP workloads to the cloud often carry this architecture over, mapping it to AWS Regions and Availability Zones (AZs) as depicted in Figure 4. While this architecture can work in your environment, it does not follow the [AWS Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/), which helps cloud architects build secure, high-performing, resilient, and efficient infrastructure for their applications.

 **Figure 4: On-premises to AWS region mapping approach** 

![\[Example mapping of on-premises data centers to Regions\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-4.png)


 AWS isolates facilities geographically in Regions and Availability Zones. A Multi-AZ approach provides distance while maintaining performance for the primary compute capacity. This approach (Figure 5) greatly reduces the risk of location failure.

 **Figure 5: Alternative approach for on premises to AWS region mapping** 

![\[Alternative approach for mapping on-premises data centers to Regions\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-5.png)


With the risk of location failure significantly reduced for the primary compute capacity, the requirements for a second Region can be evaluated based on business requirements. You can rapidly deploy required capacity in the same or different Region with AWS. Idle hardware is no longer an issue. Data backups can be stored on Amazon Simple Storage Service (Amazon S3) in a single AWS Region or in multiple AWS Regions by leveraging cross-Region replication. This architecture can be simplified and be made readily available (Figure 6).

 **Figure 6: Single AWS Region approach** 

![\[Single-Region approach\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-6.png)


In addition to considering the impact of infrastructure or hosting facility failure, another scenario to consider is the loss of business data due to accidental or malicious technical activity.

This scenario is referred to as *logical disaster recovery*, and it requires restoring the business data from a known good copy. To enable this, you need to decide where that copy of the data is stored and how it will be used in the event of a logical disaster recovery.

Further in this guide, we detail the key architecture guidelines, architecture patterns, and decisions to consider for your availability and reliability requirements.

# Architecture guidelines and decisions 
<a name="arch-guide-architecture-guidelines-and-decisions"></a>

This section provides a brief overview of the AWS services typically used for SAP workloads and some of the key points to understand when designing your architecture for hosting SAP on AWS. If you are already familiar with these AWS services, you can skip this section.

## Regions and Availability Zones
<a name="arch-guide-regions-and-availability-zones"></a>

The [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure) consists of [AWS Regions](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/?p=ngi&loc=2#Regions) and [Availability Zones](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/?p=ngi&loc=2#Availability_Zones) (AZs). For more details on the AWS Global Infrastructure, see [Regions and Availability Zones](https://aws.amazon.com/about-aws/global-infrastructure/).

### Regions
<a name="arch-guide-regions"></a>

AWS has a global footprint that serves customers across the world. AWS maintains multiple Regions in North America, South America, Europe, Asia Pacific, and the Middle East.

An AWS Region is a collection of AWS resources in a geographic area. Each Region is isolated and independent. For a list of Region names and codes, see [Regional endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html#region-names-codes).

Regions provide fault tolerance, stability, and resilience. They enable you to create redundant resources that remain available and unaffected in the unlikely event of an outage.

 AWS Regions consist of multiple Availability Zones (AZs), typically three. An Availability Zone is a fully isolated partition of the AWS infrastructure. It consists of discrete data centers housed in separate facilities, with redundant power, networking, and connectivity.

You retain complete control and ownership over the AWS Region in which your data is physically located, making it easy to meet Regional compliance and data residency requirements.

### Availability Zones
<a name="arch-guide-availability-zones"></a>

Availability Zones (AZs) enable customers to operate production applications and databases that are more highly available than would be possible from a single data center. Distributing your applications across multiple zones provides the ability to remain resilient in the face of most failure modes, including natural disasters or system failures.

Each Availability Zone can comprise multiple data centers; at full scale, it can contain hundreds of thousands of servers. Availability Zones are fully isolated partitions of the AWS global infrastructure, each with its own power infrastructure and physically separated from the other zones by several kilometers, although all zones are within 100 km (60 miles) of each other. This distance provides isolation from the most common disasters that could affect data centers, such as floods, fires, severe storms, and earthquakes.

All Availability Zones (AZs) within a Region are interconnected with high-bandwidth and low-latency networking, over fully redundant and dedicated metro fiber. This ensures high-throughput, low-latency networking between zones. The network performance is sufficient to accomplish synchronous replication between zones.

 AWS Availability Zones (AZs) enable customers to run their applications in a highly available manner. To be highly available, an application needs to run in more than one location simultaneously with the exact same data, allowing for a seamless failover with minimal downtime in the event of a disaster.

### Services
<a name="arch-guide-services"></a>

Our general policy is to deliver AWS services, features, and instance types to all AWS Regions within 12 months of general availability, based on customer demand, latency, data sovereignty, and other factors. You can share your interest for local Region delivery, request service roadmap information, or gain insight on service interdependency (under NDA) by contacting your [AWS sales representative](https://aws.amazon.com/contact-us).

Due to the nature of the service, some AWS services are delivered globally rather than Regionally, such as Route 53, Amazon Chime, Amazon WorkDocs, Amazon WorkMail, WorkSpaces, and Amazon WorkLink.

Other services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Block Store (Amazon EBS) are zonal services. When you create an Amazon EC2 or Amazon EBS resource for launch, you need to specify the required Availability Zone within a Region.

### Selecting the AWS Regions
<a name="arch-guide-selecting-the-aws-regions"></a>

When selecting the AWS Region(s) for your SAP environment deployment, you should consider the following:
+ Proximity to on-premises data centers, systems, and end users to minimize network latency.
+ Data residency and compliance requirements.
+ Availability of the AWS products and services that you plan to use in the Region. For more details, see [Region Table](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services).
+ Availability of the Amazon EC2 instance types that you plan to use in the Region. For more details, see [Amazon EC2 Instance Types for SAP](https://aws.amazon.com/sap/instance-types).
+ Pricing variation between different AWS Regions. For more details, see [SAP on AWS Pricing and Optimization guide](https://docs.aws.amazon.com/sap/latest/general/sap-on-aws-pricing-guide.html).

### Multi-Region considerations
<a name="arch-guide-multi-region-considerations"></a>

When deploying across multiple Regions, an important consideration is the associated cost and management effort for core services required in each Region such as networking, security, and audit services.

#### Network latency
<a name="arch-guide-network-latency"></a>

If you decide on a multiple Region approach, you should consider the impact of any increase in the network latency to the secondary Region from your on-premises locations.

#### Cross-Regional data transfer
<a name="arch-guide-cross-regional-data-transfer"></a>

 AWS provides several methods of data transfer between Regions. These methods are relevant when designing an SAP Architecture for disaster recovery. You should consider any data residency requirements when transferring data to another AWS Region, the costs associated with the data transfer ([cross-Region peering](#arch-guide-cross-region-peering) and/or [Amazon S3 replication](#arch-guide-s3-replication)), and storage in the secondary Region.

#### Tier 0 services
<a name="arch-guide-tier-0-services"></a>

When using an AWS Region, there are a number of Tier 0 services that you need before deploying an SAP workload. These include DNS, Active Directory, and/or LDAP as well as any AWS or ISV-provided security and compliance products and services.

## AWS accounts
<a name="arch-guide-aws-accounts"></a>

While there is no one-size-fits-all answer for how many AWS accounts a particular customer should have, most organizations want to create more than one AWS account. Multiple accounts provide the highest level of resource and billing isolation.

In the context of SAP workloads, it is common for customers to deploy the Production environment in a separate AWS account. It helps isolate the production environment from the rest of the SAP landscape.

 [AWS Organizations](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_introduction.html) is an account management service that enables you to consolidate multiple AWS accounts into an *organization* that you create and centrally manage. AWS Organizations includes account management and consolidated billing capabilities. It enables you to better meet the budgetary, security, and compliance needs of your business. As an administrator of an organization, you can create accounts in your organization and invite existing accounts to join the organization.

 [AWS Landing Zone](https://aws.amazon.com/solutions/implementations/aws-landing-zone/) is a solution that helps customers more quickly set up a secure, multi-account AWS environment based on AWS best practices. You can save time by automating the setup of an environment for running secure and scalable workloads while implementing an initial security baseline through the creation of core accounts and resources. It also provides a baseline environment to get started with a multi-account architecture, AWS Identity and Access Management, governance, data security, network design, and logging.

 **Note:** The AWS Landing Zone solution is delivered by AWS Solutions Architects or Professional Services consultants to create a customized baseline of AWS accounts, networks, and security policies.

Consider using the AWS Landing Zone solution if you are looking to set up a configurable landing zone with rich customization options through custom add-ons, such as Active Directory, and change management through a code deployment and configuration pipeline.

 [AWS Control Tower](https://docs.aws.amazon.com/controltower/latest/userguide/what-is-control-tower.html) provides the easiest way to set up and govern a secure, compliant, multi-account AWS environment based on best practices established by working with thousands of enterprises. With AWS Control Tower, your distributed teams can provision new AWS accounts quickly. Meanwhile your central cloud administrators will know that all accounts are aligned with centrally established, company-wide compliance policies.

Consider using AWS Control Tower to set up a new AWS environment based on a landing zone with pre-configured blueprints. You can interactively govern your accounts with pre-configured guardrails.

## Compute
<a name="arch-guide-compute"></a>

 [Amazon Elastic Compute Cloud](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html) (Amazon EC2) provides scalable computing capacity in the Amazon Web Services (AWS) cloud. An Amazon EC2 instance is launched in a specific Availability Zone within a specified Amazon Virtual Private Cloud (Amazon VPC).

When Amazon EC2 instances are deployed across two or more Availability Zones within a single Region, AWS offers an [SLA](https://aws.amazon.com/compute/sla) of 99.99%.

### Instance types
<a name="arch-guide-instance-types"></a>

A range of [Amazon EC2 instance types](https://aws.amazon.com/sap/instance-types) are supported by SAP. When selecting the instance type for your SAP workload, you should consider which tiers allow flexibility on the instance used (application tier). Also consider which tiers will require the use of a specific instance type (database tier) based on compute, memory, storage throughput, and license compliance requirements.

For the tiers with specific instance type requirements and without flexibility to change during a failure scenario, consider having a capacity reservation using [Reserved Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-reserved-instances.html) or [On-Demand Capacity Reservations](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capacity-reservations.html) within the required Availability Zones and Regions where the instance will run. This approach is called static stability. For more information, see [Static stability using Availability Zones](https://aws.amazon.com/builders-library/static-stability-using-availability-zones).

### Reserved Instances
<a name="arch-guide-reserved-instances"></a>

 [Reserved Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-reserved-instances.html) provide significant savings on your Amazon EC2 costs compared to On-Demand Instance pricing. Reserved Instances are not physical instances. They are a billing discount applied to the use of On-Demand Instances in your account. To receive the discount, the On-Demand Instances must match certain attributes, such as instance type and Region.

When you deploy Amazon EC2 across multiple Availability Zones for high availability, we recommend that you use zonal Reserved Instances. In addition to the savings over the on-demand instance pricing, a zonal Reserved Instance provides a capacity reservation in the specified Availability Zone. This ensures that the required capacity is readily available as and when you need it.

For billing purposes, the [consolidated billing](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/consolidated-billing.html) feature of AWS Organizations treats all of the accounts in the organization as one account. This means that all accounts in the organization can receive the hourly cost benefit of Reserved Instances that are purchased by any other account.

### Savings Plans
<a name="arch-guide-savings-plans"></a>

 [Savings Plans](https://aws.amazon.com/savingsplans) is a flexible pricing model that provides savings of up to 72% on your AWS compute usage. It offers lower prices on Amazon EC2 instance usage, regardless of instance family, size, tenancy or AWS Region. The Savings Plan model also applies to AWS Fargate and AWS Lambda usage.

Savings Plans offer significant savings over On-Demand pricing, just like Amazon EC2 Reserved Instances, in exchange for a commitment to use a specific amount of compute power (measured in \$/hour) for a one- or three-year period.

### On-Demand Capacity Reservations
<a name="arch-guide-on-demand-capacity-reservations"></a>

 [On-Demand Capacity Reservations](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capacity-reservations.html) enable you to reserve capacity for your Amazon EC2 instances in a specific Availability Zone for any duration. This gives you the ability to create and manage Capacity Reservations independently, with the billing discounts offered by Savings Plans or Regional Reserved Instances. You can create Capacity Reservations at any time, you ensure that you always have access to Amazon EC2 capacity when you need it, for as long as you need it. You can create Capacity Reservations at any time, without entering into a one-year or three-year term commitment, and the capacity is available immediately. When you no longer need the reservation, we recommend that you [cancel the Capacity Reservation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/capacity-reservations-using.html#capacity-reservations-release) to stop incurring charges for it.

### Instance Family Availability across Availability Zones
<a name="arch-guide-instance-family-availability-across-azs"></a>

Certain Amazon EC2 instance families (for example, X1 and High Memory) are not available across all Availability Zones within a Region. You should confirm the instance types required for your SAP workload and check if they are available in your target Availability Zones.

### Amazon EC2 auto recovery
<a name="arch-guide-ec2-auto-recovery"></a>

 [Amazon EC2 auto recovery](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html) is an Amazon EC2 feature that automatically recovers the instance within the same Availability Zone, if it becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair.

You can enable auto recovery for Amazon EC2 instances by creating an Amazon CloudWatch alarm which monitors the instance status. Examples of problems that cause system status checks to fail include:
+ Loss of network connectivity
+ Loss of system power
+ Software issues on the physical host
+ Hardware issues on the physical host that impact network reachability

Though it typically takes under 15 minutes for a failed instance to restart, Amazon EC2 auto recovery does not offer an SLA. Therefore, if the recovery of the application running on the failed host is critical (for example, the SAP database or SAP Central Services), you should consider using [clustering across two Availability Zones](https://docs.aws.amazon.com/sap/latest/sap-hana/sap-oip-sap-on-aws-high-availability-setup.html) to help ensure high availability.
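As a configuration sketch (the instance ID, Region, and alarm thresholds are hypothetical, and the command requires credentials with CloudWatch permissions), an auto recovery alarm can be created with the AWS CLI by alarming on the `StatusCheckFailed_System` metric and attaching the Amazon EC2 recover action:

```shell
# Hypothetical instance ID and Region. The alarm triggers the built-in
# EC2 recover action after two consecutive failed system status checks.
aws cloudwatch put-metric-alarm \
  --alarm-name "recover-sap-instance" \
  --namespace AWS/EC2 \
  --metric-name StatusCheckFailed_System \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions "arn:aws:automate:us-east-1:ec2:recover"
```

Note that the recover action keeps the instance ID, private IP addresses, and attached EBS volumes, so no reconfiguration of the SAP instance profile is needed after recovery.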

### High Memory Bare Metal Dedicated Hosts
<a name="arch-guide-high-memory-dedicated-hosts"></a>

 [Amazon EC2 High Memory Instances](https://aws.amazon.com/ec2/instance-types/high-memory) are specifically designed to run large in-memory databases, such as SAP HANA. High Memory Bare Metal instances are available on Amazon EC2 [Dedicated Hosts](https://aws.amazon.com/ec2/dedicated-hosts) on a one- or three-year reservation.

High Memory instances support [Dedicated Host Recovery](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/dedicated-hosts-recovery.html). Host recovery automatically restarts your instances on a new replacement host if failures are detected on your Dedicated Host. Host recovery reduces the need for manual intervention and lowers the operational burden in case of an unexpected Dedicated Host failure.

We recommend a second High Memory instance in a different Availability Zone of your chosen Region to protect against zone failure.

### Amazon EC2 maintenance
<a name="arch-guide-ec2-maintenance"></a>

When AWS maintains the underlying host for an instance, it schedules the instance for maintenance. There are two types of maintenance events:
+ During network maintenance, scheduled instances lose network connectivity for a brief period of time. Normal network connectivity to your instance is restored after maintenance is complete.
+ During power maintenance, scheduled instances are taken offline for a brief period, and then rebooted. When a reboot is performed, all of your instance’s configuration settings are retained.

Additionally, we frequently upgrade the Amazon EC2 fleet, with many patches and upgrades applied to instances transparently. However, some updates require a short reboot. Such reboots are infrequent but necessary to apply upgrades that strengthen security, reliability, and operational performance.

There are two kinds of reboots that can be required as part of Amazon EC2 scheduled maintenance:
+ Instance reboots are reboots of your virtual instance and are equivalent to an operating system reboot.
+ System reboots require reboots of the underlying physical server hosting an instance.

You can view any upcoming scheduled events for your instances in the AWS Management Console or using the API tools or command line.

If you do not take any action, the impact on your instance is the same in both cases: during your [scheduled maintenance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-instances-status-check_sched.html#schedevents_actions_maintenance) window your instance will experience a reboot that in most cases takes a few minutes.

Alternatively, you can migrate your instance to a new host by performing a stop and start on your instance. For more information, see [Stop and start your instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html). You can automate an immediate stop and start in response to a scheduled maintenance event.

## Networking
<a name="arch-guide-networking"></a>

### Amazon Virtual Private Cloud and subnets
<a name="arch-guide-amazon-virtual-private-cloud-and-subnets"></a>

An [Amazon Virtual Private Cloud](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) (Amazon VPC) is a virtual network dedicated to your AWS account. It is logically isolated from other virtual networks in the AWS Cloud. You can launch your AWS resources, such as Amazon EC2 instances, into your VPC.

When you create a VPC, you must specify a range of IPv4 addresses for the VPC in the form of a Classless Inter-Domain Routing (CIDR) block, for example, 10.0.0.0/16. This is the primary CIDR block for your VPC.

You can create a VPC within your chosen AWS Region and it will be available across all Availability Zones within that Region.

To add a new [subnet](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Subnets.html) to your VPC, you must specify an IPv4 CIDR block for the subnet from the range of your VPC. You can specify the Availability Zone in which you want the subnet to reside. You can have multiple subnets in the same zone but a single subnet cannot span across multiple zones.

To provide future flexibility, we recommend that your subnet and connectivity design support all of the available Availability Zones in your account within the Region, regardless of the number of zones that you initially plan to use within a Region.
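The VPC-to-subnet relationship described above can be sketched with Python's standard `ipaddress` module (the CIDR block and AZ names below are hypothetical): each subnet CIDR is carved from the VPC range and maps to exactly one Availability Zone, with spare ranges left for zones you may use later.

```python
import ipaddress

# Hypothetical VPC primary CIDR block
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the /16 into /20 subnets (16 of them) and plan one per AZ,
# leaving the remaining ranges free for future Availability Zones.
subnets = list(vpc.subnets(new_prefix=20))
plan = dict(zip(["eu-west-1a", "eu-west-1b", "eu-west-1c"], subnets))

for az, net in plan.items():
    # Note: AWS reserves 5 addresses in every subnet, so the usable
    # count is slightly lower than net.num_addresses.
    print(az, net, net.num_addresses)
```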

### Latency across Availability Zones
<a name="arch-guide-cross-az-latency"></a>

All Availability Zones (AZs) are interconnected with high-bandwidth, low-latency networking, over fully redundant and dedicated metro fiber. This results in single-digit millisecond latency between resources in different Availability Zones in the same Region.

For high availability, we recommend deploying production SAP workloads across multiple Availability Zones, including the SAP application server layer. If you have SAP transactions or batch jobs that involve significant database calls, we recommend running them on SAP application servers located in the same Availability Zone as the database. Use SAP logon groups (transaction SMLG) for end users and batch server groups (transaction SM61) for background processing jobs to ensure that latency-sensitive parts of the SAP workload run on the correct application servers.

### On premises to AWS connectivity
<a name="arch-guide-on-premises-to-aws-connectivity"></a>

You can connect to your VPC through a Site-to-Site [virtual private network](https://docs.aws.amazon.com/vpn/latest/s2svpn/VPC_VPN.html) (VPN) or [AWS Direct Connect](https://docs.aws.amazon.com/directconnect/latest/UserGuide/Welcome.html) from on premises. AWS Direct Connect provides an [SLA](https://aws.amazon.com/directconnect/sla) of up to 99.99%, and Site-to-Site VPN provides an [SLA](https://aws.amazon.com/vpn/site-to-site-vpn-sla) of 99.95%.

A Site-to-Site VPN connection is established to a specific Region. For Direct Connect-based connections, [Direct Connect Gateway](https://docs.aws.amazon.com/directconnect/latest/UserGuide/direct-connect-gateways.html) allows you to connect to multiple Regions.

When establishing connectivity to AWS from on premises, ensure that you have resilient connections either through the use of multiple Direct Connect Links, multiple VPN connections, or a combination of the two.

The [AWS Direct Connect Resiliency Toolkit](https://docs.aws.amazon.com/directconnect/latest/UserGuide/resilency_toolkit.html) provides a connection wizard with multiple resiliency models. These models help you order dedicated connections to achieve your SLA objective.

### VPC endpoints
<a name="arch-guide-vpc-endpoints"></a>

A [VPC endpoint](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html) privately connects your VPC to supported AWS services and VPC endpoint services powered by [AWS PrivateLink](https://aws.amazon.com/privatelink/). It doesn’t require internet access via an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. Instances in your VPC do not require public IP addresses to communicate with resources in the AWS service. Traffic between your VPC and other services does not leave the Amazon network.

VPC endpoints are available for all of the core AWS services that are required to support an SAP-based workload, including Amazon EC2 API, Amazon S3, and Amazon Elastic File System.

### Cross-Region peering
<a name="arch-guide-cross-region-peering"></a>

 [Amazon Virtual Private Cloud](https://docs.aws.amazon.com/vpc/latest/userguide) (Amazon VPC) supports [Inter-Region peering](https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html) between two VPCs in different Regions. This can be used to allow network traffic, such as database replication traffic to flow between two [Amazon EC2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html) instances in different Regions. Inter-Region peering incurs data transfer costs.

 [AWS Transit Gateway](https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html) is a network transit hub that you can use to interconnect your virtual private clouds (VPC) within an AWS Region to other VPCs in other AWS Regions and to on premises networks using AWS Direct Connect or VPN. Use of Transit Gateway will incur [Transit Gateway costs.](https://aws.amazon.com/transit-gateway/pricing) AWS Transit Gateway provides an [SLA](https://aws.amazon.com/transit-gateway/sla) of 99.95% within a Region.

### Load balancing
<a name="arch-guide-load-balancing"></a>

 [Elastic Load Balancing](https://docs.aws.amazon.com/elasticloadbalancing/index.html) supports four types of load balancers: Application Load Balancers, Network Load Balancers, Gateway Load Balancers, and Classic Load Balancers.

A [Network Load Balancer ](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html)can be used to support a high-availability deployment of SAP Web Dispatchers and/or SAP Central Services across multiple Availability Zones. For more details, see [Overlay IP Routing with Network Load Balancer](https://docs.aws.amazon.com/sap/latest/sap-hana/sap-oip-overlay-ip-routing-with-network-load-balancer.html).

A *load balancer* serves as the single point of contact for clients. The load balancer distributes incoming traffic across multiple targets, such as Amazon EC2 instances.

A *listener* checks for connection requests from clients, using the protocol and port that you configure, and forwards requests to a target group.

Each *target group* routes requests to one or more registered targets, such as Amazon EC2 instances, using the TCP protocol and the specified port number. You can configure health checks on a per target group basis. Health checks are performed on all targets registered to a target group that is specified in a listener rule for your load balancer.

For TCP traffic, the Network Load Balancer selects a target using a flow hash algorithm based on the protocol, source IP address, source port, destination IP address, destination port, and TCP sequence number. Each individual TCP connection is routed to a single target for the life of the connection.
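
The practical consequence of this flow hashing can be illustrated with a small sketch. The real Network Load Balancer hash is internal to the service (and also incorporates the TCP sequence number), so the function below only demonstrates the key property: a given flow deterministically maps to a single target.

```python
import hashlib

def pick_target(targets, src_ip, src_port, dst_ip, dst_port, protocol="tcp"):
    """Deterministically map a flow's 5-tuple to one target.

    Illustrative only -- not the actual NLB algorithm. The property it
    demonstrates is that the same flow always selects the same target
    for the life of the connection.
    """
    key = f"{protocol}:{src_ip}:{src_port}:{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]

targets = ["10.0.1.10", "10.0.2.10"]  # hypothetical app servers in two AZs
t1 = pick_target(targets, "192.0.2.7", 53211, "10.0.0.5", 3200)
t2 = pick_target(targets, "192.0.2.7", 53211, "10.0.0.5", 3200)
assert t1 == t2  # same flow maps to the same target
```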

### DNS
<a name="arch-guide-dns"></a>

 [Amazon Route 53](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html) is a highly available and scalable Domain Name System (DNS) web service. You can use Route 53 to perform three main functions in any combination: domain registration, DNS routing, and health checking. Route 53 offers an [SLA](https://aws.amazon.com/route53/sla) of 100%. 

 [Amazon Route 53 Resolver](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver.html) provides a set of features that enable bi-directional querying between on premises and AWS over private connections.

## Storage
<a name="arch-guide-storage"></a>

### Object storage
<a name="arch-guide-object-storage"></a>

 [Amazon Simple Storage Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/Welcome.html) (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Amazon S3 is a Regional service across all Availability Zones within a Region and is designed for 99.999999999% (11 9’s) of durability and an [SLA](https://aws.amazon.com/s3/sla) of 99.9%.

To protect against data loss, you can perform backups (such as database backups or file backups) to Amazon S3. Additionally, [Amazon EBS Snapshots](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html) and [Amazon Machine Images](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html) (AMIs) are stored in Amazon S3.

Amazon S3 Replication enables automatic, asynchronous copying of objects across Amazon S3 buckets. Buckets that are configured for object replication can be owned by the same AWS account or by different accounts.

#### Amazon S3 replication
<a name="arch-guide-s3-replication"></a>

You can replicate objects between the same or different AWS Regions.
+ Cross-Region replication (CRR) is used to copy objects across Amazon S3 buckets in different AWS Regions.
+ Same-Region replication (SRR) is used to copy objects across Amazon S3 buckets in the same AWS Region.

Cross-Region replication incurs the following [costs](https://aws.amazon.com/s3/pricing/):
+ Data Transfer charges for the data transferred between the first and second AWS Regions
+ Amazon S3 charges for the data stored in Amazon S3 in the two different AWS Regions
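
As a rough worked example of these two cost components, with placeholder per-GB rates rather than current AWS prices (always check the S3 pricing page):

```python
# Placeholder rates -- NOT current AWS prices.
inter_region_transfer_per_gb = 0.02   # USD/GB transferred between Regions
storage_per_gb_month = 0.023          # USD/GB-month of S3 storage

replicated_gb = 1024                  # e.g. 1 TiB of backups replicated per month

transfer_cost = replicated_gb * inter_region_transfer_per_gb
storage_cost = replicated_gb * storage_per_gb_month * 2  # stored in both Regions

monthly_cost = transfer_cost + storage_cost
print(f"~${monthly_cost:.2f}/month")
```

Note that the storage charge applies twice because the objects are retained in both the source and destination Regions.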

Additionally, you can enable [Amazon S3 Replication Time Control](https://docs.aws.amazon.com/AmazonS3/latest/dev/replication-time-control.html) with cross-Region replication. Amazon S3 Replication Time Control (Amazon S3 RTC) helps you meet compliance or business requirements for data replication and provides visibility into Amazon S3 replication times. Amazon S3 RTC replicates most objects that you upload to Amazon S3 in seconds, and 99.99 percent of those objects within 15 minutes.

Amazon S3 RTC incurs the following costs in addition to the costs listed above for cross-Region replication:
+ Amazon S3 RTC Management Feature - [priced](https://aws.amazon.com/s3/pricing) per GB
+ Amazon CloudWatch Amazon S3 Metrics - [priced](https://aws.amazon.com/cloudwatch/pricing) by number of metrics

Same-Region replication incurs the following [costs](https://aws.amazon.com/s3/pricing):
+ Charges for the data stored in Amazon S3 

### Block storage
<a name="arch-guide-block-storage"></a>

Amazon Elastic Block Store ([Amazon EBS](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html)) provides block level storage volumes for use with Amazon EC2 instances. Amazon EBS volumes behave like raw, unformatted block devices. You can mount these volumes as devices on your instances. You can create a file system on top of these volumes or use them in any way that you would use a block device (like a hard drive). You can dynamically change the configuration of a volume that’s attached to an instance.

Amazon EBS volumes are placed in a specific Availability Zone where they are automatically replicated to protect you from the failure of a single component. All Amazon EBS volume types offer durable snapshot capabilities and are designed for [99.999% availability per volume](https://aws.amazon.com/ebs/features/#Amazon_EBS_availability_and_durability)  and [99.99% service availability](https://aws.amazon.com/compute/sla) with Multi-AZ configuration. The use of a database replication capability, block level replication solution or [Amazon EBS Snapshots](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html) is required to provide durability of the SAP data stored on Amazon EBS across multiple Availability Zones.

Amazon EBS volumes are designed for an annual failure rate (AFR) of between 0.1% - 0.2%, where failure refers to a complete or partial loss of the volume, depending on the size and performance of the volume. This makes Amazon EBS volumes 20 times more reliable than typical commodity disk drives, which fail with an AFR of around 4%. For example, if you have 1,000 Amazon EBS volumes running for 1 year, you should expect 1 to 2 will have a failure.
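
The expectation behind that example is simply the volume count multiplied by the AFR:

```python
def expected_annual_failures(volume_count, afr):
    """Expected number of volume failures in one year for a given AFR."""
    return volume_count * afr

# 1,000 volumes at the documented 0.1%-0.2% AFR range:
low = expected_annual_failures(1000, 0.001)
high = expected_annual_failures(1000, 0.002)
print(f"expect between {low:.0f} and {high:.0f} failures per year")
```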

Amazon EBS offers a number of different [volume types](https://aws.amazon.com/ebs/features/#Amazon_EBS_volume_types). For SAP database-related data, General Purpose SSD (gp2) or Provisioned IOPS SSD (io1) volumes must be used. Your throughput and IOPS requirements determine whether gp2 or io1 is needed.

Amazon EBS Multi-Attach enables you to attach a single Provisioned IOPS SSD (io1) volume to up to 16 [AWS Nitro-based instances](https://aws.amazon.com/ec2/nitro) that are in the same Availability Zone. You can attach multiple Multi-Attach enabled volumes to an instance or set of instances. Each instance to which the volume is attached has full read and write permission to the shared volume. Multi-Attach enabled volumes do not support I/O fencing. I/O fencing protocols control write access in a shared storage environment to maintain data consistency. Your applications must provide write ordering for the attached instances to maintain data consistency.

#### Amazon EBS snapshots
<a name="arch-guide-ebs-snapshots"></a>

You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time [snapshots](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html). Snapshots are *incremental* backups, which means that only the blocks on the device that have changed after your most recent snapshot are saved. This minimizes the time required to create the snapshot and saves on storage costs by not duplicating data. When you delete a snapshot, only the data unique to that snapshot is removed. Each snapshot contains all of the information that is needed to restore your data (from the moment when the snapshot was taken) to a new Amazon EBS volume.
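
The incremental behavior can be modeled in a few lines. The block dictionaries below are a toy illustration of the bookkeeping, not the actual EBS implementation:

```python
def restore(snapshot_chain):
    """Restore a full volume by layering snapshots oldest to newest."""
    volume = {}
    for snap in snapshot_chain:
        volume.update(snap)
    return volume

def incremental_snapshot(volume_blocks, snapshot_chain):
    """Store only the blocks that changed since the last snapshot."""
    baseline = restore(snapshot_chain)
    return {i: d for i, d in volume_blocks.items() if baseline.get(i) != d}

# A 4-block volume: full first snapshot, then one changed block.
v1 = {0: "aaa", 1: "bbb", 2: "ccc", 3: "ddd"}
s1 = incremental_snapshot(v1, [])      # first snapshot stores all 4 blocks
v2 = {**v1, 2: "CCC"}                  # one block changes
s2 = incremental_snapshot(v2, [s1])    # second snapshot stores only block 2
print(len(s1), len(s2), restore([s1, s2]) == v2)  # prints: 4 1 True
```

The second snapshot stores a single block, yet restoring it still yields the complete volume, which is the property the paragraph above describes.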

Amazon EBS Snapshots can be [copied](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-copy-snapshot.html) (replicated) to a different Region and/or shared with a different AWS Account.

Copying Snapshots across Regions incurs the following [costs](https://aws.amazon.com/ebs/pricing):
+ Data Transfer charges for the data transferred between the first and second AWS Regions
+ Amazon EBS Snapshot charges for the data stored in Amazon S3 in the two different AWS Regions

#### Restoring snapshots
<a name="arch-guide-restoring-snapshots"></a>

New volumes created from existing Amazon EBS snapshots load lazily in the background. This means that after a volume is created from a snapshot, there is no need to wait for all of the data to transfer from Amazon S3 to your Amazon EBS volume before your attached instance can start accessing the volume and all its data.

However, storage blocks must be pulled down from Amazon S3 and written to the volume before you can access them. This preliminary action takes time and can significantly increase the latency of I/O operations the first time each block is accessed. If your instance accesses data that hasn’t yet been loaded, the volume immediately downloads the requested data from Amazon S3, and then continues loading the rest of the volume data in the background.

#### Fast snapshot restore
<a name="arch-guide-fast-snapshot-restore"></a>

Amazon EBS [fast snapshot restore](https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/ebs-fast-snapshot-restore.html) enables you to create a volume from a snapshot that is fully-initialized at creation. This eliminates the latency of I/O operations on a block when it is accessed for the first time. Volumes created using fast snapshot restore instantly deliver all of their provisioned performance. To use fast snapshot restore, enable it for specific snapshots in specific Availability Zones. Fast Snapshot Restore is [charged](https://aws.amazon.com/ebs/pricing) in Data Services Unit-Hours (DSUs) for each zone in which it is enabled. DSUs are billed per minute with a 1-hour minimum.
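
The DSU billing rule above (billed per minute with a 1-hour minimum, per snapshot per Availability Zone) can be sketched as follows; the $0.75 per DSU-hour rate is a placeholder, not a current price:

```python
def fsr_dsu_cost(minutes_enabled, az_count, dsu_hour_rate):
    """Fast snapshot restore billing sketch.

    DSUs are billed per minute with a 1-hour minimum, for each
    Availability Zone in which the feature is enabled. The rate is a
    placeholder -- see the Amazon EBS pricing page for current values.
    """
    billable_minutes = max(minutes_enabled, 60)  # 1-hour minimum
    return az_count * (billable_minutes / 60) * dsu_hour_rate

# Hypothetical: enabled for 90 minutes in 2 AZs at $0.75 per DSU-hour.
print(f"${fsr_dsu_cost(90, 2, 0.75):.2f}")  # prints: $2.25
```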

### File storage
<a name="arch-guide-file-storage"></a>

#### Amazon EFS
<a name="arch-guide-amazon-efs"></a>

 [Amazon Elastic File System](https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html) (Amazon EFS) provides scalable NFS version 4 based file storage for use with Linux-based Amazon EC2 instances (Windows-based Amazon EC2 instances do not support Amazon EFS). The service is designed to be highly scalable, available, and durable. Amazon EFS file systems store data and metadata across multiple Availability Zones in an AWS Region. Amazon EFS offers an [SLA](https://aws.amazon.com/efs/sla) of 99.99%.

Amazon EFS file systems can be shared across [accounts and VPCs](https://docs.aws.amazon.com/efs/latest/ug/manage-fs-access-vpc-peering.html) within the same Region or a different Region, enabling Amazon EFS to be an ideal choice for SAP global file system (/sapmnt) and SAP transport directory (/usr/sap/trans).

 [AWS DataSync](https://aws.amazon.com/datasync) supports [Amazon EFS to Amazon EFS transfer](https://aws.amazon.com/about-aws/whats-new/2019/05/aws-datasync-now-supports-efs-to-efs-transfer) between Regions and different AWS Accounts, allowing the replication of key SAP file based data across Regions. [AWS Backup](https://docs.aws.amazon.com/aws-backup/latest/devguide/how-it-works-cross-region-replication.html) can also be used to replicate backups of Amazon EFS file systems across Regions.

#### Amazon FSx
<a name="arch-guide-amazon-fsx"></a>

 [Amazon FSx for Windows File Server](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/getting-started.html) provides fully managed Microsoft Windows file servers, backed by a fully native Windows file system. Amazon FSx offers an [SLA](https://aws.amazon.com/fsx/sla) of 99.9% and supports both Single-AZ and Multi-AZ File Systems.

With Single-AZ file systems, Amazon FSx automatically replicates your data within an Availability Zone, continuously monitors for hardware failures, and automatically replaces infrastructure components in the event of a failure. Amazon FSx also takes highly durable daily backups of your file system using the Windows Volume Shadow Copy Service and stores them in Amazon S3. You can take additional backups at any point.

Multi-AZ file systems support all the availability and durability features of Single-AZ file systems. In addition, they are designed to provide continuous availability to data, even when an Availability Zone is unavailable. In a Multi-AZ deployment, Amazon FSx automatically provisions and maintains a standby file server in a different zone. Any changes written to disk in your file system are synchronously replicated across Availability Zones to the standby.

Amazon FSx File systems can be [shared across Accounts and VPCs](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/supported-fsx-clients.html) within the same Region or a different Region, enabling Amazon FSx to be used not only for the SAP Global File System but also the SAP Transport Directory.

Additionally, Amazon FSx can also be used for providing [Continuously Available (CA) File Shares for Microsoft SQL Server](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/sql-server.html).

## Monitoring and audit
<a name="arch-guide-monitoring-and-audit"></a>

### Amazon CloudWatch
<a name="arch-guide-amazon-cloudwatch"></a>

Amazon CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events, providing you with a unified view of AWS resources, applications, and services that run on AWS and on-premises servers. You can use CloudWatch to detect anomalous behavior in your environments, set alarms, visualize logs and metrics side by side, take automated actions, troubleshoot issues, and discover insights to keep your applications running smoothly.

### AWS CloudTrail
<a name="arch-guide-aws-cloudtrail"></a>

 AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. CloudTrail provides event history of your AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. This event history simplifies security analysis, resource change tracking, and troubleshooting. In addition, you can use CloudTrail to detect unusual activity in your AWS accounts. These capabilities help simplify operational analysis and troubleshooting.

# Architecture patterns
<a name="arch-guide-architecture-patterns"></a>

In this section, we elaborate on the architecture patterns that you can select based on your availability and recovery requirements. We also analyze failure scenarios that can help you select the right pattern for your SAP system(s).

# Failure scenarios
<a name="arch-guide-failure-scenarios"></a>

For the failure scenarios below, the primary consideration is the physical unavailability of the compute and/or storage capacity within the Availability Zones.

## Availability Zone failure
<a name="arch-guide-availability-zone-failure"></a>

An Availability Zone failure can be caused by a significant availability degradation of one or more AWS services utilized by your resources within that Availability Zone. For example:
+ Several Amazon EC2 instances have failed with [System Status Check errors](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html) or are unreachable and cannot be restarted.
+ Several Amazon Elastic Block Store (Amazon EBS) volumes have failed with [Volume Status Check errors](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-volume-status.html#monitoring-volume-checks).

## Amazon Elastic Block Store failure
<a name="arch-guide-amazon-elastic-block-store-failure"></a>

Loss of one or more Amazon EBS volumes attached to a single Amazon EC2 instance may result in the unavailability of a critical component (for example, the database) of the SAP system.

## Amazon EC2 failure
<a name="arch-guide-ec2-failure"></a>

Loss of a single Amazon EC2 instance may result in the unavailability of a critical component (for example, the database or SAP Central Services) of the SAP system.

## Logical data loss
<a name="arch-guide-logical-data-loss"></a>

You should also consider the potential for logical data loss where the underlying hardware capacity still exists but the primary copies of the data have been corrupted or lost. This data loss could be due to malicious activity within your AWS account or due to human error.

To protect against logical data loss, we recommend backing up regular copies of the data to an Amazon S3 bucket that is replicated (using [same-Region or cross-Region replication](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-s3-replication)) to another Amazon S3 bucket owned by a separate AWS account. With the appropriate AWS Identity and Access Management (IAM) controls between the two AWS accounts, this strategy ensures that not all copies of the data are lost due to malicious activity or human error.

# Patterns
<a name="arch-guide-patterns"></a>

In this section, we examine the architecture patterns available to handle the failure scenarios detailed above.

There are two key parameters to consider when selecting a pattern to meet your organization’s specific business requirements:
+ Availability of compute for the SAP single points of failure
+ Availability of the SAP data persisted on Amazon EBS

These parameters determine the time taken to recover from a failure scenario, that is, the time taken by your SAP system to return to service.

 *Types of architecture patterns* 

The architecture patterns are grouped into single Region and multi-Region patterns. The distinguishing factor is whether:

1. You require the data to reside only in a specific geographical location (AWS Region) at all times (for example, data residency requirements).

   or

1. You require the data to reside in two specific geographical locations (AWS Regions) at all times (for example, two copies of SAP data must reside at least 500 miles apart for compliance).

If your production systems are critical to your business and you require minimal downtime in the event of failure, you should select a multi-AZ pattern to ensure that your production systems are highly available at all times. When deploying a multi-AZ pattern, you can benefit from using an automated approach (such as a cluster solution) for failover between Availability Zones to minimize the overall downtime and remove the need for human intervention. Multi-Region patterns provide not only high availability but also disaster recovery, thereby lowering overall costs.

# Single Region architecture patterns
<a name="arch-guide-single-region-architecture-patterns"></a>

Select a single Region pattern if:
+ You require the data to reside only in a specific geographical Region (AWS Region) at all times
+ You want to avoid the [potential network latency](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-multi-region-considerations) considerations associated with a Multi-Region approach
+ You want to avoid the cost implications or differences associated with a Multi-Region approach including:
  +  [AWS service pricing in different AWS Regions](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-selecting-the-aws-regions) 
  +  [Cross-Region data transfer costs](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-cross-regional-data-transfer) 

## Pattern 1: A single Region with two AZs for production
<a name="arch-guide-pattern-1-a-single-region-with-a-single-az-for-production"></a>

 **Figure 7: A single Region with two Availability Zones for production** 

![\[A single Region with two Availability Zones for production\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-1.png)


In this pattern, you deploy all your production systems across two Availability Zones. The compute deployed for the production SAP database and central services tiers is the same size in both Availability Zones, with automated failover in the event of a zone failure. The compute required for the SAP application tier is split 50/50 between the two zones. Your non-production systems are **not** sized equivalently to production and are deployed in the same zones or a different Availability Zone within the Region.
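
The 50/50 application-tier split and the post-failure scale-out described above amount to simple capacity arithmetic, sketched here with illustrative server counts:

```python
import math

def app_tier_plan(required_servers, az_count=2):
    """Sketch of the 50/50 application-tier split.

    Returns the server count per AZ in normal operation and the number
    of additional servers to launch in the surviving AZ after an
    Availability Zone failure. Illustrative arithmetic only.
    """
    per_az = math.ceil(required_servers / az_count)
    scale_out_after_failure = required_servers - per_az
    return per_az, scale_out_after_failure

per_az, extra = app_tier_plan(8)  # hypothetical 8-server application tier
print(f"{per_az} servers per AZ; launch {extra} more in the surviving AZ")
```

The `extra` figure is what drives the "variable time duration" consideration below: returning to 100% capacity depends on launching that many instances in the remaining zone.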

 **Select this pattern if:** 
+ You require a defined time window to complete recovery of production as well as assurance of the availability of compute capacity in another Availability Zone for the production SAP database and central services tiers.
+ You can accept the additional cost of deploying the required compute and storage for production SAP database and central services tiers across two Availability Zones.
+ Your non-production environment is not of equivalent size to production and therefore cannot be used as sacrificial capacity for production in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ You can accept data replication across Availability Zones (database replication capability or a block level replication solution required) and the associated cost.
+ You can accept that automated fail over between Availability Zones requires a third-party cluster solution.
+ You can accept the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to return the application tier to 100% capacity in the event of a zone failure.

 **Key design principles** 
+ 100% compute capacity deployed in Availability Zone 1 and Availability Zone 2 for production SAP database and central services tiers.
+ Compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production application tier (Active/Active). In the event of an Availability Zone failure, the application tier needs to be scaled to return to 100% capacity within the remaining zone.
+ The SAP Database is persisted on Amazon EBS in two Availability Zones using either a database replication capability or a block level replication solution.
+ Amazon EC2 auto recovery is configured for all instances to protect against underlying hardware failure, with the exception of instances protected by a third-party cluster solution.
+ Amazon EFS is used for the SAP Global File Systems.
+ SAP Database is backed up regularly to Amazon S3.
+ Amazon S3 [same-Region replication](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-s3-replication) is configured to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).
+ Amazon Machine Image/Amazon EBS Snapshots are taken for all servers on a regular basis.

 **Benefits** 
+ Low Mean Time to Recovery (MTTR)
+ Predictable Return to Service (RTS)
+ Ability to protect against significant degradation or total Availability Zone failure through fail over of database and central services tiers to Availability Zone 2
+ No requirement to restore data from Amazon S3 in the event of an Availability Zone or Amazon EBS failure

 **Considerations** 
+ Well documented and tested processes are required for the automated fail over between Availability Zones.
+ Well documented and tested processes are required for maintaining the automated fail over solution.
+ Well documented and tested processes are required for scaling the AWS resources to return the application tier to required capacity in the event of an Availability Zone failure or significant Amazon EC2 service degradation.

## Pattern 2: A single Region with two AZs for production and production sized non-production in a third AZ
<a name="arch-guide-pattern-2-a-single-region-with-one-az-for-production-and-another-az-for-non-production"></a>

 **Figure 8: A single Region with two Availability Zones for production and production sized non-production in a third Availability Zone** 

![\[A single Region with two Availability Zones for production and production sized non-production in a third Availability Zone\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-2.png)


In this pattern, you deploy all your production systems across two Availability Zones. The compute deployed for the production SAP database and central services tiers is the same size in both Availability Zones, with automated failover in the event of a zone failure. The compute required for the SAP application tier is split 50/50 between the two Availability Zones. Your non-production systems are sized equivalently to production and are deployed in a third Availability Zone. In the event of a failure of an Availability Zone where your production systems are deployed, the non-production capacity is reallocated to enable production to be returned to a Multi-AZ pattern.

 **Select this pattern if:** 
+ You require the ability to continue to have a Multi-AZ configuration for production in the event of an Availability Zone failure within the Region.
+ You require a defined time window to complete recovery of production and assurance of the availability of the compute capacity in another Availability Zone for the production SAP database and central services tiers.
+ You can accept the additional cost of deploying the required compute and storage for production SAP database and central services tiers across two Availability Zones.
+ You can accept data replication across Availability Zones (database replication capability or a block level replication solution required) and the associated cost.
+ You can accept that automated fail over between Availability Zones requires a third-party cluster solution.
+ You can accept the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to return the application tier to 100% capacity in the event of an Availability Zone failure.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production SAP database and central services tiers.
+ 100% production compute capacity (database and central services) is deployed in the third Availability Zone for use by non-production in normal operations.
+ Compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production application tier (Active/Active). In the event of an Availability Zone failure, the application tier needs to be scaled to return to 100% capacity within the remaining zone.
+  [Amazon EC2 auto recovery](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-ec2-auto-recovery) is configured for all instances to protect against underlying hardware failure, with the exception of instances protected by a third-party cluster solution.
+ The SAP Database is persisted on Amazon EBS in two Availability Zones using either a database replication capability or a block level replication solution.
+ Amazon EFS is used for the SAP Global File Systems.
+ SAP Database is backed up regularly to Amazon S3.
+ Amazon S3 [single-Region replication](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-s3-replication) is configured to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).
+ Amazon Machine Image/Amazon EBS Snapshots for all servers are taken on a regular basis.
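
The S3 single-Region replication called out in the principles above can be expressed as a standard S3 replication configuration document. The sketch below builds one in plain Python; the bucket names, role ARN, and `backups/` prefix are placeholders, not values from this guide. A document like this would be passed to the S3 `PutBucketReplication` API (for example, via the AWS CLI or an SDK).

```python
# Sketch: S3 replication configuration document for same-Region replication,
# in the shape accepted by the S3 PutBucketReplication API.
# Role ARN, bucket names, and prefix are placeholders.
replication_config = {
    "Role": "arn:aws:iam::111122223333:role/s3-replication-role",  # placeholder
    "Rules": [
        {
            "ID": "sap-backup-srr",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {"Prefix": "backups/"},  # replicate database backups only
            "DeleteMarkerReplication": {"Status": "Disabled"},  # deletes not propagated
            "Destination": {
                "Bucket": "arn:aws:s3:::sap-backup-replica",  # placeholder, same Region
                "StorageClass": "STANDARD_IA",
            },
        }
    ],
}

def validate(config):
    """Basic structural checks mirroring what the API requires."""
    assert config["Role"].startswith("arn:aws:iam::")
    for rule in config["Rules"]:
        assert rule["Status"] in ("Enabled", "Disabled")
        assert rule["Destination"]["Bucket"].startswith("arn:aws:s3:::")
    return True

print(validate(replication_config))
```

Because replication does not propagate deletes here, accidental or malicious deletion in the source bucket leaves the replica intact, which is the logical-data-loss protection these patterns rely on.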

 **Benefits** 
+ Low Mean Time to Recovery (MTTR)
+ Predictable Return to Service (RTS)
+ Ability to protect against significant degradation or total Availability Zone failure through fail over of database and central services tiers to Availability Zone 2
+ No requirement to restore data from Amazon S3 in the event of an Availability Zone failure or Amazon EBS failure
+ Option for data to be persisted on Amazon EBS in three different Availability Zones, dependent on capabilities of database or block level replication solution
+ Use of non-production compute capacity to return production to running across two Availability Zones in the event of a significant degradation or total Availability Zone failure

 **Considerations** 
+ Well documented and tested processes are required for the automated fail over between Availability Zones.
+ Well documented and tested processes are required for maintaining the automated fail over solution.
+ Well documented and tested processes are required for scaling the AWS resources to return the application tier to required capacity in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ Well documented and tested processes are required for re-allocating the compute capacity from non-production to return production to run across two Availability Zones in the event of an Availability Zone failure impacting production.

## Pattern 3: A single Region with one AZ for production and another AZ for non-production
<a name="arch-guide-pattern-3-a-single-region-with-two-azs-for-production"></a>

 **Figure 9: A single Region with one Availability Zone for production and another Availability Zone for non-production** 

![\[A single Region with one Availability Zone for production and another Availability Zone for non-production\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-3.png)


In this pattern, you deploy all your production systems in one Availability Zone and all your non-production systems in another Availability Zone. Your non-production systems are equivalent in size to your production systems.

 **Select this pattern if:** 
+ You require a defined time window to complete recovery of production and assurance of the availability of compute capacity in another Availability Zone for the SAP database and central services tiers.
+ You can accept the additional time required to re-allocate compute capacity from non-production to production as part of the overall time window to recover production.
+ You can accept the time required to restore data to Amazon EBS from Amazon S3 in another Availability Zone as part of the overall time window to recover production.
+ You can accept the variable time duration required to return the application tier to 100% capacity following an Availability Zone failure (including any delay in availability of the required compute capacity in the remaining Availability Zones).
+ You can accept a period of time where there is only one set of computes deployed for the production SAP database and central services tiers in the event of an Availability Zone failure or significant Amazon EC2 service degradation.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 for production SAP database and central services tiers.
+ 100% compute capacity is deployed in Availability Zone 1 for production SAP application tier.
+ 100% of production compute capacity (SAP database and central services) is deployed in Availability Zone 2 for use by non-production in normal operations.
+ Amazon EC2 auto recovery is configured for all instances to protect against underlying hardware failure.
+ The SAP database is persisted on Amazon EBS in a single Availability Zone only and not replicated on another Availability Zone.
+ Amazon EFS is used for the SAP Global File Systems.
+ SAP database data is backed up regularly to Amazon S3.
+ Amazon S3 single-Region replication is configured to protect against logical data loss.
+ Amazon Machine Image/Amazon EBS Snapshots are taken for all servers on a regular basis.
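
The Amazon EC2 auto recovery principle above is commonly implemented as a CloudWatch alarm on the `StatusCheckFailed_System` metric whose action is the EC2 `recover` automate ARN. The sketch below builds the alarm parameters in plain Python; the instance ID, Region, and thresholds are illustrative assumptions. A dict like this would be passed to the CloudWatch `PutMetricAlarm` API.

```python
# Sketch: CloudWatch alarm parameters that trigger Amazon EC2 auto recovery
# when the system status check fails (underlying host/hardware failure).
# Instance ID, Region, and evaluation settings are placeholders.
def recovery_alarm_params(instance_id, region="us-east-1"):
    return {
        "AlarmName": f"ec2-auto-recover-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",  # system (host-level) check only
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 3,  # three consecutive failed minutes
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The "recover" automate action migrates the instance to healthy hardware,
        # preserving instance ID, private IPs, and attached EBS volumes.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }

params = recovery_alarm_params("i-0123456789abcdef0")
print(params["AlarmActions"][0])
```

In patterns that use a third-party cluster solution, instances managed by the cluster are excluded from this alarm-based recovery so the two mechanisms do not conflict.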

 **Benefits** 
+ Cost optimized through use of non-production capacity in the event of production Availability Zone failure
+ Required compute capacity deployed in two Availability Zones to allow a more predictable recovery time duration

 **Considerations** 
+ Well documented and tested processes for re-allocating the required compute capacity from non-production to production and restoring the data in a different Availability Zone are required to ensure recoverability.
+ There may be loss of non-production environments in the event of an Availability Zone failure impacting production.
+ Due to the lack of high availability across two Availability Zones, the time required to recover production in the event of compute failure or Availability Zone failure increases.

## Pattern 4: A single Region with a single AZ for production
<a name="arch-guide-pattern-4-a-single-region-with-two-azs-for-production-and-production-sized-non-production-in-a-3rd-az"></a>

 **Figure 10: A single Region with a single Availability Zone for production** 

![\[A single Region with a single Availability Zone for production\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-4.png)


In this pattern, you deploy all your production systems in one Availability Zone and all your non-production systems in either the same Availability Zone or another Availability Zone. Your non-production systems are **not** equivalent in size to your production systems.

 **Select this pattern if:** 
+ In the event of an Availability Zone failure or significant Amazon EC2 service degradation, you can accept the risks related to the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to re-create the AWS resources in a different Availability Zone and restore the persistent data to Amazon EBS.
+ You want to avoid the cost implications with a Multi-AZ approach and accept the related risks of downtime of your production SAP systems.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 for production SAP database and central services tiers.
+ 100% compute capacity is deployed in Availability Zone 1 for production SAP application tier.
+ Amazon EC2 auto recovery is configured for all instances to protect against underlying hardware failure.
+ Deployed non-production compute capacity is less than 100% the compute capacity deployed for production SAP database and central services tiers.
+ The SAP database is persisted on Amazon EBS in a single Availability Zone only and not replicated on another Availability Zone.
+ Amazon EFS is used for the SAP Global File Systems.
+ SAP Database is backed up regularly to Amazon S3.
+ Amazon S3 single-Region replication is configured to protect against logical data loss.
+ Amazon Machine Image/Amazon EBS Snapshots for all servers are taken on a regular basis.

 **Benefits** 
+ Lowest cost
+ Simplest design
+ Simplest operation

 **Considerations** 
+ Well documented and tested processes for scaling the AWS resources and restoring data in a different Availability Zone are required to ensure recoverability.
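
The recovery process behind this consideration is dominated by the time to restore the database from Amazon S3 onto new Amazon EBS volumes in another Availability Zone. A rough, illustrative calculation of the data-copy time alone (the figures are assumptions, not AWS service commitments):

```python
# Sketch: back-of-envelope recovery-time arithmetic for a single-AZ pattern,
# where the database must be restored from an Amazon S3 backup.
# Database size and sustained throughput are illustrative assumptions.
def restore_hours(db_size_gib, effective_throughput_mibps):
    """Hours to stream db_size_gib at a sustained effective MiB/s rate."""
    seconds = (db_size_gib * 1024) / effective_throughput_mibps
    return seconds / 3600

# Example: a 2 TiB database at a sustained 500 MiB/s effective restore rate.
hours = restore_hours(2048, 500)
print(round(hours, 1))  # → 1.2
```

On top of the data copy, add time to launch replacement instances from AMIs, replay database logs, and validate the system, which is why this pattern trades the lowest cost for the longest and least predictable recovery.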

# Multi-Region Architecture Patterns
<a name="arch-guide-multi-region-architecture-patterns"></a>

You should select a multi-Region architecture if the following apply:
+ You require the data to reside in two specific geographical AWS Regions at all times.
+ You can accept the potential network latency considerations associated with a multi-Region approach.
+ You can accept the increased complexity associated with a multi-Region approach.
+ You can accept the cost implications and differences associated with a multi-Region approach, including:
  +  [AWS service pricing (e.g. Amazon EC2)](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-selecting-the-aws-regions) in different AWS Regions
  +  [Cross-Region data transfer costs](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-cross-regional-data-transfer) 
  + Additional compute and/or storage costs in the secondary Region

## Pattern 5: A primary Region with two AZs for production and a secondary Region containing a replica of backups/AMIs
<a name="arch-guide-pattern-5-a-primary-region-with-one-az-for-production-and-secondary-region-containing-a-replica-of-backups-amis"></a>

 **Figure 11: A primary Region with two Availability Zones for production and a secondary Region containing a replica of backups/AMIs** 

![\[A primary Region with two Availability Zones for production and a secondary Region containing a replica of backups/AMIs\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-5.png)


In this pattern, you deploy your production system across two Availability Zones in the primary Region. The compute deployed for the production SAP database and central services tiers is the same size in both Availability Zones, with automated fail over in the event of an Availability Zone failure. The compute required for the SAP application tier is split 50/50 between the two Availability Zones. Additionally, the production database backups stored in Amazon S3, Amazon EBS Snapshots, and Amazon Machine Images are replicated to the secondary Region. In the event of a complete Region failure, the production systems would be restored from the last set of backups in the secondary Region.

 **Select this pattern if:** 
+ You require a defined time window to complete recovery of production and assurance of the availability of compute capacity in another Availability Zone within the primary Region for the production SAP database and central services tiers.
+ You can accept the additional cost of deploying the required compute and storage for production SAP database and central services tiers across two Availability Zones within the primary Region.
+ You can accept the cross-Availability Zone related data transfer costs for data replication.
+ You can accept that automated fail over between Availability Zones requires a third-party cluster solution.
+ You can allow for a period of time where there is only one set of computes deployed for the SAP database and central services in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ You can accept that data replication across Availability Zones requires either a database replication capability or a block level replication solution.
+ You can accept the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to return the application tier to 100% capacity.
+ You can accept the variable time duration required to complete recovery of production in the event of a Region failure.
+ You can accept the increased complexity and costs associated with a multi-Region approach.
+ You can accept that manual actions are required to restore production in the second Region.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production SAP database and central services tiers.
+ Compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production SAP application tier (Active/Active) and needs to be scaled in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+  [Amazon EC2 auto recovery](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-ec2-auto-recovery) is configured for all instances to protect against underlying hardware failure with the exception of instances protected by a third-party cluster solution.
+ The SAP database-related data on Amazon EBS is replicated between Availability Zones using either a database replication capability or a block level replication solution.
+ Amazon EFS is used for the SAP Global File Systems and is replicated to the secondary Region.
+ SAP Database data is backed up regularly to Amazon S3.
+ Amazon Machine Image/Amazon EBS Snapshots are taken for all servers on a regular basis.
+ Amazon S3 data (database backups), Amazon EBS Snapshots, and Amazon Machine Images are replicated/copied to a secondary Region to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).
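
The cross-Region replication of AMIs and EBS Snapshots in the principles above maps to the EC2 `CopyImage` and `CopySnapshot` APIs, which are called against the secondary (destination) Region. The sketch below builds the request parameters in plain Python; the resource IDs and Region names are placeholders.

```python
# Sketch: request parameters for copying an AMI and an EBS snapshot to the
# secondary Region, in the shape of the EC2 CopyImage / CopySnapshot APIs.
# IDs and Region names are placeholders.
def copy_image_params(image_id, source_region, name):
    return {
        "SourceImageId": image_id,
        "SourceRegion": source_region,
        "Name": name,
        "Encrypted": True,  # re-encrypt in the destination Region
    }

def copy_snapshot_params(snapshot_id, source_region, description):
    return {
        "SourceSnapshotId": snapshot_id,
        "SourceRegion": source_region,
        "Description": description,
        "Encrypted": True,
    }

# Both requests are issued against the *destination* (secondary) Region endpoint.
img = copy_image_params("ami-0123456789abcdef0", "eu-west-1", "sap-db-weekly")
snap = copy_snapshot_params("snap-0123456789abcdef0", "eu-west-1", "sap-db-data")
print(img["SourceRegion"], snap["Encrypted"])
```

Scheduling these copies (for example, after each backup cycle) and pruning old copies in the secondary Region keeps the recovery point consistent with the defined recovery objectives.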

 **Benefits** 
+ Low Mean Time to Recovery (MTTR) in the event of Amazon EC2 or Availability Zone failure
+ Predictable Return to Service (RTS) in the event of Amazon EC2 or Availability Zone failure
+ Database-related data persisted on different sets of Amazon EBS volumes in two Availability Zones via database replication capability or a block level replication solution
+ Required compute capacity deployed in two Availability Zones in primary Region
+ No dependency on restoring data from Amazon S3 in the event of an Availability Zone failure in the primary Region
+ Ability to protect against significant degradation or total Availability Zone failure through fail over of the database and central services tiers to Availability Zone 2
+ Ability to protect against significant degradation or total Region failure through fail over to secondary Region

 **Considerations** 
+ Well documented and tested processes are required for the automated fail over between Availability Zones.
+ Well documented and tested processes are required for maintaining the automated fail over solution.
+ Well documented and tested processes are required for scaling the AWS resources to return the application tier to full capacity in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ Well documented and tested processes are required for scaling the AWS resources, restoring the data, and moving production to the secondary Region.
+ Higher network latency from your on-premises locations to the secondary AWS Region may impact end user performance.

## Pattern 6: A primary Region with two AZs for production and a secondary Region with compute and storage capacity deployed in a single AZ
<a name="arch-guide-pattern-6-a-primary-region-with-two-azs-for-production-and-secondary-region-containing-a-replica-of-backups-amis"></a>

 **Figure 12: A primary Region with two Availability Zones for production and a secondary Region with compute and storage capacity deployed in a single Availability Zone** 

![\[A primary Region with two Availability Zones for production and a secondary Region with compute and storage capacity deployed in a single Availability Zone\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-6.png)


In this pattern, you deploy all of your production systems across two Availability Zones in the primary Region. The compute deployed for the production SAP database and central services tiers is the same size in both Availability Zones, with automated fail over in the event of an Availability Zone failure. The compute required for the SAP application tier is split 50/50 between the two Availability Zones. Your non-production systems are **not** equivalent in size to your production systems and are deployed in a different Availability Zone within the Region. Additionally, compute capacity is deployed in Availability Zone 1 in the secondary Region for the production SAP database and central services tiers. The production database is replicated to the secondary Region using a database replication capability or a block level replication solution.

The Production database backups stored in Amazon S3, Amazon EBS Snapshots, and Amazon Machine Images are replicated to the secondary Region. In the event of a complete Region failure, the production systems would be restored in the secondary Region using the replicated data for the database tier and the last set of backups for the SAP central services and application tiers.

 **Select this pattern if:** 
+ You require a defined time window to complete recovery of production and assurance of the availability of compute capacity in another Availability Zone within the primary Region for the production SAP database and central services tiers.
+ You can accept the additional cost of deploying the required compute and storage for production SAP database and central services tiers across two Availability Zones within the primary Region.
+ You can accept the cross-Availability Zone related data transfer costs for data replication.
+ You can accept that automated fail over between Availability Zones requires a third-party cluster solution.
+ You can allow for a period of time where there is only one set of computes deployed for the SAP database and central services in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ You can accept that data replication across Availability Zones of the database-related data requires either a database replication capability or a block level replication solution.
+ You can accept the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to return the application tier to 100% capacity.
+ You require a defined time window to complete recovery of production in the event of a Region failure.
+ You can accept the increased complexity and costs associated with a multi-Region approach.
+ You require assurance of availability of compute capacity in a single Availability Zone in the secondary Region for the production SAP database and central services tiers.
+ You can accept the increased cost of deploying the required compute and storage for production SAP database and central services tiers in one Availability Zone in the secondary Region.
+ You can accept that manual actions are required to fail over between Regions.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production SAP database and central services tiers.
+ 100% compute capacity is deployed in Availability Zone 1 in the secondary Region for production SAP database and central services tiers.
+ Compute capacity is deployed in Availability Zone 1 and Availability Zone 2 for production SAP application tier (Active/Active) and needs to be scaled in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+  [Amazon EC2 auto recovery](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-ec2-auto-recovery) is configured for all instances to protect against underlying hardware failure with the exception of those instances protected by a third-party cluster solution.
+ The database-related data on Amazon EBS is replicated between Availability Zones using either a database replication capability or a block level replication solution.
+ The SAP database-related data on Amazon EBS is replicated between Regions using either a database replication capability or a block level replication solution.
+ Amazon EFS is used for the SAP Global File Systems and replicated to the secondary Region.
+ SAP database data is backed up regularly to Amazon S3.
+ Amazon Machine Image/Amazon EBS Snapshots are taken for all servers on a regular basis.
+ Amazon S3 data (database backups), Amazon EBS Snapshots, and Amazon Machine Images are replicated/copied to a secondary Region to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).

 **Benefits** 
+ Low Mean Time to Recovery (MTTR) in the event of an Amazon EC2, Availability Zone or Region failure
+ Predictable Return to Service (RTS)
+ Database-related data persisted on different sets of Amazon EBS volumes in two Availability Zones in primary Region and one set of volumes in an Availability Zone in secondary Region via database replication capability or a block level replication solution
+ Required compute capacity deployed in two Availability Zones in primary Region and one Availability Zone in secondary Region
+ No dependency on restoring data from Amazon S3 in the event of an Availability Zone failure or Region failure
+ Ability to protect against significant degradation or total Availability Zone failure through fail over of the database and central services tiers to Availability Zone 2
+ Ability to protect against significant degradation or total Region failure through fail over to secondary Region

 **Considerations** 
+ Well documented and tested processes are required for the automated fail over between Availability Zones.
+ Well documented and tested processes are required for maintaining the automated fail over solution.
+ Well documented and tested processes are required for scaling the AWS resources to return the application tier to full capacity in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ Well documented and tested processes are required for moving production to the secondary Region.
+ Higher network latency from your on-premises locations to the secondary AWS Region may impact end user performance.
+ There is an overhead of maintaining the same software version and patch levels (OS, Database, SAP) across two different Regions.

## Pattern 7: A primary Region with two AZs for production and a secondary Region with compute and storage capacity deployed and data replication across two AZs
<a name="arch-guide-pattern-7-a-primary-region-with-two-azs-for-production-and-secondary-region-with-compute-and-storage-capacity-deployed-in-a-single-az"></a>

 **Figure 13: A primary Region with two Availability Zones for production and a secondary Region with compute and storage capacity deployed and data replication across two Availability Zones** 

![\[A primary Region with two Availability Zones for production and a secondary Region with compute and storage capacity deployed and data replication across two Availability Zones\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-7.png)


In this pattern, you deploy all of your production systems across two Availability Zones in the primary Region. The compute deployed for the production SAP database and central services tiers is the same size in both Availability Zones, with automated fail over in the event of an Availability Zone failure. The compute required for the SAP application tier is split 50/50 between the two Availability Zones. Additionally, you have compute capacity deployed in Availability Zone 1 and Availability Zone 2 in the secondary Region for the production SAP database and central services tiers, and the production database is replicated to the secondary Region using either a database replication capability or a block level replication solution. The production database backups stored in Amazon S3, Amazon EBS Snapshots, and Amazon Machine Images are replicated to the secondary Region. In the event of a complete Region failure, the production systems would be moved over to the secondary Region manually.

 **Select this pattern if:** 
+ You require a defined time window to complete recovery of production and assurance of the availability of compute capacity in another Availability Zone within the primary Region for the production SAP database and central services tiers.
+ You can accept the additional cost of deploying the required compute and storage for production SAP database and central services tiers across two Availability Zones within the primary Region.
+ You can allow for a period of time where there is only one set of computes deployed for the SAP database and central services in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ You can accept that data replication across Availability Zones of the database-related data requires either a database replication capability or a block level replication solution.
+ You can accept the cross-Availability Zone related data transfer costs for data replication.
+ You can accept that automated fail over between Availability Zones requires a third-party cluster solution.
+ You can accept the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to return the application tier to 100% capacity.
+ You require a defined time window to complete recovery of production in the event of a Region failure.
+ You require assurance of availability of compute capacity in two Availability Zones in the secondary Region for the production SAP database and central services tiers.
+ You can accept the additional cost of deploying the required compute and storage for production SAP database and central services tiers across two Availability Zones in the secondary Region.
+ You can accept the increased complexity and costs associated with a multi-Region approach.
+ You can accept that manual actions are required to fail over between Regions.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 and Availability Zone 2 in the primary Region for production SAP database and central services tiers.
+ 100% compute capacity is deployed in Availability Zone 1 and Availability Zone 2 in the secondary Region for production SAP database and central services tiers.
+ Compute capacity is deployed in Availability Zone 1 and Availability Zone 2 in the primary Region for production SAP application tier (Active/Active) and needs to be scaled in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ Amazon EC2 auto recovery is configured for all instances to protect against underlying hardware failure with the exception of instances protected by a third-party cluster solution.
+ The SAP database-related data on Amazon EBS is replicated between Availability Zones using either a database replication capability or a block level replication solution.
+ The SAP database-related data on Amazon EBS is replicated between Regions using either a database replication capability or a block level replication solution.
+ Amazon EFS is used for the SAP Global File Systems and is replicated to the secondary Region.
+ SAP database data is backed up regularly to Amazon S3.
+ Amazon Machine Image/Amazon EBS Snapshots for all servers are taken on a regular basis.
+ Amazon S3 data (database backups), Amazon EBS Snapshots, and Amazon Machine Images are replicated/copied to a secondary Region to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).

 **Benefits** 
+ Low Mean Time to Recovery (MTTR) in the event of Amazon EC2, Availability Zone or Region failure
+ Predictable Return to Service (RTS)
+ Database-related data persisted on different sets of Amazon EBS volumes in two Availability Zones in the primary Region and different sets of Amazon EBS volumes in two Availability Zones in the secondary Region via database replication capability or a block level replication solution
+ Required compute capacity deployed in two Availability Zones in primary Region and two Availability Zones in secondary Region
+ No dependency on restoring data from Amazon S3 in the event of an Availability Zone or Region failure
+ Ability to protect against significant degradation or total Availability Zone failure through fail over of the database and central services tiers to Availability Zone 2
+ Ability to protect against significant degradation or total Region failure through fail over to secondary Region

 **Considerations** 
+ Well documented and tested processes are required for the automated fail over between Availability Zones.
+ Well documented and tested processes are required for maintaining the automated fail over solution.
+ Well documented and tested processes are required for scaling the AWS resources to return the application tier to full capacity in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ Well documented and tested processes are required for moving production to the secondary Region.
+ Higher network latency from your on-premises locations to the secondary AWS Region may impact end user performance.
+ There is an overhead of maintaining the same software version and patch levels (OS, Database, SAP) across two different Regions.

## Pattern 8: A primary Region with one AZ for production and a secondary Region containing a replica of backups/AMIs
<a name="arch-guide-pattern-8-a-primary-region-with-two-azs-for-production-and-secondary-region-with-compute-and-storage-capacity-deployed-and-data-replication-across-two-azs"></a>

 **Figure 14: A primary Region with one Availability Zone for production and a secondary Region containing a replica of backups/AMIs** 

![\[A primary Region with one Availability Zone for production and a secondary Region containing a replica of backups/AMIs\]](http://docs.aws.amazon.com/sap/latest/general/images/arch-guidance-pattern-8.png)


In this pattern, you deploy your production systems in the primary Region in one Availability Zone. Your non-production systems are **not** equivalent in size to your production systems and are deployed in the same Availability Zone or a different Availability Zone within the Region.

Additionally, the production database backups stored in Amazon S3, Amazon EBS Snapshots, and Amazon Machine Images are replicated to a secondary Region. In the event of a complete Region failure, the production systems would be restored from the last set of backups in the second Region.

 **Select this pattern if:** 
+ In the event of an Availability Zone failure or significant Amazon EC2 service degradation, you can accept the risks related to the variable time duration required (including any delay in availability of the required compute capacity in the remaining Availability Zones) to re-create the AWS resources in a different Availability Zone and restore the persistent data to Amazon EBS.
+ You can accept the risks related to variable time duration required to complete recovery of production in the event of a Region failure.
+ You want to avoid the cost implications with a Multi-AZ approach and accept the related risks of downtime of your production SAP systems.
+ You can accept the increased complexity and costs associated with a multi-Region approach.
+ You can accept that manual actions are required to restore production in the secondary Region.

 **Key design principles** 
+ 100% compute capacity is deployed in Availability Zone 1 for production SAP database and central services tiers.
+ 100% compute capacity is deployed in Availability Zone 1 for production SAP application tier.
+  [Amazon EC2 auto recovery](arch-guide-architecture-guidelines-and-decisions.md#arch-guide-ec2-auto-recovery) is configured for all instances to protect against underlying hardware failure.
+ Deployed non-production compute capacity is less than 100% of the compute capacity deployed for production SAP database and central services tiers.
+ The SAP database is persisted on Amazon EBS in a single Availability Zone only and is not replicated to another Availability Zone.
+ Amazon EFS is used for the SAP global file systems.
+ SAP database is backed up regularly to Amazon S3.
+ Amazon S3 single-Region replication is configured to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).
+ Amazon Machine Image/Amazon EBS Snapshots are taken for all servers on a regular basis.
+ Amazon S3 data (database backups), Amazon EBS Snapshots, and Amazon Machine Images are replicated/copied to a secondary Region to protect against [logical data loss](arch-guide-failure-scenarios.md#arch-guide-logical-data-loss).
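One of the key design principles above is configuring Amazon EC2 auto recovery for every instance. Under the hood, auto recovery can be driven by a CloudWatch alarm on the `StatusCheckFailed_System` metric whose action is the documented `ec2:recover` automate ARN. The following is a minimal illustrative sketch (the helper name and the instance ID are hypothetical); the returned dictionary matches the shape of parameters you could pass to the CloudWatch `PutMetricAlarm` API, for example via boto3:

```python
def ec2_auto_recovery_alarm(instance_id: str, region: str) -> dict:
    """Build CloudWatch alarm parameters that trigger EC2 auto recovery
    when the system status check fails for two consecutive minutes."""
    return {
        "AlarmName": f"ec2-auto-recover-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The "recover" automate action migrates the instance to healthy
        # underlying hardware, preserving instance ID and EBS volumes.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }

params = ec2_auto_recovery_alarm("i-0123456789abcdef0", "us-east-1")
```

In a real deployment you would create one such alarm per production instance, for example with `boto3.client("cloudwatch").put_metric_alarm(**params)`.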

 **Benefits** 
+ Reduced cost compared to Multi-AZ
+ Ability to protect against significant degradation or total Region failure through failover to the secondary Region

 **Considerations** 
+ Well documented and tested processes are required for scaling the AWS resources to return the SAP application tier to full capacity in the event of an Availability Zone failure or significant Amazon EC2 service degradation.
+ Well documented and tested processes are required for scaling the AWS resources, restoring the data, and moving production to the secondary Region.
+ Higher network latency from your on-premises locations to the secondary AWS Region may impact end user performance.
+ Because high availability is not deployed across two Availability Zones, an increased time is required to recover production in the event of a compute, Availability Zone, or Region failure.

# Summary
<a name="arch-guide-summary"></a>

The table below summarizes the patterns and their key characteristics.


| Pattern | Single Region | Multi Region | Single AZ Primary | Multi AZ Primary | Single AZ Second Region | Multi AZ Second Region | Prod Capacity in 2nd AZ | Use of non-prod capacity | Cross-Region data replication | 
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
|  1  |  Yes  |  No  |  No  |  Yes  |  No  |  No  |  Yes  |  No  |  No  | 
|  2  |  Yes  |  No  |  No  |  Yes  |  No  |  No  |  Yes  |  Yes  |  No  | 
|  3  |  Yes  |  No  |  Yes  |  No  |  No  |  No  |  No  |  Yes  |  No  | 
|  4  |  Yes  |  No  |  Yes  |  No  |  No  |  No  |  No  |  No  |  No  | 
|  5  |  No  |  Yes  |  No  |  Yes  |  Yes  |  No  |  Yes  |  No  |  Yes  | 
|  6  |  No  |  Yes  |  No  |  Yes  |  Yes  |  No  |  Yes  |  No  |  Yes  | 
|  7  |  No  |  Yes  |  No  |  Yes  |  No  |  Yes  |  Yes  |  No  |  Yes  | 
|  8  |  No  |  Yes  |  Yes  |  No  |  Yes  |  No  |  No  |  No  |  Yes  | 

 **Table 1: Summary of patterns** 
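The matrix in Table 1 can be treated as data when shortlisting patterns against your requirements. The sketch below is purely illustrative (the helper and field names are hypothetical, and only a subset of the table's rows and columns is encoded): it filters patterns by whether you need multi-Region protection and highly available compute across two Availability Zones.

```python
# Subset of Table 1, keyed by pattern number (illustrative encoding).
PATTERNS = {
    1: {"multi_region": False, "multi_az_primary": True,  "cross_region_replication": False},
    3: {"multi_region": False, "multi_az_primary": False, "cross_region_replication": False},
    5: {"multi_region": True,  "multi_az_primary": True,  "cross_region_replication": True},
    8: {"multi_region": True,  "multi_az_primary": False, "cross_region_replication": True},
}

def candidate_patterns(need_multi_region: bool, need_multi_az: bool) -> list:
    """Return pattern numbers whose characteristics meet or exceed the stated needs.

    Booleans compare as integers, so a pattern qualifies when its capability
    is at least what is required (True >= False, True >= True, etc.).
    """
    return sorted(
        number
        for number, traits in PATTERNS.items()
        if traits["multi_region"] >= need_multi_region
        and traits["multi_az_primary"] >= need_multi_az
    )
```

For example, requiring both multi-Region disaster recovery and Multi-AZ high availability narrows this subset to Pattern 5, while relaxing both constraints leaves all four encoded patterns as candidates; cost and RTO/RPO then decide between them.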

With the flexibility and agility of the AWS Cloud, you can select any of the patterns described in this guide, choosing the one that best meets the business requirements of each of your SAP systems. This saves you from taking the most stringent requirement and applying it to all production systems.

For example, you may require highly-available compute capacity in another Availability Zone for the production SAP database and central services tiers of your core ERP system, while for your BW system you can accept the variable time duration required to re-create the AWS resources in a different Availability Zone and restore the persistent data. In this case, you would select Pattern 1 for ERP and Pattern 3 for BW to reduce the overall TCO.

If your requirements change over time, it is possible to move to a different pattern without significant re-design. For example, during the earlier phases of an implementation project, you may not require highly-available compute capacity in another Availability Zone but you can deploy the capacity into a second Availability Zone a few weeks before go-live.

You should consider the following when selecting an architecture pattern to run your SAP system in AWS:
+ The geographical residency of the data
+ The impact of your production SAP systems' downtime on your organization
+ The recovery time objective
+ The recovery point objective
+ The cost profile

# SAP on AWS architecture patterns for Microsoft SQL Server
<a name="patterns-microsoft"></a>

This document provides information about architecture patterns for deploying SAP workloads on Microsoft SQL Server in the AWS Cloud. These patterns offer highly available and resilient implementation options while considering your recovery time and recovery point objectives.

Work backwards from your business requirements to define an approach that meets the availability goals of your SAP systems and data. For each failure scenario, the resiliency requirements, acceptable data loss, and mean time to recover need to be proportionate to the criticality of the component and the supported business applications.

You can customize these patterns for your specific business criteria. You should consider the risk and impact of each failure type, and the cost of mitigation when choosing a pattern.

**Topics**
+ [Patterns](#patterns)
+ [Comparison matrix](#comparison)
+ [Single Region architecture patterns for Microsoft SQL Server](single-region.md)
+ [Multi-Region patterns for Microsoft SQL Server](multi-region.md)

## Patterns
<a name="patterns"></a>

The architecture patterns are divided into two categories.
+  [Single Region patterns](https://docs.aws.amazon.com/sap/latest/general/single-region.html) 
+  [Multi-Region patterns](https://docs.aws.amazon.com/sap/latest/general/multi-region.html) 

## Comparison matrix
<a name="comparison"></a>

The following table provides a comparison of all the architecture patterns discussed in the following sections.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sap/latest/general/patterns-microsoft.html)

 *\*To achieve a near-zero recovery point objective, database replication must be set up in synchronous data commit mode within the same AWS Region.* 

# Single Region architecture patterns for Microsoft SQL Server
<a name="single-region"></a>

Single Region architecture patterns help you avoid network latency, as your SAP workload components are located in close proximity within the same Region. An AWS Region generally has three or more Availability Zones. For more information, see [AWS Global Infrastructure Map](https://aws.amazon.com/about-aws/global-infrastructure/).

You can choose these patterns when you need to ensure that your SAP data resides within the regional boundaries stipulated by data sovereignty laws.

The following are the two single Region architecture patterns.

**Topics**
+ [Pattern 1: Single Region with two Availability Zones for production](#pattern1)
+ [Pattern 2: Single Region with one Availability Zone for production](#pattern2)

## Pattern 1: Single Region with two Availability Zones for production
<a name="pattern1"></a>

In this pattern, Microsoft SQL Server is deployed across two Availability Zones with Always On configured on both instances. The primary and secondary instances are of the same instance type. The secondary instance can be deployed in active/passive or active/active mode. We recommend synchronous (sync) replication mode, taking advantage of the low-latency connectivity between the two Availability Zones.

This pattern is foundational if you are looking for a high availability cluster solution with automated failover to meet near-zero recovery point and recovery time objectives. SQL Server Always On with a Windows failover cluster for automatic failover provides resiliency against failure scenarios, including the rare occurrence of the loss of an Availability Zone.

You need to consider the additional licensing cost of an Always On configuration. Also, provisioning a production-equivalent instance type as standby adds to the total cost of ownership.

Microsoft SQL Server backups can be stored in Amazon S3 buckets. Amazon S3 objects are automatically stored across multiple devices spanning a minimum of three Availability Zones in a Region. To protect against logical data loss, you can use the [Same-Region Replication](https://aws.amazon.com/about-aws/whats-new/2019/09/amazon-s3-introduces-same-region-replication/) feature of Amazon S3.

With Same-Region Replication, you can set up automatic replication of an Amazon S3 bucket into a bucket in a separate AWS account. This strategy ensures that not all copies of the data are lost due to malicious activity or human error. To set up Same-Region Replication, see [Setting up replication](https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication-how-setup.html).
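The replication setup described above boils down to attaching a replication configuration to the source bucket. The sketch below shows one way that configuration could look (the helper name, role ARN, bucket ARN, and account ID are placeholders); the resulting dictionary matches the shape of the `ReplicationConfiguration` accepted by the S3 `PutBucketReplication` API. Setting the object owner to the destination account is what keeps the copies safe if the source account is compromised.

```python
def same_region_replication_config(role_arn: str, dest_bucket_arn: str,
                                   dest_account_id: str) -> dict:
    """Build an S3 replication configuration that copies new objects
    to a bucket owned by a separate AWS account."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate on your behalf
        "Rules": [
            {
                "ID": "backup-replication",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},  # empty filter replicates the whole bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": dest_bucket_arn,
                    "Account": dest_account_id,
                    # Transfer ownership of replicas to the destination account
                    # so the source account cannot delete the copies.
                    "AccessControlTranslation": {"Owner": "Destination"},
                },
            }
        ],
    }

cfg = same_region_replication_config(
    "arn:aws:iam::111122223333:role/s3-backup-replication",
    "arn:aws:s3:::example-backup-copy",
    "444455556666",
)
```

Both buckets must have versioning enabled before applying such a configuration, for example with `boto3.client("s3").put_bucket_replication(...)`.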

![\[Replication with two Availability Zones in a single Region\]](http://docs.aws.amazon.com/sap/latest/general/images/sql-pattern1.png)


## Pattern 2: Single Region with one Availability Zone for production
<a name="pattern2"></a>

In this pattern, Microsoft SQL Server is deployed as a standalone installation with no target systems to replicate data. This is the most basic and cost-efficient deployment option. There are two options to restore business operations in a failure scenario: Amazon EC2 auto recovery in the event of an instance failure, or restoration and recovery from the most recent valid backups in the event of a significant issue impacting the Availability Zone.

![\[Replication with one Availability Zone in a single Region\]](http://docs.aws.amazon.com/sap/latest/general/images/sql-pattern2.png)


# Multi-Region patterns for Microsoft SQL Server
<a name="multi-region"></a>

 The AWS Global Infrastructure spans multiple Regions around the world, and this footprint is constantly increasing. For the latest updates, see [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/). If you need your SAP data to reside in multiple Regions at any given point to ensure increased availability and minimal downtime in the event of failure, you should opt for multi-Region architecture patterns.

When deploying a multi-Region pattern, you can benefit from using an automated approach, such as a cluster solution, for failover between Availability Zones to minimize the overall downtime and remove the need for human intervention. Multi-Region patterns provide not only high availability but also disaster recovery, thereby lowering overall costs. The distance between the chosen Regions has a direct impact on latency, which has to be considered in the overall design of a multi-Region pattern.

There are additional cost implications from cross-Region replication or data transfer that also need to be factored into the overall solution pricing. The pricing varies between Regions.

The following are the four multi-Region architecture patterns.

**Topics**
+ [Pattern 3: Primary Region with two Availability Zones for production and secondary Region with a replica of backups/AMIs](#pattern3)
+ [Pattern 4: Primary Region with two Availability Zones for production and secondary Region with compute and storage capacity deployed in a single Availability Zone](#pattern4)
+ [Pattern 5: Primary Region with one Availability Zone for production and a secondary Region with a replica of backups/AMIs](#pattern5)
+ [Pattern 6: Primary Region with one Availability Zone for production and a secondary Region replicated at block level using AWS Elastic Disaster Recovery](#pattern6)

## Pattern 3: Primary Region with two Availability Zones for production and secondary Region with a replica of backups/AMIs
<a name="pattern3"></a>

This pattern is similar to pattern 1, where your Microsoft SQL Server deployment is highly available. You deploy your production instance across two Availability Zones in the primary Region using Always On. You can restore your SQL Server database in a secondary Region using replicas of the backups stored in Amazon S3, Amazon EBS snapshots, and Amazon Machine Images (AMIs).

With cross-Region replication of files stored in Amazon S3, the data stored in a bucket is automatically (asynchronously) copied to the target Region. Amazon EBS snapshots can be copied between Regions. For more information, see [Copy an Amazon EBS snapshot](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-copy-snapshot.html). You can copy an AMI within or across Regions using the AWS CLI, the AWS Management Console, AWS SDKs, or Amazon EC2 APIs. For more information, see [Copy an AMI](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/CopyingAMIs.html). You can also use AWS Backup to schedule snapshots and copy them across Regions.
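As a concrete illustration of the snapshot and AMI copies mentioned above, the sketch below builds the request parameters for the EC2 `CopySnapshot` and `CopyImage` APIs (the helper names and resource IDs are placeholders). A detail worth noting: both calls are issued against the *destination* Region, naming the source Region in the request.

```python
def snapshot_copy_request(snapshot_id: str, source_region: str,
                          description: str) -> dict:
    """Parameters for the EC2 CopySnapshot API, issued from the
    destination Region."""
    return {
        "SourceRegion": source_region,
        "SourceSnapshotId": snapshot_id,
        "Description": description,
        # Encrypt the copy in the destination Region with the default KMS key.
        "Encrypted": True,
    }

def ami_copy_request(image_id: str, source_region: str, name: str) -> dict:
    """Parameters for the EC2 CopyImage API, also issued from the
    destination Region."""
    return {
        "SourceRegion": source_region,
        "SourceImageId": image_id,
        "Name": name,
    }

snap_req = snapshot_copy_request("snap-0123456789abcdef0", "us-east-1",
                                 "weekly DR copy of SQL Server data volume")
ami_req = ami_copy_request("ami-0123456789abcdef0", "us-east-1",
                           "sap-sqlserver-dr-copy")
```

With boto3, for example, these would be passed as `boto3.client("ec2", region_name="us-west-2").copy_snapshot(**snap_req)`, with the client constructed in the disaster recovery Region.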

In the event of a complete Region failure, the production SQL Server instance needs to be built in the secondary Region using an AMI. You can use AWS CloudFormation templates to automate the launch of a new SQL Server instance. Once your instance is launched, you can download the latest set of backups from Amazon S3 to restore your SQL Server database to a point in time before the disaster event. After restoring and recovering your SQL Server database in the secondary Region, you can redirect your client traffic to the new instance using DNS.

This architecture provides the advantage of implementing your SQL Server deployment across multiple Availability Zones with the ability to fail over automatically in the event of a failure. For disaster recovery outside the primary Region, the recovery point objective is constrained by how often you store your SQL Server backup files in your Amazon S3 bucket and the time it takes to replicate your Amazon S3 bucket to the target Region. You can use Amazon S3 Replication Time Control for time-bound replication. For more information, see [Enabling Amazon S3 Replication Time Control](https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication-time-control.html#enabling-replication-time-control).
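The recovery point constraint described above can be made concrete with a back-of-the-envelope calculation: in the worst case a disaster strikes just before the next backup, so you lose up to one full backup interval plus whatever has not yet replicated to the target Region. The numbers below are illustrative; the 15-minute figure reflects the Amazon S3 Replication Time Control target of replicating most objects within 15 minutes.

```python
def worst_case_rpo_minutes(backup_interval_min: float,
                           replication_min: float) -> float:
    """Worst-case cross-Region recovery point: one full backup interval
    plus the time for the backup object to replicate to the target Region."""
    return backup_interval_min + replication_min

# Transaction log backups every 30 minutes, S3 Replication Time Control
# with its 15-minute replication target:
rpo = worst_case_rpo_minutes(30, 15)  # 45 minutes
```

Shortening the log backup interval is usually the cheaper lever here, since the replication term is already bounded by Replication Time Control.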

Your recovery time objective depends on the time it takes to build the system in the secondary Region and restore operations from backup files. The amount of time varies depending on the size of the database. Also, in the absence of reserved instance capacity, acquiring the compute capacity for restore procedures may take longer. This pattern is suitable when you need the lowest possible recovery time and point objectives within a Region and can accept higher recovery point and time objectives for disaster recovery outside the primary Region.

![\[Amazon S3 cross Region replication\]](http://docs.aws.amazon.com/sap/latest/general/images/sql-pattern3.png)


## Pattern 4: Primary Region with two Availability Zones for production and secondary Region with compute and storage capacity deployed in a single Availability Zone
<a name="pattern4"></a>

In addition to the architecture of pattern 3, this pattern has SQL Server Always On set up between the SQL Server instances in the primary Region and an identical third instance in one of the Availability Zones in the secondary Region. We recommend using the asynchronous (async) mode for Always On when replicating between AWS Regions because of the increased latency.

In the event of a failure in the primary Region, the production workloads are failed over to the secondary Region manually. This pattern ensures that your SAP systems are highly available and are disaster-tolerant. This pattern provides a quicker failover and continuity of business operations with continuous data replication.

There is an increased cost of deploying the required compute and storage for the production SQL server in the secondary Region and of data transfers between Regions. This pattern is suitable when you require disaster recovery outside of the primary Region with low recovery point and time objectives.

This pattern can be deployed in a multi-tier as well as multi-target replication configuration.

The following diagram shows a multi-tier replication where the replication is configured in a chained fashion.

![\[Multi-tier replication across two Regions\]](http://docs.aws.amazon.com/sap/latest/general/images/sql-pattern4.png)


## Pattern 5: Primary Region with one Availability Zone for production and a secondary Region with a replica of backups/AMIs
<a name="pattern5"></a>

This pattern is similar to pattern 2, with additional disaster recovery in a secondary Region containing replicas of the SQL Server backups stored in Amazon S3, Amazon EBS snapshots, and AMIs. In this pattern, the SQL Server instance is deployed as a standalone installation in one Availability Zone in the primary Region, with no target SQL Server systems to replicate data.

With this pattern, your SQL Server deployment is not highly available. In the event of a complete Region failure, the production SQL Server instance needs to be built in the secondary Region using an AMI. You can use AWS CloudFormation templates to automate the launch of a new SQL Server instance. Once your instance is launched, you can download the latest set of backups from Amazon S3 to restore your SQL Server database to a point in time before the disaster event. You can then redirect your client traffic to the new instance in the secondary Region using DNS.

For disaster recovery outside the primary Region, the recovery point objective is constrained by how often you store your SQL Server backup files in your Amazon S3 bucket and the time it takes to replicate your Amazon S3 bucket to the target Region. Your recovery time objective depends on the time it takes to build the system in the secondary Region and restore operations from backup files. The amount of time varies depending on the size of the database. This pattern is suitable for non-production systems or non-critical production systems that can tolerate the downtime required to restore normal operations.

![\[Amazon S3 cross Region replication\]](http://docs.aws.amazon.com/sap/latest/general/images/sql-pattern5.png)


## Pattern 6: Primary Region with one Availability Zone for production and a secondary Region replicated at block level using AWS Elastic Disaster Recovery
<a name="pattern6"></a>

 AWS Elastic Disaster Recovery provides organizations with a modern approach to protecting Microsoft SQL Server environments by enabling cloud-based disaster recovery on the AWS Cloud. For more information, see [What is Elastic Disaster Recovery?](https://docs.aws.amazon.com/drs/latest/userguide/what-is-drs.html)

Elastic Disaster Recovery uses block level replication and replicates the operating system, databases, application, and system files for supported Windows and Linux operating system versions. To learn more, see [Supported operating systems](https://docs.aws.amazon.com/drs/latest/userguide/Supported-Operating-Systems.html). An initial setup of the AWS Replication Agent is required on the source systems for Elastic Disaster Recovery to initiate secure data replication. The agent runs in memory and recognizes write operations to locally attached disks. These writes are captured and asynchronously replicated into a staging area in your AWS account. During this ongoing replication process, Elastic Disaster Recovery maintains the write order among all disks in the same source server. The replicated Amazon EC2 instances can be run in a *test mode* to perform drills in a segregated environment.

Elastic Disaster Recovery allows you to monitor the data replication status of your recovery instances, view recovery instance details, add recovery instances to Elastic Disaster Recovery, edit recovery instance failback settings, and terminate recovery instances.

With Elastic Disaster Recovery, you can perform a failover by launching recovery instances on AWS Cloud. Once the recovery instance is launched, you must redirect the traffic from your primary site to the recovery site.

 AWS Elastic Disaster Recovery uses Amazon EBS snapshots to take point-in-time snapshots of data held within the staging area. To learn more, see [Amazon EBS snapshots](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html). It then provides crash-consistent point-in-time recovery options that can be used in the event of a disaster or drill. Elastic Disaster Recovery can protect individual nodes of a SQL Server Always On availability group. During disaster recovery, the group is launched as individual SQL Server instances on AWS. This solution works for both SQL Server Standard edition and SQL Server Enterprise edition, for any supported version of SQL Server.

![\[Elastic Disaster Recovery cross-Region replication\]](http://docs.aws.amazon.com/sap/latest/general/images/sql-pattern6.png)
