

# AWS service types
<a name="aws-service-types"></a>

 AWS operates three categories of services, distinguished by their fault isolation boundary: zonal, Regional, and global. This section describes how each type of service is designed so that you can determine how failures within a service of that type will impact your workload running on AWS. It also provides high-level guidance on how to architect your workloads to use these services in a resilient way. For global services, [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) and [Appendix B - Edge network global service guidance](appendix-b---edge-network-global-service-guidance.md) provide prescriptive guidance that can help you prevent impact to your workloads from control plane impairments in AWS services, so that you can safely take dependencies on global services while minimizing the introduction of single points of failure. 

**Topics**
+ [Zonal services](zonal-services.md)
+ [Regional services](regional-services.md)
+ [Global services](global-services.md)

# Zonal services
<a name="zonal-services"></a>

 The pattern described in [Static stability using Availability Zones](https://aws.amazon.com/builders-library/static-stability-using-availability-zones/) enables AWS to offer zonal services, like Amazon EC2 and Amazon EBS. A zonal service is one that lets you specify which Availability Zone its resources are deployed into. These services operate independently in each Availability Zone within a Region and, more importantly, fail independently in each Availability Zone as well. This means that components of a service in one Availability Zone don't take dependencies on components in other Availability Zones. This is possible because a zonal service has **zonal data planes**. In some cases, such as with EC2, the service also includes zonal control planes for zonally aligned operations, such as launching an EC2 instance. For those services, AWS also provides a Regional control plane endpoint to make it easy to interact with the service. The Regional control plane provides Regionally-scoped functionality and serves as an aggregation and routing layer on top of the zonal control planes. This is shown in the following figure. 

![A zonal service with zonally isolated control planes and data planes](http://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/images/a-zonal-service-with-zonally-isolated-control-planes-and-data-planes.png)


 Availability Zones give customers the ability to operate production workloads that are more highly available, fault tolerant, and scalable than would be possible from a single data center. When a workload uses multiple Availability Zones, customers are better isolated and protected from issues that impact a single Availability Zone's physical infrastructure. This helps customers build services that are redundant across Availability Zones and, if architected correctly, remain operational even if one Availability Zone experiences failures. Customers can take advantage of Availability Zone independence (AZI) to create highly available and resilient workloads. Implementing AZI in your architecture helps you quickly recover from an isolated Availability Zone failure because your resources in one Availability Zone minimize or eliminate interaction with resources in other Availability Zones. This removes cross-Availability Zone dependencies, which simplifies Availability Zone evacuation. Refer to [Advanced Multi-AZ Resilience Patterns](https://docs.aws.amazon.com/whitepapers/latest/advanced-multi-az-resilience-patterns/advanced-multi-az-resilience-patterns.html) for more details on creating Availability Zone evacuation mechanisms. Additionally, you can further take advantage of Availability Zones by following some of the same best practices AWS uses for its own services, such as deploying changes to only a single Availability Zone at a time, or removing an Availability Zone from service if a change in that Availability Zone goes badly. 

 [Static stability](static-stability.md) is also an important concept for multi-Availability Zone architectures. One of the failure modes you should plan for is the loss of an Availability Zone, which can mean losing an Availability Zone's worth of capacity. If you haven't pre-provisioned enough capacity to handle that loss, your remaining capacity could be overwhelmed by the current load. Additionally, you would then need to depend on the control planes of the zonal services you use to replace the lost capacity, which can be less reliable than a statically stable design. In this case, pre-provisioning enough extra capacity helps you remain statically stable through the loss of a fault domain, such as an Availability Zone, because you can continue normal operations without the need for dynamic changes. 

 You may choose to use an auto scaling group of EC2 instances deployed across multiple Availability Zones to dynamically scale in and out, based on the needs of your workload. Auto scaling works well for gradual changes in usage that occur over minutes to tens of minutes. However, launching new EC2 instances takes time, especially if your instances require bootstrapping (such as installing agents, application binaries, or configuration files). During this time, your remaining capacity could be overwhelmed by the current load. Additionally, deploying new instances through auto scaling relies on the EC2 control plane. This presents a trade-off: To be statically-stable to the loss of a single Availability Zone, you need to pre-provision enough EC2 instances in the other Availability Zones to handle the load that has been shifted away from the impaired Availability Zone, instead of relying on auto scaling to provision new instances. However, pre-provisioning extra capacity can incur additional cost. 

 For example, during normal operation, let's assume your workload requires six instances to serve customer traffic across three Availability Zones. To be statically stable against a single Availability Zone failure, you would deploy three instances in each Availability Zone, for a total of nine. If a single Availability Zone's worth of instances failed, you would still have six left and could continue to serve your customer traffic without needing to provision and configure new instances during the failure. Achieving static stability for your EC2 capacity has additional cost, since, in this case, you are running 50% more instances. Not all pre-provisioned resources incur additional cost; for example, pre-provisioning an S3 bucket or an IAM user does not by itself add cost. You will need to weigh the trade-offs of implementing static stability against the risk of exceeding the desired recovery time for your workload. 
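The arithmetic above generalizes to any number of Availability Zones: to be statically stable to the loss of one zone, the surviving zones must together carry the full required load. A minimal sketch of that calculation (the function name and numbers are illustrative, not part of any AWS SDK):

```python
import math

def statically_stable_capacity(required_instances: int, num_azs: int) -> dict:
    """Instances to pre-provision so that losing one AZ still leaves
    enough capacity to serve the normal load, with no control plane calls."""
    if num_azs < 2:
        raise ValueError("Static stability to an AZ loss needs at least 2 AZs")
    # The surviving AZs must carry the full required load between them.
    per_az = math.ceil(required_instances / (num_azs - 1))
    total = per_az * num_azs
    return {
        "per_az": per_az,
        "total": total,
        "overhead_pct": round(100 * (total - required_instances) / required_instances),
    }

# The whitepaper's example: 6 instances needed across 3 AZs
# -> 3 per AZ, 9 total, 50% overhead.
print(statically_stable_capacity(6, 3))
```

Note how the overhead shrinks as you spread across more Availability Zones: with two zones you must double your capacity, with three zones the overhead is 50%.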

 AWS Local Zones and Outposts bring the data plane of select AWS services closer to end users. The control planes for these services reside in the parent Region. Your Local Zone or Outposts instance will have control plane dependencies for zonal services like EC2 and EBS on the Availability Zone where you created your Local Zone or Outposts subnet. They will also have dependencies on Regional control planes for Regional services like Elastic Load Balancing (ELB), security groups, and the Amazon Elastic Kubernetes Service ([Amazon EKS](https://docs.aws.amazon.com/eks/latest/userguide/local-zones.html))-managed Kubernetes control plane (if you use EKS). For additional information specific to Outposts, refer to the [documentation](https://docs.aws.amazon.com/outposts/latest/userguide/disaster-recovery-resiliency.html) and [support and maintenance FAQ](https://aws.amazon.com/outposts/rack/faqs/). Implement static stability when using Local Zones or Outposts to help improve resilience to control plane impairments or interruptions in network connectivity to the parent Region. 

# Regional services
<a name="regional-services"></a>

 Regional services are services that AWS builds on top of multiple Availability Zones so that customers don't have to work out how to make the best use of zonal services themselves. AWS logically groups the service components deployed across multiple Availability Zones and presents a single Regional endpoint to customers. Amazon SQS and [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) are examples of Regional services. They use the isolation and redundancy of Availability Zones to minimize infrastructure failure as a category of availability and durability risk. Amazon S3, for example, spreads requests and data across multiple Availability Zones and is designed to automatically recover from the failure of an Availability Zone, yet you only interact with the Regional endpoint of the service. 

 AWS believes that most customers can achieve their resilience goals in a single Region by using Regional services or Multi-AZ architectures that rely on zonal services. However, some workloads may require additional redundancy, and you can use the isolation of AWS Regions to create multi-Region architectures for high availability (HA) or business continuity purposes. The physical and logical separation between AWS Regions avoids correlated failures between them. In other words, just as an EC2 customer can benefit from the isolation of Availability Zones by deploying across them, you can get the same benefit for Regional services by deploying across multiple Regions. This requires that you implement a multi-Region architecture for your application, which can help you be resilient to the impairment of a Regional service. 

 However, achieving the benefits of a multi-Region architecture can be difficult; it requires careful work to take advantage of Regional isolation without undoing those benefits at the application level. For example, if you're failing over an application between Regions, you need to maintain strict separation between your application stacks in each Region, be aware of all the application's dependencies, and fail over all parts of the application together. Achieving this with a complex, microservices-based architecture that has many dependencies between applications requires planning and coordination amongst many engineering and business teams. Allowing individual workloads to make their own failover decisions makes the coordination less complex, but introduces modal behavior through the significant difference in latency across Regions compared to within a single Region. 

 AWS does not provide a synchronous cross-Region replication feature at this time. When using an AWS-provided asynchronously replicated datastore across Regions, there is the possibility of data loss or inconsistency when you fail over your application between Regions. To mitigate possible inconsistencies, you need a reliable data reconciliation process that you have confidence in, which may need to operate on multiple data stores across your workload portfolio, or you need to be willing to accept data loss. Finally, you need to practice the failover to know that it will work when you need it; regularly rotating your application between Regions to practice failover is a substantial time and resource investment. If you instead use a synchronously replicated datastore across Regions to support your applications running from more than one Region concurrently, be aware that the performance characteristics and latency of a database that spans hundreds or thousands of miles are very different from those of a database operating in a single Region. This requires you to plan your application stack from the ground up to account for this behavior. It also makes the availability of both Regions a hard dependency, which could decrease the resilience of your workload. 

# Global services
<a name="global-services"></a>

 In addition to Regional and zonal AWS services, there is a small set of AWS services whose control planes and data planes don’t exist independently in each Region. Because their resources are not Region-specific, they are commonly referred to as *global*. Global AWS services still follow the conventional AWS design pattern of separating the control plane and data plane in order to achieve static stability. The significant difference for most global services is that their control plane is hosted in a *single* AWS Region, while their data plane is globally distributed. There are three different types of global services and a set of services that can appear to be global based on your selected configuration. 

 The following sections will identify each type of global service and how their control planes and data planes are separated. You can use this information to guide how you build reliable high availability (HA) and disaster recovery (DR) mechanisms without needing to depend on a global service control plane. This approach helps remove single points of failure in your architecture and avoids potential cross-Region impacts, even when you are operating in a Region that is different from where the global service control plane is hosted. It also helps you safely implement failover mechanisms that do not rely on global service control planes. 

## Global services that are unique by partition
<a name="global-services-that-are-unique-by-partition"></a>

 Some global AWS services exist in each partition (referred to in this paper as *partitional* services). Partitional services provide their control plane in a single AWS Region. Some partitional services, such as AWS Network Manager, are control plane-only and orchestrate the data plane of other services. Other partitional services, such as IAM, have their own data plane that is isolated and distributed across all of the AWS Regions in the partition. Failures in a partitional service do not impact other partitions. In the `aws` partition, the IAM service’s control plane is in the `us-east-1` Region, with isolated data planes in each Region of the partition. Partitional services also have independent control planes and data planes in the `aws-us-gov` and `aws-cn` partitions. The separation of control plane and data plane for IAM is shown in the following diagram. 

![IAM has a single control plane and regionalized data plane](http://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/images/iam-single-control-plane-and-regionalized-data-plane.png)


 The following are partitional services and their control plane location in the `aws` partition: 
+ AWS IAM (`us-east-1`)
+ AWS Organizations (`us-east-1`)
+ AWS Account Management (`us-east-1`)
+ Route 53 Application Recovery Controller (ARC) (`us-west-2`) - This service is only present in the `aws` partition
+ AWS Network Manager (`us-west-2`)
+ Route 53 Private DNS (`us-east-1`)

 If any of these service control planes experience an availability-impacting event, you may be unable to use the CRUDL-type (create, read, update, delete, and list) operations provided by these services. Thus, if your recovery strategy has a dependency on these operations, an availability impact to the control plane or the Region hosting the control plane will reduce your chances of successful recovery. [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) provides strategies for removing dependencies on global service control planes during recovery. 

**Recommendation**  
Do not rely on the control planes of partitional services in your recovery path. Instead, rely on the data plane operations of these services. See [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) for additional details on how you should design for partitional services.

## Global services in the edge network
<a name="global-services-in-the-edge-network"></a>

 The next set of global AWS services have a control plane in the `aws` partition and host their data planes in the global [points of presence](points-of-presence.md) (PoP) infrastructure (and potentially AWS Regions as well). The data planes hosted in PoPs can be accessed from resources in any partition as well as the internet. For example, Route 53 operates its control plane in the `us-east-1` Region, but its data plane is distributed across hundreds of PoPs globally, as well as each AWS Region (to support Route 53 Public and Private DNS within the Region). Route 53 health checks are also part of the data plane, and are performed from eight AWS Regions in the `aws` partition. Clients can resolve DNS using Route 53 public hosted zones from anywhere on the internet, including other partitions like GovCloud, as well as from an AWS Virtual Private Cloud (VPC). The following are global edge network services and their control plane location in the `aws` partition: 
+ Route 53 Public DNS (`us-east-1`)
+ Amazon CloudFront (`us-east-1`)
+ AWS WAF Classic for CloudFront (`us-east-1`)
+ AWS WAF for CloudFront (`us-east-1`)
+ AWS Certificate Manager (ACM) for CloudFront (`us-east-1`)
+ AWS Global Accelerator (AGA) (`us-west-2`)
+ AWS Shield Advanced (`us-east-1`)

 If you use AGA health checks for EC2 instances or Elastic IP addresses, these use Route 53 health checks. Creating or updating AGA health checks therefore depends on the Route 53 control plane in `us-east-1`, while the execution of the AGA health checks uses the Route 53 health check data plane. 

 During a failure impacting the Region hosting the control planes for these services, or a failure impacting the control plane itself, you may be unable to use the CRUDL-type operations provided by these services. If you have taken dependencies on these operations in your recovery strategy, that strategy may be less likely to succeed than if you only rely on the data plane of these services. 

**Recommendation**  
Do not rely on the control plane of edge network services in your recovery path. Instead, rely on the data plane operations of these services. See [Appendix B - Edge network global service guidance](appendix-b---edge-network-global-service-guidance.md) for additional details on how to design for global services in the edge network.

## Global Single-Region operations
<a name="global-single-region-operations"></a>

 The final category consists of specific control plane operations within a service that have a global impact scope, rather than entire services like the previous categories. While you interact with zonal and Regional services in the Region you specify, certain operations have an underlying dependency on a single Region that is different from where the resource is located. These are different from services that are only offered in a single Region; refer to [Appendix C - Single-Region services](appendix-c---single-region-services.md) for a list of those services. 

 During a failure impacting the underlying global dependency, you may be unable to use the CRUDL-type actions of the dependent operations. If you have taken dependencies on these operations in your recovery strategy, that strategy may be less likely to succeed than if you only rely on the data plane of these services. You should avoid dependencies on these operations for your recovery strategy. 

 The following are services with globally scoped control plane operations that other services may take dependencies on: 
+  **Route 53** 

  Several AWS services create resources that provide resource-specific DNS names. For example, when you provision an Elastic Load Balancing (ELB) load balancer, the service creates public DNS records and health checks in Route 53 for the load balancer. This relies on the Route 53 control plane in `us-east-1`. Other services that you use might also need to provision an ELB load balancer, create public Route 53 DNS records, or create Route 53 health checks as part of their control plane workflows. For example, provisioning an Amazon API Gateway REST API, an ELB load balancer, or an Amazon OpenSearch Service domain all result in creating DNS records in Route 53. The following is a non-exhaustive list of some of the most commonly used services whose control plane actions for creating, updating, or deleting resources depend on the Route 53 control plane in `us-east-1` to create, update, or delete DNS records, hosted zones, or Route 53 health checks: 
  + Amazon API Gateway REST and HTTP APIs
  + Amazon ELB load balancers
  + AWS PrivateLink VPC endpoints
  + AWS Lambda URLs
  + Amazon ElastiCache
  + Amazon OpenSearch Service
  + Amazon CloudFront
  + Amazon MemoryDB
  + Amazon Neptune
  + Amazon DynamoDB Accelerator (DAX)
  + AGA
  + Amazon Elastic Container Service (Amazon ECS) with DNS-based Service Discovery (which uses the AWS Cloud Map API to manage Route 53 DNS)
  + Amazon EKS Kubernetes control plane

    It is important to note that the VPC DNS service for [EC2 instance hostnames](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-naming.html) exists independently in each AWS Region and does not depend on the Route 53 control plane. Records that AWS creates for EC2 instances in the VPC DNS service, like `ip-10-0-10.ec2.internal`, `ip-10-0-1-5.us-west-2.compute.internal`, `i-0123456789abcdef.ec2.internal`, and `i-0123456789abcdef.us-west-2.compute.internal`, do not rely on the Route 53 control plane in `us-east-1`. 
**Recommendation**  
Do not rely on creating, updating, or deleting resources that require the creation, updating, or deletion of Route 53 resource records, hosted zones, or health checks in your recovery path. Pre-provision these resources, like ELBs, to prevent a dependency on the Route 53 control plane in your recovery path.
+  **Amazon S3** 

  The following Amazon S3 control plane operations have an underlying dependency on `us-east-1` in the `aws` partition. A failure impacting Amazon S3 or other services in `us-east-1` could cause these control planes actions to be impaired in other Regions: 

  ```
  PutBucketCors 
  DeleteBucketCors 
  PutAccelerateConfiguration 
  PutBucketRequestPayment 
  PutBucketObjectLockConfiguration 
  PutBucketTagging 
  DeleteBucketTagging 
  PutBucketReplication 
  DeleteBucketReplication 
  PutBucketEncryption 
  DeleteBucketEncryption 
  PutBucketLifecycle 
  DeleteBucketLifecycle 
  PutBucketNotification 
  PutBucketLogging 
  DeleteBucketLogging 
  PutBucketVersioning 
  PutBucketPolicy 
  DeleteBucketPolicy 
  PutBucketOwnershipControls 
  DeleteBucketOwnershipControls 
  PutBucketAcl 
  PutBucketPublicAccessBlock 
  DeleteBucketPublicAccessBlock
  ```

  The control plane for Amazon S3 Multi-Region Access Points (MRAP) is [hosted only in `us-west-2`](https://docs.aws.amazon.com/AmazonS3/latest/userguide/MrapOperations.html) and requests to create, update, or delete MRAPs target that Region directly. The control plane for MRAP also has underlying dependencies on AGA in `us-west-2`, Route 53 in `us-east-1`, and ACM in each Region where the MRAP is configured to serve content from. You should not depend on the availability of the MRAP control plane in your recovery path or in your own systems’ data planes. This is distinct from [MRAP failover controls](https://docs.aws.amazon.com/AmazonS3/latest/userguide/MrapFailover.html) that are used to specify active or passive routing status for each of your buckets in the MRAP. These APIs are hosted in [five AWS Regions](https://docs.aws.amazon.com/AmazonS3/latest/userguide/MrapOperations.html#update-mrap-route-configuration) and can be used to effectively shift traffic using the service's data plane. 

  Additionally, Amazon S3 [bucket names are globally unique](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html) and all calls to the `CreateBucket` and `DeleteBucket` APIs depend on `us-east-1`, in the `aws` partition, to ensure name uniqueness, even though the API call is directed at the specific Region in which you want to create the bucket. Finally, if you have critical bucket creation workflows, you should not depend on being able to create a bucket with any specific name, particularly names that follow a discernible pattern, because global uniqueness means that name may already be taken. 
**Recommendation**  
Do not rely on deleting or creating new S3 buckets or updating S3 bucket configurations as part of your recovery path. Pre-provision all required S3 buckets with the necessary configurations so that you do not need to make changes in order to recover from a failure. This approach applies to MRAPs as well.
+  **CloudFront** 

   Amazon API Gateway provides [edge-optimized API endpoints](https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-api-endpoint-types.html#api-gateway-api-endpoint-types-edge-optimized). Creating these endpoints depends on the CloudFront control plane in `us-east-1` to create the distribution in front of the gateway endpoint.
**Recommendation**  
Do not rely on creating new edge-optimized API Gateway endpoints as part of your recovery path. Pre-provision all required API Gateway endpoints.

  All of the dependencies discussed in this section are control plane actions, not data plane actions. If your workloads are configured to be statically-stable, these dependencies should not impact your recovery path, keeping in mind that static stability requires additional work or services to implement. 
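One practical way to apply the guidance in this section is to screen a recovery runbook, ahead of time, for steps that depend on a single-Region control plane. The following sketch is a hypothetical helper (not an AWS tool) that flags the S3 bucket-configuration operations listed earlier in this section:

```python
# S3 control plane operations with an underlying us-east-1 dependency
# (taken from the list earlier in this section).
S3_GLOBAL_CONTROL_PLANE_OPS = {
    "PutBucketCors", "DeleteBucketCors", "PutAccelerateConfiguration",
    "PutBucketRequestPayment", "PutBucketObjectLockConfiguration",
    "PutBucketTagging", "DeleteBucketTagging", "PutBucketReplication",
    "DeleteBucketReplication", "PutBucketEncryption", "DeleteBucketEncryption",
    "PutBucketLifecycle", "DeleteBucketLifecycle", "PutBucketNotification",
    "PutBucketLogging", "DeleteBucketLogging", "PutBucketVersioning",
    "PutBucketPolicy", "DeleteBucketPolicy", "PutBucketOwnershipControls",
    "DeleteBucketOwnershipControls", "PutBucketAcl",
    "PutBucketPublicAccessBlock", "DeleteBucketPublicAccessBlock",
    # CreateBucket/DeleteBucket also depend on us-east-1 for name uniqueness.
    "CreateBucket", "DeleteBucket",
}

def flag_control_plane_dependencies(runbook_steps: list[str]) -> list[str]:
    """Return the runbook steps that would depend on the S3 control
    plane's single-Region (us-east-1) dependency during recovery."""
    return [step for step in runbook_steps if step in S3_GLOBAL_CONTROL_PLANE_OPS]

# A runbook that copies objects (data plane) is safe; one that rewrites
# bucket policies during recovery is not.
print(flag_control_plane_dependencies(["GetObject", "PutObject", "PutBucketPolicy"]))
# → ['PutBucketPolicy']
```

Any step this flags is a candidate for pre-provisioning: perform the configuration change during normal operations so the recovery path only touches data planes.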

## Services that use default global endpoints
<a name="services-that-use-default-global-endpoints"></a>

 In a few cases, AWS services provide a default, global endpoint, like AWS Security Token Service ([AWS STS](https://docs.aws.amazon.com/general/latest/gr/sts.html)). Other services may use this default, global endpoint in their default configuration. This means that a Regional service you are using could have a global dependency on a single AWS Region. The following details explain how to remove unintended dependencies on default global endpoints so that you can use these services in a Regional way. 

 **AWS STS:** STS is a web service that enables you to request temporary, limited-privilege credentials for IAM users or for users you authenticate (federated users). STS usage from the AWS software development kits (SDKs) and AWS Command Line Interface (CLI) defaults to the global endpoint hosted in `us-east-1`. The STS service also provides Regional endpoints, which are active by default in Regions that are enabled by default. You can take advantage of these at any time by configuring your SDK or CLI as described in [AWS STS Regionalized endpoints](https://docs.aws.amazon.com/sdkref/latest/guide/feature-sts-regionalized-endpoints.html). Using SigV4A also [requires temporary credentials requested from a Regional STS endpoint](https://docs.aws.amazon.com/general/latest/gr/signing_aws_api_requests.html#signature-versions); you cannot use the global STS endpoint for this operation. 

**Recommendation**  
Update your SDK and CLI configuration to use the Regional STS endpoints.
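As an illustration, the documented `AWS_STS_REGIONAL_ENDPOINTS` environment variable (or the equivalent `sts_regional_endpoints` entry in the shared config file) opts the SDKs and CLI into Regional endpoints, which follow the standard `sts.<region>.amazonaws.com` pattern in the `aws` partition. A minimal sketch; the helper function is illustrative:

```python
import os

# Opt the AWS SDKs/CLI into Regional STS endpoints instead of the
# default global endpoint, which is hosted in us-east-1.
os.environ["AWS_STS_REGIONAL_ENDPOINTS"] = "regional"
os.environ["AWS_DEFAULT_REGION"] = "us-west-2"

def regional_sts_endpoint(region: str) -> str:
    """The standard Regional STS endpoint pattern in the aws partition."""
    return f"https://sts.{region}.amazonaws.com"

# With the settings above, an SDK client created without an explicit
# endpoint resolves to the Regional endpoint for the configured Region.
print(regional_sts_endpoint(os.environ["AWS_DEFAULT_REGION"]))
```

The same setting can be made persistent with `sts_regional_endpoints = regional` in `~/.aws/config`, so every tool that uses the shared configuration picks it up.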

 **Security Assertion Markup Language (SAML) Sign-in:** SAML sign-in endpoints exist in all AWS Regions. To use this service, choose the appropriate Regional SAML endpoint, like [https://us-west-2.signin.aws.amazon.com/saml](https://us-west-2.signin.aws.amazon.com/saml). You must update your IAM role trust policies and your identity provider (IdP) configuration to use the Regional endpoints. Refer to the [AWS SAML documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_saml.html) for specific details. 

 If you are using an IdP that is itself hosted on AWS, there is a risk that it may also be impacted during an AWS failure event. This could leave you unable to update your IdP configuration, or unable to federate entirely. You should pre-provision “break-glass” users in case your IdP is impaired or unavailable. Refer to [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) for details on how to create break-glass users in a statically-stable way. 

**Recommendation**  
Update your IAM role trust policies to accept SAML logins from multiple Regions. During a failure, update your IdP configuration to use a different Regional SAML endpoint if your preferred endpoint is impaired. Create a break-glass user(s) in case your IdP is impaired or unavailable.
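A trust policy can accept SAML sign-in from multiple Regions by listing several `SAML:aud` values in its condition. The sketch below builds such a policy; it assumes the Regional endpoint URL pattern shown above, and the identity provider ARN is hypothetical:

```python
import json

def saml_trust_policy(provider_arn: str, regions: list[str]) -> dict:
    """IAM role trust policy that accepts SAML sign-in from the global
    endpoint and from the listed Regional endpoints."""
    audiences = ["https://signin.aws.amazon.com/saml"] + [
        f"https://{r}.signin.aws.amazon.com/saml" for r in regions
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": provider_arn},
            "Action": "sts:AssumeRoleWithSAML",
            # A list of values in StringEquals means "any of these".
            "Condition": {"StringEquals": {"SAML:aud": audiences}},
        }],
    }

# Hypothetical SAML provider ARN; print the policy for review.
policy = saml_trust_policy(
    "arn:aws:iam::111122223333:saml-provider/ExampleIdP",
    ["us-west-2", "us-east-2"],
)
print(json.dumps(policy, indent=2))
```

Pre-provisioning the extra audiences during normal operations means that switching your IdP to a different Regional endpoint during a failure only requires an IdP-side change, not an IAM control plane change.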

 **AWS IAM Identity Center:** IAM Identity Center is a cloud-based service that makes it easy to centrally manage single sign-on access to your AWS accounts and cloud applications. IAM Identity Center must be deployed in a single Region of your choosing. However, the default behavior for the service is to use the global SAML endpoint ([https://signin.aws.amazon.com/saml](https://signin.aws.amazon.com/saml)), which is hosted in `us-east-1`. If you have deployed IAM Identity Center into a different AWS Region, you should update the [RelayState](https://docs.aws.amazon.com/singlesignon/latest/userguide/howtopermrelaystate.html) URL of every permission set to target the same Regional console endpoint as your IAM Identity Center deployment. For example, if you deployed IAM Identity Center into `us-west-2`, you should update the RelayState of your permission sets to use [https://us-west-2.console.aws.amazon.com](https://us-west-2.console.aws.amazon.com). This removes any dependency on `us-east-1` from your IAM Identity Center deployment. 

 Additionally, because IAM Identity Center can only be deployed into a single Region, you should pre-provision “break-glass” users in case your deployment is impaired. Refer to [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) for details on how to create break-glass users in a statically-stable way. 

**Recommendation**  
Set the RelayState URL of your permission sets in IAM Identity Center to match the Region where you have the service deployed. Create break-glass users in case your IAM Identity Center deployment is unavailable.
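The mapping from deployment Region to RelayState target can be sketched as a one-line helper, assuming the standard `<region>.console.aws.amazon.com` Regional console endpoint pattern:

```python
def regional_relaystate_url(region: str) -> str:
    """Regional console endpoint to use as the RelayState target,
    following the standard <region>.console.aws.amazon.com pattern."""
    return f"https://{region}.console.aws.amazon.com"

# Match every permission set's RelayState to the Region where IAM
# Identity Center is deployed, removing the default us-east-1 dependency.
identity_center_region = "us-west-2"  # illustrative deployment Region
print(regional_relaystate_url(identity_center_region))
```

Apply the resulting URL to each permission set's RelayState setting so that sign-in never routes through the global endpoint.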

 **Amazon S3 Storage Lens:** Storage Lens provides a default dashboard called `default-account-dashboard`. The dashboard configuration and its associated metrics are stored in `us-east-1`. You can create additional dashboards in other Regions by specifying the [home Region](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage_lens_basics_metrics_recommendations.html#storage_lens_basics_home_region) for the dashboard configuration and metric data. 

**Recommendation**  
If you require data from the default S3 Storage Lens dashboard during a failure impacting the service in `us-east-1`, create an additional dashboard in an alternate home Region. You can also duplicate any other custom dashboards you have created in additional Regions.

## Global services summary
<a name="global-services-summary"></a>

 The data planes for global services apply similar isolation and independence principles as Regional AWS services. A failure impacting the data plane of IAM in a Region doesn’t affect the operation of the IAM data plane in another AWS Region. Similarly, a failure impacting the data plane of Route 53 in a PoP doesn’t affect the operation of the Route 53 data plane in the rest of the PoPs. Therefore, what we must consider are service availability events that affect the Region where the control plane operates or affect the control plane itself. Because there is only a single control plane for each global service, a failure affecting that control plane could have cross-Region effects on CRUDL-type operations (which are the configuration operations that are typically used to set up or configure a service as opposed to the direct use of the service). 

 The most effective way to architect workloads to use global services resiliently is through static stability: design your workload so that, during a failure scenario, it does not need to make control plane changes to mitigate the impact or fail over to a different location. Refer to [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) and [Appendix B - Edge network global service guidance](appendix-b---edge-network-global-service-guidance.md) for prescriptive guidance on how to use these types of global services in a way that removes control plane dependencies and eliminates single points of failure. If you require the data from a control plane operation for recovery, cache this data in a data store that can be accessed through its data plane, like an [AWS Systems Manager](https://aws.amazon.com/systems-manager/) Parameter Store (SSM Parameter Store) parameter, a DynamoDB table, or an S3 bucket. For redundancy, you may also choose to store that data in an additional Region. For example, following the [best practices](https://docs.aws.amazon.com/r53recovery/latest/dg/route53-arc-best-practices.html) for Route 53 Application Recovery Controller (ARC), you should hardcode or bookmark your five Regional cluster endpoints. During a failure event, you might not be able to access some API operations, including Route 53 ARC API operations that are not hosted on the extremely reliable data plane cluster. You can list the endpoints for your Route 53 ARC clusters by using the `DescribeCluster` API operation. 
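The hardcoded-endpoints practice can be sketched as follows. The endpoint values below are placeholders, not real cluster endpoints; in practice you would record the endpoints that `DescribeCluster` returns for your own ARC cluster. The retry helper is generic so that any data plane call can be attempted against each Regional endpoint in turn:

```python
# Placeholder endpoints; substitute the values DescribeCluster returns
# for your own ARC cluster (a real cluster has five Regional endpoints).
ARC_CLUSTER_ENDPOINTS = [
    ("https://host-aaaaaa.us-east-1.example.com", "us-east-1"),
    ("https://host-bbbbbb.ap-southeast-2.example.com", "ap-southeast-2"),
    ("https://host-cccccc.eu-west-1.example.com", "eu-west-1"),
]

def try_each_endpoint(endpoints, call):
    """Try each Regional cluster endpoint in turn. Any single endpoint may be
    impaired, so moving on to the next keeps the data plane reachable without
    any control plane dependency."""
    last_error = None
    for endpoint, region in endpoints:
        try:
            return call(endpoint, region)
        except Exception as err:  # a real client would catch narrower errors
            last_error = err
    raise RuntimeError("all cluster endpoints failed") from last_error

# Hypothetical usage with the ARC data plane (ARN is a placeholder):
# import boto3
# def toggle_routing_control(endpoint, region):
#     client = boto3.client(
#         "route53-recovery-cluster", region_name=region, endpoint_url=endpoint
#     )
#     return client.update_routing_control_state(
#         RoutingControlArn="arn:aws:route53-recovery-control::111122223333:"
#                           "controlpanel/EXAMPLE/routingcontrol/EXAMPLE",
#         RoutingControlState="Off",
#     )
#
# try_each_endpoint(ARC_CLUSTER_ENDPOINTS, toggle_routing_control)
```

Because the endpoint list is stored with the workload (hardcoded, bookmarked, or cached in a data plane accessible store), failover does not depend on any control plane call succeeding at recovery time.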

 The following is a summary of some of the most common misconfigurations or anti-patterns that introduce dependencies on global services’ control planes: 
+  Making changes to Route 53 records, like updating an A record’s value or changing a weighted record set’s weights, to perform failover. 
+  Creating or updating IAM resources, including IAM roles and policies, during a failover. This typically isn’t intentional, but might be a result of an untested failover plan. 
+  Relying on IAM Identity Center for operators to gain access to production environments during a failure event. 
+  Relying on the default IAM Identity Center configuration to use the console in `us-east-1` when you have deployed Identity Center into a different Region. 
+  Making changes to AWS Global Accelerator (AGA) traffic dial weights to manually perform a Regional failover. 
+  Updating a CloudFront distribution’s origin configuration to fail away from an impaired origin. 
+  Provisioning disaster recovery (DR) resources that depend on creating DNS records in Route 53, like ELBs and RDS instances, during a failure event. 

 The following is a summary of the recommendations provided in this section for using global services in a resilient way, which can help you prevent the previous common anti-patterns. 

****Recommendation summary****  
Do not rely on the control planes of partitional services in your recovery path. Instead, rely on the data plane operations of these services. See [Appendix A - Partitional service guidance](appendix-a---partitional-service-guidance.md) for additional details on how you should design for partitional services.   
 Do not rely on the control plane of edge network services in your recovery path. Instead, rely on the data plane operations of these services. See [Appendix B - Edge network global service guidance](appendix-b---edge-network-global-service-guidance.md) for additional details on how to design for global services in the edge network.   
 Do not rely on creating, updating, or deleting resources that require the creation, updating, or deletion of Route 53 resource records, hosted zones, or health checks in your recovery path. Pre-provision these resources, like ELBs, to prevent a dependency on the Route 53 control plane in your recovery path.   
 Do not rely on deleting or creating new S3 buckets or updating S3 bucket configurations as part of your recovery path. Pre-provision all required S3 buckets with the necessary configurations so that you do not need to make changes in order to recover from a failure. This approach applies to MRAPs as well.   
 Do not rely on creating new edge-optimized API Gateway endpoints as part of your recovery path. Pre-provision all required API Gateway endpoints.   
 Update your SDK and CLI configuration to use the Regional STS endpoints.   
 Update your IAM role trust policies to accept SAML logins from multiple Regions. During a failure, update your IdP configuration to use a different Regional SAML endpoint if your preferred endpoint is impaired. Create break-glass users in case your IdP is impaired or unavailable.   
 Set the relay state URL of your permission sets in IAM Identity Center to match the Region where you have the service deployed. Create break-glass users in case your Identity Center deployment is unavailable.   
 If you require data from the default S3 Storage Lens dashboard during a failure impacting the service in `us-east-1`, create an additional dashboard in an alternate home Region. You can also duplicate any other custom dashboards you have created in additional Regions. 