# Foundations
Foundations

 Foundational requirements are those whose scope extends beyond a single workload or project. Before architecting any system, foundational requirements that influence reliability should be in place. For example, you must have sufficient network bandwidth to your data center. 

 In an on-premises environment, these requirements can cause long lead times due to dependencies and therefore must be incorporated during initial planning. With AWS however, most of these foundational requirements are already incorporated or can be addressed as needed. The cloud is designed to be nearly limitless, so it’s the responsibility of AWS to satisfy the requirement for sufficient networking and compute capacity, leaving you free to change resource size and allocations on demand. 

 The following sections explain best practices that focus on these considerations for reliability.

**Topics**
+ [

# Manage service quotas and constraints
](manage-service-quotas-and-constraints.md)
+ [

# Plan your network topology
](plan-your-network-topology.md)

# Manage service quotas and constraints
Manage service quotas and constraints

 For cloud-based workload architectures, there are service quotas (which are also referred to as service limits). These quotas exist to prevent accidentally provisioning more resources than you need and to limit request rates on API operations so as to protect services from abuse. There are also resource constraints, for example, the rate that you can push bits down a fiber-optic cable, or the amount of storage on a physical disk. 

 If you are using AWS Marketplace applications, you must understand the limitations of those applications. If you are using third-party web services or software as a service, you must be aware of those limits also. 

**Topics**
+ [

# REL01-BP01 Aware of service quotas and constraints
](rel_manage_service_limits_aware_quotas_and_constraints.md)
+ [

# REL01-BP02 Manage service quotas across accounts and regions
](rel_manage_service_limits_limits_considered.md)
+ [

# REL01-BP03 Accommodate fixed service quotas and constraints through architecture
](rel_manage_service_limits_aware_fixed_limits.md)
+ [

# REL01-BP04 Monitor and manage quotas
](rel_manage_service_limits_monitor_manage_limits.md)
+ [

# REL01-BP05 Automate quota management
](rel_manage_service_limits_automated_monitor_limits.md)
+ [

# REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover
](rel_manage_service_limits_suff_buffer_limits.md)

# REL01-BP01 Aware of service quotas and constraints
REL01-BP01 Aware of service quotas and constraints

 Be aware of your default quotas and manage your quota increase requests for your workload architecture. Know which cloud resource constraints, such as disk or network, are potentially impactful. 

 **Desired outcome:** Customers can prevent service degradation or disruption in their AWS accounts by implementing proper guidelines for monitoring key metrics, infrastructure reviews, and automation remediation steps to verify that services quotas and constraints are not reached that could cause service degradation or disruption. 

 **Common anti-patterns:** 
+ Deploying a workload without understanding the hard or soft quotas and their limits for the services used. 
+ Deploying a replacement workload without analyzing and reconfiguring the necessary quotas or contacting Support in advance. 
+ Assuming that cloud services have no limits and the services can be used without consideration to rates, limits, counts, quantities.
+  Assuming that quotas will automatically be increased. 
+  Not knowing the process and timeline of quota requests. 
+  Assuming that the default cloud service quota is the identical for every service compared across regions. 
+  Assuming that service constraints can be breached and the systems will auto-scale or add increase the limit beyond the resource’s constraints 
+  Not testing the application at peak traffic in order to stress the utilization of its resources. 
+  Provisioning the resource without analysis of the required resource size. 
+  Overprovisioning capacity by choosing resource types that go well beyond actual need or expected peaks. 
+  Not assessing capacity requirements for new levels of traffic in advance of a new customer event or deploying a new technology. 

 **Benefits of establishing this best practice:** Monitoring and automated management of service quotas and resource constraints can proactively reduce failures. Changes in traffic patterns for a customer’s service can cause a disruption or degradation if best practices are not followed. By monitoring and managing these values across all regions and all accounts, applications can have improved resiliency under adverse or unplanned events. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
Implementation guidance

 Service Quotas is an AWS service that helps you manage your quotas for over 250 AWS services from one location. Along with looking up the quota values, you can also request and track quota increases from the Service Quotas console or using the AWS SDK. AWS Trusted Advisor offers a service quotas check that displays your usage and quotas for some aspects of some services. The default service quotas per service are also in the AWS documentation per respective service (for example, see [Amazon VPC Quotas](https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html)). 

 Some service limits, like rate limits on throttled APIs are set within the Amazon API Gateway itself by configuring a usage plan. Some limits that are set as configuration on their respective services include Provisioned IOPS, Amazon RDS storage allocated, and Amazon EBS volume allocations. Amazon Elastic Compute Cloud has its own service limits dashboard that can help you manage your instance, Amazon Elastic Block Store, and Elastic IP address limits. If you have a use case where service quotas impact your application’s performance and they are not adjustable to your needs, then contact Support to see if there are mitigations. 

 Service quotas can be Region specific or can also be global in nature. Using an AWS service that reaches its quota will not act as expected in normal usage and may cause service disruption or degradation. For example, a service quota limits the number of DL Amazon EC2 instances used in a Region. That limit may be reached during a traffic scaling event using Auto Scaling groups (ASG). 

 Service quotas for each account should be assessed for usage on a regular basis to determine what the appropriate service limits might be for that account. These service quotas exist as operational guardrails, to prevent accidentally provisioning more resources than you need. They also serve to limit request rates on API operations to protect services from abuse. 

 Service constraints are different from service quotas. Service constraints represent a particular resource’s limits as defined by that resource type. These might be storage capacity (for example, gp2 has a size limit of 1 GB - 16 TB) or disk throughput. It is essential that a resource type’s constraint be engineered and constantly assessed for usage that might reach its limit. If a constraint is reached unexpectedly, the account’s applications or services may be degraded or disrupted. 

 If there is a use case where service quotas impact an application’s performance and they cannot be adjusted to required needs, contact Support to see if there are mitigations. For more detail on adjusting fixed quotas, see [REL01-BP03 Accommodate fixed service quotas and constraints through architecture](rel_manage_service_limits_aware_fixed_limits.md). 

 There are a number of AWS services and tools to help monitor and manage Service Quotas. The service and tools should be leveraged to provide automated or manual checks of quota levels. 
+  AWS Trusted Advisor offers a service quota check that displays your usage and quotas for some aspects of some services. It can aid in identifying services that are near quota. 
+  AWS Management Console provides methods to display services quota values, manage, request new quotas, monitor status of quota requests, and display history of quotas. 
+  AWS CLI and CDKs offer programmatic methods to automatically manage and monitor service quota levels and usage. 

 **Implementation steps** 

 For Service Quotas: 
+ [ Review AWS Service Quotas. ](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html)
+  To be aware of your existing service quotas, determine the services (like IAM Access Analyzer) that are used. There are approximately 250 AWS services controlled by service quotas. Then, determine the specific service quota name that might be used within each account and Region. There are approximately 3000 service quota names per Region. 
+  Augment this quota analysis with AWS Config to find all [AWS resources](https://docs.aws.amazon.com/config/latest/developerguide/resource-config-reference.html) used in your AWS accounts. 
+  Use [AWS CloudFormation data](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-view-stack-data-resources.html) to determine your AWS resources used. Look at the resources that were created either in the AWS Management Console or with the [https://docs.aws.amazon.com/cli/latest/reference/cloudformation/list-stack-resources.html](https://docs.aws.amazon.com/cli/latest/reference/cloudformation/list-stack-resources.html) AWS CLI command. You can also see resources configured to be deployed in the template itself. 
+  Determine all the services your workload requires by looking at the deployment code. 
+  Determine the service quotas that apply. Use the programmatically accessible information from Trusted Advisor and Service Quotas. 
+  Establish an automated monitoring method (see [REL01-BP02 Manage service quotas across accounts and regions](rel_manage_service_limits_limits_considered.md) and [REL01-BP04 Monitor and manage quotas](rel_manage_service_limits_monitor_manage_limits.md)) to alert and inform if services quotas are near or have reached their limit. 
+  Establish an automated and programmatic method to check if a service quota has been changed in one region but not in other regions in the same account (see [REL01-BP02 Manage service quotas across accounts and regions](rel_manage_service_limits_limits_considered.md) and [REL01-BP04 Monitor and manage quotas](rel_manage_service_limits_monitor_manage_limits.md)). 
+  Automate scanning application logs and metrics to determine if there are any quota or service constraint errors. If these errors are present, send alerts to the monitoring system. 
+  Establish engineering procedures to calculate the required change in quota (see [REL01-BP05 Automate quota management](rel_manage_service_limits_automated_monitor_limits.md)) once it has been identified that larger quotas are required for specific services. 
+  Create a provisioning and approval workflow to request changes in service quota. This should include an exception workflow in case of request deny or partial approval. 
+  Create an engineering method to review service quotas prior to provisioning and using new AWS services before rolling out to production or loaded environments. (for example, load testing account). 

 For service constraints: 
+  Establish monitoring and metrics methods to alert for resources reading close to their resource constraints. Leverage CloudWatch as appropriate for metrics or log monitoring. 
+  Establish alert thresholds for each resource that has a constraint that is meaningful to the application or system. 
+  Create workflow and infrastructure management procedures to change the resource type if the constraint is near utilization. This workflow should include load testing as a best practice to verify that new type is the correct resource type with the new constraints. 
+  Migrate identified resource to the recommended new resource type, using existing procedures and processes. 

## Resources
Resources

 **Related best practices:** 
+  [REL01-BP02 Manage service quotas across accounts and regions](rel_manage_service_limits_limits_considered.md) 
+  [REL01-BP03 Accommodate fixed service quotas and constraints through architecture](rel_manage_service_limits_aware_fixed_limits.md) 
+  [REL01-BP04 Monitor and manage quotas](rel_manage_service_limits_monitor_manage_limits.md) 
+  [REL01-BP05 Automate quota management](rel_manage_service_limits_automated_monitor_limits.md) 
+  [REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover](rel_manage_service_limits_suff_buffer_limits.md) 
+  [REL03-BP01 Choose how to segment your workload](rel_service_architecture_monolith_soa_microservice.md) 
+  [REL10-BP01 Deploy the workload to multiple locations](rel_fault_isolation_multiaz_region_system.md) 
+  [REL11-BP01 Monitor all components of the workload to detect failures](rel_withstand_component_failures_monitoring_health.md) 
+  [REL11-BP03 Automate healing on all layers](rel_withstand_component_failures_auto_healing_system.md) 
+  [REL12-BP04 Test resiliency using chaos engineering](rel_testing_resiliency_failure_injection_resiliency.md) 

 **Related documents:** 
+ [AWS Well-Architected Framework’s Reliability Pillar: Availability ](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/availability.html)
+  [AWS Service Quotas (formerly referred to as service limits)](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html) 
+  [AWS Trusted Advisor Best Practice Checks (see the Service Limits section)](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/best-practice-checklist/) 
+  [AWS limit monitor on AWS answers](https://aws.amazon.com/answers/account-management/limit-monitor/) 
+  [Amazon EC2 Service Limits](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html) 
+  [What is Service Quotas?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ How to Request Quota Increase ](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html)
+ [ Service endpoints and quotas ](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html)
+  [Service Quotas User Guide](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ Quota Monitor for AWS](https://aws.amazon.com/solutions/implementations/quota-monitor/)
+ [AWS Fault Isolation Boundaries ](https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/abstract-and-introduction.html)
+ [ Availability with redundancy ](https://docs.aws.amazon.com/whitepapers/latest/availability-and-beyond-improving-resilience/availability-with-redundancy.html)
+ [AWS for Data ](https://aws.amazon.com/data/)
+ [ What is Continuous Integration? ](https://aws.amazon.com/devops/continuous-integration/)
+ [ What is Continuous Delivery? ](https://aws.amazon.com/devops/continuous-delivery/)
+ [ APN Partner: partners that can help with configuration management ](https://partners.amazonaws.com/search/partners?keyword=Configuration+Management&ref=wellarchitected)
+ [ Managing the account lifecycle in account-per-tenant SaaS environments on AWS](https://aws.amazon.com/blogs/mt/managing-the-account-lifecycle-in-account-per-tenant-saas-environments-on-aws/)
+ [ Managing and monitoring API throttling in your workloads ](https://aws.amazon.com/blogs/mt/managing-monitoring-api-throttling-in-workloads/)
+ [ View AWS Trusted Advisor recommendations at scale with AWS Organizations](https://aws.amazon.com/blogs/mt/organizational-view-for-trusted-advisor/)
+ [ Automating Service Limit Increases and Enterprise Support with AWS Control Tower](https://aws.amazon.com/blogs/mt/automating-service-limit-increases-enterprise-support-aws-control-tower/)

 **Related videos:** 
+  [AWS Live re:Inforce 2019 - Service Quotas](https://youtu.be/O9R5dWgtrVo) 
+ [ View and Manage Quotas for AWS Services Using Service Quotas ](https://www.youtube.com/watch?v=ZTwfIIf35Wc)
+ [AWS IAM Quotas Demo ](https://www.youtube.com/watch?v=srJ4jr6M9YQ)

 **Related tools:** 
+ [ Amazon CodeGuru Reviewer ](https://aws.amazon.com/codeguru/)
+ [AWS CodeDeploy](https://aws.amazon.com/codedeploy/)
+ [AWS CloudTrail](https://aws.amazon.com/cloudtrail/)
+ [ Amazon CloudWatch ](https://aws.amazon.com/cloudwatch/)
+ [ Amazon EventBridge ](https://aws.amazon.com/eventbridge/)
+ [ Amazon DevOps Guru ](https://aws.amazon.com/devops-guru/)
+ [AWS Config](https://aws.amazon.com/config/)
+ [AWS Trusted Advisor](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/)
+ [AWS CDK ](https://aws.amazon.com/cdk/)
+ [AWS Systems Manager](https://aws.amazon.com/systems-manager/)
+ [AWS Marketplace](https://aws.amazon.com/marketplace/search/results?searchTerms=CMDB)

# REL01-BP02 Manage service quotas across accounts and regions
REL01-BP02 Manage service quotas across accounts and regions

 If you are using multiple accounts or Regions, request the appropriate quotas in all environments in which your production workloads run. 

 **Desired outcome:** Services and applications should not be affected by service quota exhaustion for configurations that span accounts or Regions or that have resilience designs using zone, Region, or account failover. 

 **Common anti-patterns:** 
+ Allowing resource usage in one isolation Region to grow with no mechanism to maintain capacity in the other ones. 
+  Manually setting all quotas independently in isolation Regions. 
+  Not considering the effect of resiliency architectures (like active or passive) in future quota needs during a degradation in the non-primary Region. 
+  Not evaluating quotas regularly and making necessary changes in every Region and account the workload runs. 
+  Not leveraging [quota request templates](https://docs.aws.amazon.com/servicequotas/latest/userguide/organization-templates.html) to request increases across multiple Regions and accounts. 
+  Not updating service quotas due to incorrectly thinking that increasing quotas has cost implications like compute reservation requests. 

 **Benefits of establishing this best practice:** Verifying that you can handle your current load in secondary regions or accounts if regional services become unavailable. This can help reduce the number of errors or levels of degradations that occur during region loss. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
Implementation guidance

 Service quotas are tracked per account. Unless otherwise noted, each quota is AWS Region-specific. In addition to the production environments, also manage quotas in all applicable non-production environments so that testing and development are not hindered. Maintaining a high degree of resiliency requires that service quotas are assessed continually (whether automated or manual). 

 With more workloads spanning Regions due to the implementation of designs using *Active/Active*, *Active/Passive – Hot*, *Active/Passive-Cold*, and *Active/Passive-Pilot Light* approaches, it is essential to understand all Region and account quota levels. Past traffic patterns are not always a good indicator if the service quota is set correctly. 

 Equally important, the service quota name limit is not always the same for every Region. In one Region, the value could be five, and in another region the value could be ten. Management of these quotas must span all the same services, accounts, and Regions to provide consistent resilience under load. 

 Reconcile all the service quota differences across different Regions (Active Region or Passive Region) and create processes to continually reconcile these differences. The testing plans of passive Region failovers are rarely scaled to peak active capacity, meaning that game day or table top exercises can fail to find differences in service quotas between Regions and also then maintain the correct limits. 

 *Service quota drift*, the condition where service quota limits for a specific named quota is changed in one Region and not all Regions, is very important to track and assess. Changing the quota in Regions with traffic or potentially could carry traffic should be considered. 
+  Select relevant accounts and Regions based on your service requirements, latency, regulatory, and disaster recovery (DR) requirements. 
+  Identify service quotas across all relevant accounts, Regions, and Availability Zones. The limits are scoped to account and Region. These values should be compared for differences. 

 **Implementation steps** 
+  Review Service Quotas values that might have breached beyond the a risk level of usage. AWS Trusted Advisor provides alerts for 80% and 90% threshold breaches. 
+  Review values for service quotas in any Passive Regions (in an Active/Passive design). Verify that load will successfully run in secondary Regions in the event of a failure in the primary Region. 
+  Automate assessing if any service quota drift has occurred between Regions in the same account and act accordingly to change the limits. 
+  If the customer Organizational Units (OU) are structured in the supported manner, service quota templates should be updated to reflect changes in any quotas that should be applied to multiple Regions and accounts. 
  +  Create a template and associate Regions to the quota change. 
  +  Review all existing service quota templates for any changes required (Region, limits, and accounts). 

## Resources
Resources

 **Related best practices:** 
+  [REL01-BP01 Aware of service quotas and constraints](rel_manage_service_limits_aware_quotas_and_constraints.md) 
+  [REL01-BP03 Accommodate fixed service quotas and constraints through architecture](rel_manage_service_limits_aware_fixed_limits.md) 
+  [REL01-BP04 Monitor and manage quotas](rel_manage_service_limits_monitor_manage_limits.md) 
+  [REL01-BP05 Automate quota management](rel_manage_service_limits_automated_monitor_limits.md) 
+  [REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover](rel_manage_service_limits_suff_buffer_limits.md) 
+  [REL03-BP01 Choose how to segment your workload](rel_service_architecture_monolith_soa_microservice.md) 
+  [REL10-BP01 Deploy the workload to multiple locations](rel_fault_isolation_multiaz_region_system.md) 
+  [REL11-BP01 Monitor all components of the workload to detect failures](rel_withstand_component_failures_monitoring_health.md) 
+  [REL11-BP03 Automate healing on all layers](rel_withstand_component_failures_auto_healing_system.md) 
+  [REL12-BP04 Test resiliency using chaos engineering](rel_testing_resiliency_failure_injection_resiliency.md) 

 **Related documents:** 
+ [AWS Well-Architected Framework’s Reliability Pillar: Availability ](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/availability.html)
+  [AWS Service Quotas (formerly referred to as service limits)](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html) 
+  [AWS Trusted Advisor Best Practice Checks (see the Service Limits section)](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/best-practice-checklist/) 
+  [AWS limit monitor on AWS answers](https://aws.amazon.com/answers/account-management/limit-monitor/) 
+  [Amazon EC2 Service Limits](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html) 
+  [What is Service Quotas?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ How to Request Quota Increase ](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html)
+ [ Service endpoints and quotas ](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html)
+  [Service Quotas User Guide](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ Quota Monitor for AWS](https://aws.amazon.com/solutions/implementations/quota-monitor/)
+ [AWS Fault Isolation Boundaries ](https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/abstract-and-introduction.html)
+ [ Availability with redundancy ](https://docs.aws.amazon.com/whitepapers/latest/availability-and-beyond-improving-resilience/availability-with-redundancy.html)
+ [AWS for Data ](https://aws.amazon.com/data/)
+ [ What is Continuous Integration? ](https://aws.amazon.com/devops/continuous-integration/)
+ [ What is Continuous Delivery? ](https://aws.amazon.com/devops/continuous-delivery/)
+ [ APN Partner: partners that can help with configuration management ](https://partners.amazonaws.com/search/partners?keyword=Configuration+Management&ref=wellarchitected)
+ [ Managing the account lifecycle in account-per-tenant SaaS environments on AWS](https://aws.amazon.com/blogs/mt/managing-the-account-lifecycle-in-account-per-tenant-saas-environments-on-aws/)
+ [ Managing and monitoring API throttling in your workloads ](https://aws.amazon.com/blogs/mt/managing-monitoring-api-throttling-in-workloads/)
+ [ View AWS Trusted Advisor recommendations at scale with AWS Organizations](https://aws.amazon.com/blogs/mt/organizational-view-for-trusted-advisor/)
+ [ Automating Service Limit Increases and Enterprise Support with AWS Control Tower](https://aws.amazon.com/blogs/mt/automating-service-limit-increases-enterprise-support-aws-control-tower/)

 **Related videos:** 
+  [AWS Live re:Inforce 2019 - Service Quotas](https://youtu.be/O9R5dWgtrVo) 
+ [ View and Manage Quotas for AWS Services Using Service Quotas ](https://www.youtube.com/watch?v=ZTwfIIf35Wc)
+ [AWS IAM Quotas Demo ](https://www.youtube.com/watch?v=srJ4jr6M9YQ)

 **Related services:** 
+ [ Amazon CodeGuru Reviewer ](https://aws.amazon.com/codeguru/)
+ [AWS CodeDeploy](https://aws.amazon.com/codedeploy/)
+ [AWS CloudTrail](https://aws.amazon.com/cloudtrail/)
+ [ Amazon CloudWatch ](https://aws.amazon.com/cloudwatch/)
+ [ Amazon EventBridge ](https://aws.amazon.com/eventbridge/)
+ [ Amazon DevOps Guru ](https://aws.amazon.com/devops-guru/)
+ [AWS Config](https://aws.amazon.com/config/)
+ [AWS Trusted Advisor](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/)
+ [AWS CDK ](https://aws.amazon.com/cdk/)
+ [AWS Systems Manager](https://aws.amazon.com/systems-manager/)
+ [AWS Marketplace](https://aws.amazon.com/marketplace/search/results?searchTerms=CMDB)

# REL01-BP03 Accommodate fixed service quotas and constraints through architecture
REL01-BP03 Accommodate fixed service quotas and constraints through architecture

Be aware of unchangeable service quotas, service constraints, and physical resource limits. Design architectures for applications and services to prevent these limits from impacting reliability.

Examples include network bandwidth, serverless function invocation payload size, throttle burst rate for of an API gateway, and concurrent user connections to a database.

 **Desired outcome:** The application or service performs as expected under normal and high traffic conditions. They have been designed to work within the limitations for that resource’s fixed constraints or service quotas. 

 **Common anti-patterns:** 
+ Choosing a design that uses a resource of a service, unaware that there are design constraints that will cause this design to fail as you scale.
+ Performing benchmarking that is unrealistic and will reach service fixed quotas during the testing. For example, running tests at a burst limit but for an extended amount of time.
+  Choosing a design that cannot scale or be modified if fixed service quotas are to be exceeded. For example, an SQS payload size of 256KB. 
+  Observability has not been designed and implemented to monitor and alert on thresholds for service quotas that might be at risk during high traffic events 

 **Benefits of establishing this best practice:** Verifying that the application will run under all projected services load levels without disruption or degradation. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 Unlike soft service quotas or resources that be replaced with higher capacity units, AWS services’ fixed quotas cannot be changed. This means that all these type of AWS services must be evaluated for potential hard capacity limits when used in an application design. 

 Hard limits are shown in the Service Quotas console. If the columns shows `ADJUSTABLE = No`, the service has a hard limit. Hard limits are also shown in some resources configuration pages. For example, Lambda has specific hard limits that cannot be adjusted. 

 As an example, when designing a python application to run in a Lambda function, the application should be evaluated to determine if there is any chance of Lambda running longer than 15 minutes. If the code may run more than this service quota limit, alternate technologies or designs must be considered. If this limit is reached after production deployment, the application will suffer degradation and disruption until it can be remediated. Unlike soft quotas, there is no method to change to these limits even under emergency Severity 1 events. 

 Once the application has been deployed to a testing environment, strategies should be used to find if any hard limits can be reached. Stress testing, load testing, and chaos testing should be part of the introduction test plan. 

 **Implementation steps** 
+  Review the complete list of AWS services that could be used in the application design phase. 
+  Review the soft quota limits and hard quota limits for all these services. Not all limits are shown in the Service Quotas console. Some services [describe these limits in alternate locations](https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html). 
+  As you design your application, review your workload’s business and technology drivers, such as business outcomes, use case, dependent systems, availability targets, and disaster recovery objects. Let your business and technology drivers guide the process to identify the distributed system that is right for your workload. 
+  Analyze service load across Regions and accounts. Many hard limits are regionally based for services. However, some limits are account based. 
+  Analyze resilience architectures for resource usage during a zonal failure and Regional failure. In the progression of multi-Region designs using active/active, active/passive – hot, active/passive - cold, and active/passive - pilot light approaches, these failure cases will cause higher usage. This creates a potential use case for hitting hard limits. 

## Resources
Resources

 **Related best practices:** 
+  [REL01-BP01 Aware of service quotas and constraints](rel_manage_service_limits_aware_quotas_and_constraints.md) 
+  [REL01-BP02 Manage service quotas across accounts and regions](rel_manage_service_limits_limits_considered.md) 
+  [REL01-BP04 Monitor and manage quotas](rel_manage_service_limits_monitor_manage_limits.md) 
+  [REL01-BP05 Automate quota management](rel_manage_service_limits_automated_monitor_limits.md) 
+  [REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover](rel_manage_service_limits_suff_buffer_limits.md) 
+  [REL03-BP01 Choose how to segment your workload](rel_service_architecture_monolith_soa_microservice.md) 
+  [REL10-BP01 Deploy the workload to multiple locations](rel_fault_isolation_multiaz_region_system.md) 
+  [REL11-BP01 Monitor all components of the workload to detect failures](rel_withstand_component_failures_monitoring_health.md) 
+  [REL11-BP03 Automate healing on all layers](rel_withstand_component_failures_auto_healing_system.md) 
+  [REL12-BP04 Test resiliency using chaos engineering](rel_testing_resiliency_failure_injection_resiliency.md) 

 **Related documents:** 
+ [AWS Well-Architected Framework’s Reliability Pillar: Availability ](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/availability.html)
+  [AWS Service Quotas (formerly referred to as service limits)](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html) 
+  [AWS Trusted Advisor Best Practice Checks (see the Service Limits section)](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/best-practice-checklist/) 
+  [AWS limit monitor on AWS answers](https://aws.amazon.com/answers/account-management/limit-monitor/) 
+  [Amazon EC2 Service Limits](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html) 
+  [What is Service Quotas?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ How to Request Quota Increase ](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html)
+ [ Service endpoints and quotas ](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html)
+  [Service Quotas User Guide](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ Quota Monitor for AWS](https://aws.amazon.com/solutions/implementations/quota-monitor/)
+ [AWS Fault Isolation Boundaries ](https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/abstract-and-introduction.html)
+ [ Availability with redundancy ](https://docs.aws.amazon.com/whitepapers/latest/availability-and-beyond-improving-resilience/availability-with-redundancy.html)
+ [AWS for Data ](https://aws.amazon.com/data/)
+ [ What is Continuous Integration? ](https://aws.amazon.com/devops/continuous-integration/)
+ [ What is Continuous Delivery? ](https://aws.amazon.com/devops/continuous-delivery/)
+ [ APN Partner: partners that can help with configuration management ](https://partners.amazonaws.com/search/partners?keyword=Configuration+Management&ref=wellarchitected)
+ [ Managing the account lifecycle in account-per-tenant SaaS environments on AWS](https://aws.amazon.com/blogs/mt/managing-the-account-lifecycle-in-account-per-tenant-saas-environments-on-aws/)
+ [ Managing and monitoring API throttling in your workloads ](https://aws.amazon.com/blogs/mt/managing-monitoring-api-throttling-in-workloads/)
+ [ View AWS Trusted Advisor recommendations at scale with AWS Organizations](https://aws.amazon.com/blogs/mt/organizational-view-for-trusted-advisor/)
+ [ Automating Service Limit Increases and Enterprise Support with AWS Control Tower](https://aws.amazon.com/blogs/mt/automating-service-limit-increases-enterprise-support-aws-control-tower/)
+ [ Actions, resources, and condition keys for Service Quotas ](https://docs.aws.amazon.com/service-authorization/latest/reference/list_servicequotas.html)

 **Related videos:** 
+  [AWS Live re:Inforce 2019 - Service Quotas](https://youtu.be/O9R5dWgtrVo) 
+ [ View and Manage Quotas for AWS Services Using Service Quotas ](https://www.youtube.com/watch?v=ZTwfIIf35Wc)
+ [AWS IAM Quotas Demo ](https://www.youtube.com/watch?v=srJ4jr6M9YQ)
+ [AWS re:Invent 2018: Close Loops and Opening Minds: How to Take Control of Systems, Big and Small ](https://www.youtube.com/watch?v=O8xLxNje30M)

 **Related tools:** 
+ [AWS CodeDeploy](https://aws.amazon.com/codedeploy/)
+ [AWS CloudTrail](https://aws.amazon.com/cloudtrail/)
+ [ Amazon CloudWatch ](https://aws.amazon.com/cloudwatch/)
+ [ Amazon EventBridge ](https://aws.amazon.com/eventbridge/)
+ [ Amazon DevOps Guru ](https://aws.amazon.com/devops-guru/)
+ [AWS Config](https://aws.amazon.com/config/)
+ [AWS Trusted Advisor](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/)
+ [AWS CDK ](https://aws.amazon.com/cdk/)
+ [AWS Systems Manager](https://aws.amazon.com/systems-manager/)
+ [AWS Marketplace](https://aws.amazon.com/marketplace/search/results?searchTerms=CMDB)

# REL01-BP04 Monitor and manage quotas
REL01-BP04 Monitor and manage quotas

 Evaluate your potential usage and increase your quotas appropriately, allowing for planned growth in usage. 

 **Desired outcome:** Active and automated systems that manage and monitor have been deployed. These operations solutions ensure that quota usage thresholds are nearing being reached. These would be proactively remediated by requested quota changes. 

 **Common anti-patterns:** 
+ Not configuring monitoring to check for service quota thresholds
+ Not configuring monitoring for hard limits, even though those values cannot be changed.
+  Assuming that amount of time required to request and secure a soft quota change is immediate or a short period. 
+  Configuring alarms for when service quotas are being approached, but having no process on how to respond to an alert. 
+  Only configuring alarms for services supported by AWS Service Quotas and not monitoring other AWS services. 
+  Not considering quota management for multiple Region resiliency designs, like active/active, active/passive – hot, active/passive - cold, and active/passive - pilot light approaches. 
+  Not assessing quota differences between Regions. 
+  Not assessing the needs in every Region for a specific quota increase request. 
+  Not leveraging [templates for multi-Region quota management](https://docs.aws.amazon.com/servicequotas/latest/userguide/organization-templates.html). 

 **Benefits of establishing this best practice:** Automatic tracking of the AWS Service Quotas and monitoring your usage against those quotas will allow you to see when you are approaching a quota limit. You can also use this monitoring data to help limit any degradations due to quota exhaustion. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 For supported services, you can monitor your quotas by configuring various different services that can assess and then send alerts or alarms. This can aid in monitoring usage and can alert you to approaching quotas. These alarms can be invoked from AWS Config, Lambda functions, Amazon CloudWatch, or from AWS Trusted Advisor. You can also use metric filters on CloudWatch Logs to search and extract patterns in logs to determine if usage is approaching quota thresholds. 

 **Implementation steps** 

 For monitoring: 
+  Capture current resource consumption (for example, buckets or instances). Use service API operations, such as the Amazon EC2 `DescribeInstances` API, to collect current resource consumption. 
+  Capture your current quotas that are essential and applicable to the services using: 
  +  AWS Service Quotas 
  +  AWS Trusted Advisor 
  +  AWS documentation 
  +  AWS service-specific pages 
  +  AWS Command Line Interface (AWS CLI) 
  +  AWS Cloud Development Kit (AWS CDK) 
+  Use AWS Service Quotas, an AWS service that helps you manage your quotas for over 250 AWS services from one location. 
+  Use Trusted Advisor service limits to monitor your current service limits at various thresholds. 
+  Use the service quota history (console or AWS CLI) to check on regional increases. 
+  Compare service quota changes in each Region and each account to create equivalency, if required. 

 For management: 
+  Automated: Set up an AWS Config custom rule to scan service quotas across Regions and compare for differences. 
+  Automated: Set up a scheduled Lambda function to scan service quotas across Regions and compare for differences. 
+  Manual: Scan services quota through AWS CLI, API, or AWS Console to scan service quotas across Regions and compare for differences. Report the differences. 
+  If differences in quotas are identified between Regions, request a quota change, if required. 
+  Review the result of all requests. 

## Resources
Resources

 **Related best practices:** 
+  [REL01-BP01 Aware of service quotas and constraints](rel_manage_service_limits_aware_quotas_and_constraints.md) 
+  [REL01-BP02 Manage service quotas across accounts and regions](rel_manage_service_limits_limits_considered.md) 
+  [REL01-BP03 Accommodate fixed service quotas and constraints through architecture](rel_manage_service_limits_aware_fixed_limits.md) 
+  [REL01-BP05 Automate quota management](rel_manage_service_limits_automated_monitor_limits.md) 
+  [REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover](rel_manage_service_limits_suff_buffer_limits.md) 
+  [REL03-BP01 Choose how to segment your workload](rel_service_architecture_monolith_soa_microservice.md) 
+  [REL10-BP01 Deploy the workload to multiple locations](rel_fault_isolation_multiaz_region_system.md) 
+  [REL11-BP01 Monitor all components of the workload to detect failures](rel_withstand_component_failures_monitoring_health.md) 
+  [REL11-BP03 Automate healing on all layers](rel_withstand_component_failures_auto_healing_system.md) 
+  [REL12-BP04 Test resiliency using chaos engineering](rel_testing_resiliency_failure_injection_resiliency.md) 

 **Related documents:** 
+ [AWS Well-Architected Framework’s Reliability Pillar: Availability ](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/availability.html)
+  [AWS Service Quotas (formerly referred to as service limits)](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html) 
+  [AWS Trusted Advisor Best Practice Checks (see the Service Limits section)](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/best-practice-checklist/) 
+  [AWS limit monitor on AWS answers](https://aws.amazon.com/answers/account-management/limit-monitor/) 
+  [Amazon EC2 Service Limits](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html) 
+  [What is Service Quotas?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ How to Request Quota Increase ](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html)
+ [ Service endpoints and quotas ](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html)
+  [Service Quotas User Guide](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ Quota Monitor for AWS](https://aws.amazon.com/solutions/implementations/quota-monitor/)
+ [AWS Fault Isolation Boundaries ](https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/abstract-and-introduction.html)
+ [ Availability with redundancy ](https://docs.aws.amazon.com/whitepapers/latest/availability-and-beyond-improving-resilience/availability-with-redundancy.html)
+ [AWS for Data ](https://aws.amazon.com/data/)
+ [ What is Continuous Integration? ](https://aws.amazon.com/devops/continuous-integration/)
+ [ What is Continuous Delivery? ](https://aws.amazon.com/devops/continuous-delivery/)
+ [ APN Partner: partners that can help with configuration management ](https://partners.amazonaws.com/search/partners?keyword=Configuration+Management&ref=wellarchitected)
+ [ Managing the account lifecycle in account-per-tenant SaaS environments on AWS](https://aws.amazon.com/blogs/mt/managing-the-account-lifecycle-in-account-per-tenant-saas-environments-on-aws/)
+ [ Managing and monitoring API throttling in your workloads ](https://aws.amazon.com/blogs/mt/managing-monitoring-api-throttling-in-workloads/)
+ [ View AWS Trusted Advisor recommendations at scale with AWS Organizations](https://aws.amazon.com/blogs/mt/organizational-view-for-trusted-advisor/)
+ [ Automating Service Limit Increases and Enterprise Support with AWS Control Tower](https://aws.amazon.com/blogs/mt/automating-service-limit-increases-enterprise-support-aws-control-tower/)
+ [ Actions, resources, and condition keys for Service Quotas ](https://docs.aws.amazon.com/service-authorization/latest/reference/list_servicequotas.html)

 **Related videos:** 
+  [AWS Live re:Inforce 2019 - Service Quotas](https://youtu.be/O9R5dWgtrVo) 
+ [ View and Manage Quotas for AWS Services Using Service Quotas ](https://www.youtube.com/watch?v=ZTwfIIf35Wc)
+ [AWS IAM Quotas Demo ](https://www.youtube.com/watch?v=srJ4jr6M9YQ)
+ [AWS re:Invent 2018: Close Loops and Opening Minds: How to Take Control of Systems, Big and Small ](https://www.youtube.com/watch?v=O8xLxNje30M)

 **Related tools:** 
+ [AWS CodeDeploy](https://aws.amazon.com/codedeploy/)
+ [AWS CloudTrail](https://aws.amazon.com/cloudtrail/)
+ [ Amazon CloudWatch ](https://aws.amazon.com/cloudwatch/)
+ [ Amazon EventBridge ](https://aws.amazon.com/eventbridge/)
+ [ Amazon DevOps Guru ](https://aws.amazon.com/devops-guru/)
+ [AWS Config](https://aws.amazon.com/config/)
+ [AWS Trusted Advisor](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/)
+ [AWS CDK ](https://aws.amazon.com/cdk/)
+ [AWS Systems Manager](https://aws.amazon.com/systems-manager/)
+ [AWS Marketplace](https://aws.amazon.com/marketplace/search/results?searchTerms=CMDB)

# REL01-BP05 Automate quota management
REL01-BP05 Automate quota management

 Service quotas, also referred to as limits in AWS services, are the maximum values for the resources in your AWS account. Each AWS service defines a set of quotas and their default values. To provide your workload access to all the resources it needs, you might need to increase your service quota values. 

 Growth in workload consumption of AWS resources can threaten workload stability and impact the user experience if quotas are exceeded. Implement tools to alert you when your workload approaches the limits and consider creating quota increase requests automatically. 

 **Desired outcome:** Quotas are appropriately configured for the workloads running in each AWS account and Region. 

 **Common anti-patterns:** 
+  You fail to consider and adjust quotas appropriately to meet workload requirements. 
+  You track quotas and usage using methods that can become outdated, such as with spreadsheets. 
+  You only update service limits on periodic schedules. 
+  Your organization lacks operational processes to review existing quotas and request service quota increases when necessary. 

 **Benefits of establishing this best practice:** 
+  Enhanced workload resiliency: You prevent errors caused by exceeding AWS resource quotas. 
+  Simplified disaster recovery: You can reuse automated quota management mechanisms built in the primary Region during DR setup in another AWS Region. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 View current quotas and track ongoing quota consumption through mechanisms such as AWS Service Quotas console, AWS Command Line Interface (AWS CLI), and AWS SDKs. You can also integrate your configuration management databases (CMDB) and IT service management (ITSM) systems with the AWS Service Quota APIs. 

 Generate automated alerts if quota usage reaches your defined thresholds, and define a process for submitting quota increase requests when you receive alerts. If the underlying workload is critical to your business, you can automate quota increase requests, but carefully test the automation to avoid the risk of runaway action such as a growth feedback loop. 

 Smaller quota increases are often automatically approved. Larger quota requests may need to be manually processed by AWS support and can take additional time to review and process. Allow for additional time to process multiple requests or large increase requests. 

### Implementation steps
Implementation steps
+  Implement automated monitoring of service quotas, and issue alerts if your workload's resource utilization growth approaches your quota limits. For example, [Quota Monitor](https://docs.aws.amazon.com/solutions/latest/quota-monitor-for-aws/solution-overview.html) for AWS can provide automated monitoring of service quotas. This tool integrates with AWS Organizations and deploys using Cloudformation StackSets so that new accounts are automatically monitored on creation. 
+  Use features such as [Service Quotas request templates](https://docs.aws.amazon.com/servicequotas/latest/userguide/organization-templates.html) or [AWS Control Tower](https://www.youtube.com/watch?v=3WUShZ4lZGE) to simplify Service Quotas setup for new accounts. 
+  Build dashboards of your current service quota use across all AWS accounts and regions and reference them as necessary to prevent exceeding your quotas. [Trusted Advisor Organizational (TAO) Dashboard](https://aws.amazon.com/blogs/mt/a-detailed-overview-of-trusted-advisor-organizational-dashboard/), part of the [Cloud Intelligence Dashboards](https://catalog.workshops.aws/awscid/en-US), can get you quickly started with such a dashboard. 
+  Track service limit increase requests. [Consolidated Insights from Multiple Accounts(CIMA)](https://github.com/aws-samples/case-insights-for-multi-accounts) can provide an Organization-level view of all your requests. 
+  Test alert generation and any quota increase request automation by setting lower quota thresholds in non-production accounts. Do not conduct these tests in a production account. 

## Resources
Resources

 **Related best practices:** 
+  [OPS10-BP07 Automate responses to events](https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/ops_event_response_auto_event_response.html) 

 **Related documents:** 
+  [APN Partner: partners that can help with configuration management](https://aws.amazon.com/partners/find/results/?keyword=Configuration+Management) 
+  [AWS Marketplace: CMDB products that help track limits](https://aws.amazon.com/marketplace/search/results?searchTerms=CMDB) 
+  [AWS Service Quotas (formerly referred to as service limits)](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html) 
+  [AWS Trusted Advisor Best Practice Checks (see the Service Limits section)](https://docs.aws.amazon.com/awssupport/latest/user/trusted-advisor-check-reference.html) 
+  [Quota Monitor Solution on AWS - AWS Solution](https://aws.amazon.com/answers/account-management/limit-monitor/) 
+  [What is Service Quotas?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+  [What is Service Quotas request templates?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 

 **Related videos:** 
+  [AWS Live re:Inforce 2019 - Service Quotas](https://youtu.be/O9R5dWgtrVo) 
+  [Automating Service Limit Increases and Enterprise Support with AWS Control Tower](https://www.youtube.com/watch?v=3WUShZ4lZGE) 

 **Related tools:** 
+  [Quota Monitor for AWS](https://github.com/aws-solutions/quota-monitor-for-aws) 

# REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover
REL01-BP06 Ensure that a sufficient gap exists between the current quotas and the maximum usage to accommodate failover

This article explains how to maintain space between the resource quota and your usage, and how it can benefit your organization. After you finish using a resource, the usage quota may continue to account for that resource. This can result in a failing or inaccessible resource. Prevent resource failure by verifying that your quotas cover the overlap of inaccessible resources and their replacements. Consider cases like network failure, Availability Zone failure, or Region failures when calculating this gap.

 **Desired outcome:** Small or large failures in resources or resource accessibility can be covered within the current service thresholds. Zone failures, network failures, or even Regional failures have been considered in the resource planning. 

 **Common anti-patterns:** 
+  Setting service quotas based on current needs without accounting for failover scenarios. 
+  Not considering the principals of static stability when calculating the peak quota for a service. 
+  Not considering the potential of inaccessible resources in calculating total quota needed for each Region. 
+  Not considering AWS service fault isolation boundaries for some services and their potential abnormal usage patterns. 

 **Benefits of establishing this best practice:** When service disruption events impact application availability, use the cloud to implement strategies to recover from these events. An example strategy is creating additional resources to replace inaccessible resources to accommodate failover conditions without exhausting your service limit. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 When evaluating a quota limit, consider failover cases that might occur due to some degradation. Consider the following failover cases. 
+  A disrupted or inaccessible VPC. 
+  An inaccessible subnet. 
+  A degraded Availability Zone that impacts resource accessibility. 
+  Networking routes or ingress and egress points are blocked or changed. 
+  A degraded Region that impacts resource accessibility. 
+  A subset of resources affected by a failure in a Region or an Availability Zone. 

 The decision to failover is unique for each situation, as the business impact can vary. Address resource capacity planning in the failover location and the resources’ quotas before deciding to failover an application or service. 

 Consider higher than normal peaks of activity when reviewing quotas for each service. These peaks might be related to resources that are inaccessible due to networking or permissions, but are still active. Unterminated active resources count against the service quota limit. 

 **Implementation steps** 
+  Maintain space between your service quota and your maximum usage to accommodate for a failover or loss of accessibility. 
+  Determine your service quotas. Account for typical deployment patterns, availability requirements, and consumption growth. 
+  Request quota increases if necessary. Anticipate a wait time for the quota increase request. 
+  Determine your reliability requirements (also known as your number of nines). 
+  Understand potential fault scenarios such as loss of a component, an Availability Zone, or a Region. 
+  Establish your deployment methodology (examples include canary, blue/green, red/black, and rolling). 
+  Include an appropriate buffer to the current quota limit. An example buffer could be 15%. 
+  Include calculations for static stability (Zonal and Regional) where appropriate. 
+  Plan consumption growth and monitor your consumption trends. 
+  Consider the static stability impact for your most critical workloads. Assess resources conforming to a statically stable system in all Regions and Availability Zones. 
+  Consider using On-Demand Capacity Reservations to schedule capacity ahead of any failover. This is a useful strategy to implement for critical business schedules to reduce potential risks of obtaining the correct quantity and type of resources during failover. 

## Resources
Resources

 **Related best practices:** 
+  [REL01-BP01 Aware of service quotas and constraints](rel_manage_service_limits_aware_quotas_and_constraints.md) 
+  [REL01-BP02 Manage service quotas across accounts and regions](rel_manage_service_limits_limits_considered.md) 
+  [REL01-BP03 Accommodate fixed service quotas and constraints through architecture](rel_manage_service_limits_aware_fixed_limits.md) 
+  [REL01-BP04 Monitor and manage quotas](rel_manage_service_limits_monitor_manage_limits.md) 
+  [REL01-BP05 Automate quota management](rel_manage_service_limits_automated_monitor_limits.md) 
+  [REL03-BP01 Choose how to segment your workload](rel_service_architecture_monolith_soa_microservice.md) 
+  [REL10-BP01 Deploy the workload to multiple locations](rel_fault_isolation_multiaz_region_system.md) 
+  [REL11-BP01 Monitor all components of the workload to detect failures](rel_withstand_component_failures_monitoring_health.md) 
+  [REL11-BP03 Automate healing on all layers](rel_withstand_component_failures_auto_healing_system.md) 
+  [REL12-BP04 Test resiliency using chaos engineering](rel_testing_resiliency_failure_injection_resiliency.md) 

 **Related documents:** 
+ [AWS Well-Architected Framework’s Reliability Pillar: Availability ](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/availability.html)
+  [AWS Service Quotas (formerly referred to as service limits)](https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html) 
+  [AWS Trusted Advisor Best Practice Checks (see the Service Limits section)](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/best-practice-checklist/) 
+  [AWS limit monitor on AWS answers](https://aws.amazon.com/answers/account-management/limit-monitor/) 
+  [Amazon EC2 Service Limits](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html) 
+  [What is Service Quotas?](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ How to Request Quota Increase ](https://docs.aws.amazon.com/servicequotas/latest/userguide/request-quota-increase.html)
+ [ Service endpoints and quotas ](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html)
+  [Service Quotas User Guide](https://docs.aws.amazon.com/servicequotas/latest/userguide/intro.html) 
+ [ Quota Monitor for AWS](https://aws.amazon.com/solutions/implementations/quota-monitor/)
+ [AWS Fault Isolation Boundaries ](https://docs.aws.amazon.com/whitepapers/latest/aws-fault-isolation-boundaries/abstract-and-introduction.html)
+ [ Availability with redundancy ](https://docs.aws.amazon.com/whitepapers/latest/availability-and-beyond-improving-resilience/availability-with-redundancy.html)
+ [AWS for Data ](https://aws.amazon.com/data/)
+ [ What is Continuous Integration? ](https://aws.amazon.com/devops/continuous-integration/)
+ [ What is Continuous Delivery? ](https://aws.amazon.com/devops/continuous-delivery/)
+ [ APN Partner: partners that can help with configuration management ](https://partners.amazonaws.com/search/partners?keyword=Configuration+Management&ref=wellarchitected)
+ [ Managing the account lifecycle in account-per-tenant SaaS environments on AWS](https://aws.amazon.com/blogs/mt/managing-the-account-lifecycle-in-account-per-tenant-saas-environments-on-aws/)
+ [ Managing and monitoring API throttling in your workloads ](https://aws.amazon.com/blogs/mt/managing-monitoring-api-throttling-in-workloads/)
+ [ View AWS Trusted Advisor recommendations at scale with AWS Organizations](https://aws.amazon.com/blogs/mt/organizational-view-for-trusted-advisor/)
+ [ Automating Service Limit Increases and Enterprise Support with AWS Control Tower](https://aws.amazon.com/blogs/mt/automating-service-limit-increases-enterprise-support-aws-control-tower/)
+ [ Actions, resources, and condition keys for Service Quotas ](https://docs.aws.amazon.com/service-authorization/latest/reference/list_servicequotas.html)

 **Related videos:** 
+  [AWS Live re:Inforce 2019 - Service Quotas](https://youtu.be/O9R5dWgtrVo) 
+ [ View and Manage Quotas for AWS Services Using Service Quotas ](https://www.youtube.com/watch?v=ZTwfIIf35Wc)
+ [AWS IAM Quotas Demo ](https://www.youtube.com/watch?v=srJ4jr6M9YQ)
+ [AWS re:Invent 2018: Close Loops and Opening Minds: How to Take Control of Systems, Big and Small ](https://www.youtube.com/watch?v=O8xLxNje30M)

 **Related tools:** 
+ [AWS CodeDeploy](https://aws.amazon.com/codedeploy/)
+ [AWS CloudTrail](https://aws.amazon.com/cloudtrail/)
+ [ Amazon CloudWatch ](https://aws.amazon.com/cloudwatch/)
+ [ Amazon EventBridge ](https://aws.amazon.com/eventbridge/)
+ [ Amazon DevOps Guru ](https://aws.amazon.com/devops-guru/)
+ [AWS Config](https://aws.amazon.com/config/)
+ [AWS Trusted Advisor](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/)
+ [AWS CDK ](https://aws.amazon.com/cdk/)
+ [AWS Systems Manager](https://aws.amazon.com/systems-manager/)
+ [AWS Marketplace](https://aws.amazon.com/marketplace/search/results?searchTerms=CMDB)

# Plan your network topology
Plan your network topology

 Workloads often exist in multiple environments. These include multiple cloud environments (both publicly accessible and private) and possibly your existing data center infrastructure. Plans must include network considerations, such as intrasystem and intersystem connectivity, public IP address management, private IP address management, and domain name resolution. 

 When architecting systems using IP address-based networks, you must plan network topology and addressing in anticipation of possible failures, and to accommodate future growth and integration with other systems and their networks. 

 Amazon Virtual Private Cloud (Amazon VPC) lets you provision a private, isolated section of the AWS Cloud where you can launch AWS resources in a virtual network. 

**Topics**
+ [

# REL02-BP01 Use highly available network connectivity for your workload public endpoints
](rel_planning_network_topology_ha_conn_users.md)
+ [

# REL02-BP02 Provision redundant connectivity between private networks in the cloud and on-premises environments
](rel_planning_network_topology_ha_conn_private_networks.md)
+ [

# REL02-BP03 Ensure IP subnet allocation accounts for expansion and availability
](rel_planning_network_topology_ip_subnet_allocation.md)
+ [

# REL02-BP04 Prefer hub-and-spoke topologies over many-to-many mesh
](rel_planning_network_topology_prefer_hub_and_spoke.md)
+ [

# REL02-BP05 Enforce non-overlapping private IP address ranges in all private address spaces where they are connected
](rel_planning_network_topology_non_overlap_ip.md)

# REL02-BP01 Use highly available network connectivity for your workload public endpoints
REL02-BP01 Use highly available network connectivity for your workload public endpoints

 Building highly available network connectivity to public endpoints of your workloads can help you reduce downtime due to loss of connectivity and improve the availability and SLA of your workload. To achieve this, use highly available DNS, content delivery networks (CDNs), API gateways, load balancing, or reverse proxies. 

 **Desired outcome:** It is critical to plan, build, and operationalize highly available network connectivity for your public endpoints. If your workload becomes unreachable due to a loss in connectivity, even if your workload is running and available, your customers will see your system as down. By combining the highly available and resilient network connectivity for your workload’s public endpoints, along with a resilient architecture for your workload itself, you can provide the best possible availability and service level for your customers. 

 AWS Global Accelerator, Amazon CloudFront, Amazon API Gateway, AWS Lambda Function URLs, AWS AppSync APIs, and Elastic Load Balancing (ELB) all provide highly available public endpoints. Amazon Route 53 provides a highly available DNS service for domain name resolution to verify that your public endpoint addresses can be resolved. 

 You can also evaluate AWS Marketplace software appliances for load balancing and proxying. 

 **Common anti-patterns:** 
+ Designing a highly available workload without planning out DNS and network connectivity for high availability.
+  Using public internet addresses on individual instances or containers and managing the connectivity to them with DNS.
+  Using IP addresses instead of domain names for locating services.
+  Not testing out scenarios where connectivity to your public endpoints is lost. 
+  Not analyzing network throughput needs and distribution patterns. 
+  Not testing and planning for scenarios where internet network connectivity to your public endpoints of your workload might be interrupted. 
+  Providing content (like web pages, static assets, or media files) to a large geographic area and not using a content delivery network. 
+  Not planning for distributed denial of service (DDoS) attacks. DDoS attacks risk shutting out legitimate traffic and lowering availability for your users. 

 **Benefits of establishing this best practice:** Designing for highly available and resilient network connectivity ensures that your workload is accessible and available to your users. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
Implementation guidance

 At the core of building highly available network connectivity to your public endpoints is the routing of the traffic. To verify your traffic is able to reach the endpoints, the DNS must be able to resolve the domain names to their corresponding IP addresses. Use a highly available and scalable [Domain Name System (DNS)](https://aws.amazon.com/route53/what-is-dns/) such as Amazon Route 53 to manage your domain’s DNS records. You can also use health checks provided by Amazon Route 53. The health checks verify that your application is reachable, available, and functional, and they can be set up in a way that they mimic your user’s behavior, such as requesting a web page or a specific URL. In case of failure, Amazon Route 53 responds to DNS resolution requests and directs the traffic to only healthy endpoints. You can also consider using Geo DNS and Latency Based Routing capabilities offered by Amazon Route 53. 

 To verify that your workload itself is highly available, use Elastic Load Balancing (ELB). Amazon Route 53 can be used to target traffic to ELB, which distributes the traffic to the target compute instances. You can also use Amazon API Gateway along with AWS Lambda for a serverless solution. Customers can also run workloads in multiple AWS Regions. With [multi-site active/active pattern](https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-i-strategies-for-recovery-in-the-cloud/), the workload can serve traffic from multiple Regions. With a multi-site active/passive pattern, the workload serves traffic from the active region while data is replicated to the secondary region and becomes active in the event of a failure in the primary region. Route 53 health checks can then be used to control DNS failover from any endpoint in a primary Region to an endpoint in a secondary Region, verifying that your workload is reachable and available to your users. 

 Amazon CloudFront provides a simple API for distributing content with low latency and high data transfer rates by serving requests using a network of edge locations around the world. Content delivery networks (CDNs) serve customers by serving content located or cached at a location near to the user. This also improves availability of your application as the load for content is shifted away from your servers over to CloudFront’s [edge locations](https://aws.amazon.com/products/networking/edge-networking/). The edge locations and regional edge caches hold cached copies of your content close to your viewers resulting in quick retrieval and increasing reachability and availability of your workload. 

 For workloads with users spread out geographically, AWS Global Accelerator helps you improve the availability and performance of the applications. AWS Global Accelerator provides Anycast static IP addresses that serve as a fixed entry point to your application hosted in one or more AWS Regions. This allows traffic to ingress onto the AWS global network as close to your users as possible, improving reachability and availability of your workload. AWS Global Accelerator also monitors the health of your application endpoints by using TCP, HTTP, and HTTPS health checks. Any changes in the health or configuration of your endpoints permit redirection of user traffic to healthy endpoints that deliver the best performance and availability to your users. In addition, AWS Global Accelerator has a fault-isolating design that uses two static IPv4 addresses that are serviced by independent network zones increasing the availability of your applications. 

 To help protect customers from DDoS attacks, AWS provides AWS Shield Standard. Shield Standard comes automatically turned on and protects from common infrastructure (layer 3 and 4) attacks like SYN/UDP floods and reflection attacks to support high availability of your applications on AWS. For additional protections against more sophisticated and larger attacks (like UDP floods), state exhaustion attacks (like TCP SYN floods), and to help protect your applications running on Amazon Elastic Compute Cloud (Amazon EC2), Elastic Load Balancing (ELB), Amazon CloudFront, AWS Global Accelerator, and Route 53, you can consider using AWS Shield Advanced. For protection against Application layer attacks like HTTP POST or GET floods, use AWS WAF. AWS WAF can use IP addresses, HTTP headers, HTTP body, URI strings, SQL injection, and cross-site scripting conditions to determine if a request should be blocked or allowed. 

 **Implementation steps** 

1.  Set up highly available DNS: Amazon Route 53 is a highly available and scalable [domain name system (DNS)](https://aws.amazon.com/route53/what-is-dns/) web service. Route 53 connects user requests to internet applications running on AWS or on-premises. For more information, see [configuring Amazon Route 53 as your DNS service](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-configuring.html). 

1.  Setup health checks: When using Route 53, verify that only healthy targets are resolvable. Start by [creating Route 53 health checks and configuring DNS failover](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover.html). The following aspects are important to consider when setting up health checks: 

   1. [ How Amazon Route 53 determines whether a health check is healthy ](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover-determining-health-of-endpoints.html)

   1. [ Creating, updating, and deleting health checks ](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/health-checks-creating-deleting.html)

   1. [ Monitoring health check status and getting notifications ](https://docs.aws.amazon.com/)

   1. [ Best practices for Amazon Route 53 DNS ](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/health-checks-monitor-view-status.html)

1. [ Connect your DNS service to your endpoints. ](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/best-practices-dns.html)

   1.  When using Elastic Load Balancing as a target for your traffic, create an [alias record](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-choosing-alias-non-alias.html) using Amazon Route 53 that points to your load balancer’s regional endpoint. During the creation of the alias record, set the Evaluate target health option to Yes. 

   1.  For serverless workloads or private APIs when API Gateway is used, use [Route 53 to direct traffic to API Gateway](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-to-api-gateway.html). 

1.  Decide on a content delivery network. 

   1.  For delivering content using edge locations closer to the user, start by understanding [how CloudFront delivers content](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/HowCloudFrontWorks.html). 

   1.  Get started with a [simple CloudFront distribution](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/GettingStarted.SimpleDistribution.html). CloudFront then knows where you want the content to be delivered from, and the details about how to track and manage content delivery. The following aspects are important to understand and consider when setting up CloudFront distribution: 

      1. [ How caching works with CloudFront edge locations ](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/cache-hit-ratio-explained.html)

      1. [ Increasing the proportion of requests that are served directly from the CloudFront caches (cache hit ratio) ](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/cache-hit-ratio.html)

      1. [ Using Amazon CloudFront Origin Shield ](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/origin-shield.html)

      1. [ Optimizing high availability with CloudFront origin failover ](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/high_availability_origin_failover.html)

1.  Set up application layer protection: AWS WAF helps you protect against common web exploits and bots that can affect availability, compromise security, or consume excessive resources. To get a deeper understanding, review [how AWS WAF works](https://docs.aws.amazon.com/waf/latest/developerguide/how-aws-waf-works.html) and when you are ready to implement protections from application layer HTTP POST AND GET floods, review [Getting started with AWS WAF](https://docs.aws.amazon.com/waf/latest/developerguide/getting-started.html). You can also use AWS WAF with CloudFront see the documentation on [how AWS WAF works with Amazon CloudFront features](https://docs.aws.amazon.com/waf/latest/developerguide/cloudfront-features.html). 

1.  Set up additional DDoS protection: By default, all AWS customers receive protection from common, most frequently occurring network and transport layer DDoS attacks that target your web site or application with AWS Shield Standard at no additional charge. For additional protection of internet-facing applications running on Amazon EC2, Elastic Load Balancing, Amazon CloudFront, AWS Global Accelerator, and Amazon Route 53 you can consider [AWS Shield Advanced](https://docs.aws.amazon.com/waf/latest/developerguide/ddos-advanced-summary.html) and review [examples of DDoS resilient architectures](https://docs.aws.amazon.com/waf/latest/developerguide/ddos-resiliency.html). To protect your workload and your public endpoints from DDoS attacks review [Getting started with AWS Shield Advanced](https://docs.aws.amazon.com/waf/latest/developerguide/getting-started-ddos.html). 

## Resources
Resources

 **Related best practices:** 
+  [REL10-BP01 Deploy the workload to multiple locations](rel_fault_isolation_multiaz_region_system.md) 
+  [REL11-BP04 Rely on the data plane and not the control plane during recovery](rel_withstand_component_failures_avoid_control_plane.md) 
+  [REL11-BP06 Send notifications when events impact availability](rel_withstand_component_failures_notifications_sent_system.md) 

 **Related documents:** 
+  [APN Partner: partners that can help plan your networking](https://aws.amazon.com/partners/find/results/?keyword=network) 
+  [AWS Marketplace for Network Infrastructure](https://aws.amazon.com/marketplace/b/2649366011) 
+  [What Is AWS Global Accelerator?](https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html) 
+  [What is Amazon CloudFront?](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Introduction.html) 
+  [What is Amazon Route 53?](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/Welcome.html) 
+  [What is Elastic Load Balancing?](https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/what-is-load-balancing.html) 
+ [ Network Connectivity capability - Establishing Your Cloud Foundations ](https://docs.aws.amazon.com/whitepapers/latest/establishing-your-cloud-foundation-on-aws/network-connectivity-capability.html)
+ [ What is Amazon API Gateway? ](https://docs.aws.amazon.com/apigateway/latest/developerguide/welcome.html)
+ [ What are AWS WAF, AWS Shield, and AWS Firewall Manager? ](https://docs.aws.amazon.com/waf/latest/developerguide/what-is-aws-waf.html)
+ [ What is Amazon Application Recovery Controller? ](https://docs.aws.amazon.com/r53recovery/latest/dg/what-is-route53-recovery.html)
+ [ Configure custom health checks for DNS failover ](https://docs.aws.amazon.com/apigateway/latest/developerguide/dns-failover.html)

 **Related videos:** 
+ [AWS re:Invent 2022 - Improve performance and availability with AWS Global Accelerator](https://www.youtube.com/watch?v=s5sjsdDC0Lg)
+ [AWS re:Invent 2020: Global traffic management with Amazon Route 53 ](https://www.youtube.com/watch?v=E33dA6n9O7I)
+ [AWS re:Invent 2022 - Operating highly available Multi-AZ applications ](https://www.youtube.com/watch?v=mwUV5skJJ0s)
+ [AWS re:Invent 2022 - Dive deep on AWS networking infrastructure ](https://www.youtube.com/watch?v=HJNR_dX8g8c)
+ [AWS re:Invent 2022 - Building resilient networks ](https://www.youtube.com/watch?v=u-qamiNgH7Q)

 **Related examples:** 
+ [ Disaster Recovery with Amazon Application Recovery Controller (ARC) ](https://catalog.us-east-1.prod.workshops.aws/workshops/4d9ab448-5083-4db7-bee8-85b58cd53158/en-US/)
+ [AWS Global Accelerator Workshop ](https://catalog.us-east-1.prod.workshops.aws/workshops/effb1517-b193-4c59-8da5-ce2abdb0b656/en-US)

# REL02-BP02 Provision redundant connectivity between private networks in the cloud and on-premises environments
REL02-BP02 Provision redundant connectivity between private networks in the cloud and on-premises environments

 Implement redundancy in your connections between private networks in the cloud and on-premises environments to achieve connectivity resilience. This can be accomplished by deploying two or more links and traffic paths, preserving connectivity in the event of network failures. 

 **Common anti-patterns:** 
+  You depend on just one network connection, which creates a single point of failure. 
+  You use only one VPN tunnel or multiple tunnels that end in the same Availability Zone. 
+  You rely on one ISP for VPN connectivity, which can lead to complete failures during ISP outages. 
+  Not implementing dynamic routing protocols like BGP, which are crucial for rerouting traffic during network disruptions. 
+  You ignore the bandwidth limitations of VPN tunnels and overestimate their backup capabilities. 

 **Benefits of establishing this best practice:** By implementing redundant connectivity between your cloud environment and your corporate or on-premises environment, the dependent services between the two environments can communicate reliably. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
Implementation guidance

 When using AWS Direct Connect to connect your on-premises network to AWS, you can achieve maximum network resiliency (SLA of 99.99%) by using separate connections that end on distinct devices in more than one on-premises location and more than one AWS Direct Connect location. This topology offers resilience against device failures, connectivity issues, and complete location outages. Alternatively, you can achieve high resiliency (SLA of 99.9%) by using two individual connections to multiple locations (each on-premises location connected to a single Direct Connect location). This approach protects against connectivity disruptions caused by fiber cuts or device failures and helps mitigate complete location failures. The Direct Connect Resiliency Toolkit can assist in designing your AWS Direct Connect topology. 

 You can also consider AWS Site-to-Site VPN ending on an AWS Transit Gateway as a cost-effective backup to your primary AWS Direct Connect connection. This setup enables equal-cost multipath (ECMP) routing across multiple VPN tunnels, allowing for throughput of up to 50Gbps, even though each VPN tunnel is capped at 1.25 Gbps. It's important to note, however, that AWS Direct Connect is still the most effective choice for minimizing network disruptions and providing stable connectivity. 

 When using VPNs over the internet to connect your cloud environment to your on-premises data center, configure two VPN tunnels as part of a single site-to-site VPN connection. Each tunnel should end in a different Availability Zone for high availability and use redundant hardware to prevent on-premises device failure. Additionally, consider multiple internet connections from various internet service providers (ISPs) at your on-premises location to avoid complete VPN connectivity disruption due to a single ISP outage. Selecting ISPs with diverse routing and infrastructure, especially those with separate physical paths to AWS endpoints, provides high connectivity availability. 

 In addition to physical redundancy with multiple AWS Direct Connect connections and multiple VPN tunnels (or a combination of both), implementing Border Gateway Protocol (BGP) dynamic routing is also crucial. Dynamic BGP provides automatic rerouting of traffic from one path to another based on real-time network conditions and configured policies. This dynamic behavior is especially beneficial in maintaining network availability and service continuity in the event of link or network failures. It quickly selects alternative paths, enhancing the network's resilience and reliability. 

### Implementation steps
Implementation steps
+  Acquisition highly-available connectivity between AWS and your on-premises environment. 
  +  Use multiple AWS Direct Connect connections or VPN tunnels between separately deployed private networks. 
  +  Use multiple Direct Connect locations for high availability. 
  +  If using multiple AWS Regions, create redundancy in at least two of them. 
+  Use AWS Transit Gateway, when possible, to end your [VPN connection](https://docs.aws.amazon.com/vpc/latest/tgw/tgw-vpn-attachments.html). 
+  Evaluate AWS Marketplace appliances to end VPNs or [extend your SD-WAN to AWS](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/aws-transit-gateway-sd-wan.html). If you use AWS Marketplace appliances, deploy redundant instances for high availability in different Availability Zones. 
+  Provide a redundant connection to your on-premises environment. 
  +  You may need redundant connections to multiple AWS Regions to achieve your availability needs. 
  +  Use the [Direct Connect Resiliency Toolkit](https://docs.aws.amazon.com/directconnect/latest/UserGuide/resilency_toolkit.html) to get started. 

## Resources
Resources

 **Related documents:** 
+  [AWS Direct Connect Resiliency Recommendations](https://aws.amazon.com/directconnect/resiliency-recommendation/) 
+  [Using Redundant Site-to-Site VPN Connections to Provide Failover](https://docs.aws.amazon.com/vpn/latest/s2svpn/VPNConnections.html) 
+  [Routing policies and BGP communities](https://docs.aws.amazon.com/directconnect/latest/UserGuide/routing-and-bgp.html) 
+  [Active/Active and Active/Passive Configurations in AWS Direct Connect](https://docs.aws.amazon.com/architecture-diagrams/latest/active-active-and-active-passive-configurations-in-aws-direct-connect/active-active-and-active-passive-configurations-in-aws-direct-connect.html) 
+  [APN Partner: partners that can help plan your networking](https://aws.amazon.com/partners/find/results/?keyword=network) 
+  [AWS Marketplace for Network Infrastructure](https://aws.amazon.com/marketplace/b/2649366011) 
+  [Amazon Virtual Private Cloud Connectivity Options Whitepaper](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/introduction.html) 
+  [Building a Scalable and Secure Multi-VPC AWS Network Infrastructure](https://docs.aws.amazon.com/whitepapers/latest/building-scalable-secure-multi-vpc-network-infrastructure/welcome.html) 
+  [Using redundant Site-to-Site VPN connections to provide failover](https://docs.aws.amazon.com/vpn/latest/s2svpn/VPNConnections.html) 
+  [Using the Direct Connect Resiliency Toolkit to get started](https://docs.aws.amazon.com/directconnect/latest/UserGuide/resilency_toolkit.html) 
+  [VPC Endpoints and VPC Endpoint Services (AWS PrivateLink)](https://docs.aws.amazon.com/vpc/latest/userguide/endpoint-services-overview.html) 
+  [What Is Amazon VPC?](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) 
+  [What is a transit gateway?](https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html) 
+  [What is AWS Site-to-Site VPN?](https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VPN.html) 
+  [Working with Direct Connect gateways](https://docs.aws.amazon.com/directconnect/latest/UserGuide/direct-connect-gateways.html) 

 **Related videos:** 
+  [AWS re:Invent 2018: Advanced VPC Design and New Capabilities for Amazon VPC ](https://youtu.be/fnxXNZdf6ew) 
+  [AWS re:Invent 2019: AWS Transit Gateway reference architectures for many VPCs](https://youtu.be/9Nikqn_02Oc) 

# REL02-BP03 Ensure IP subnet allocation accounts for expansion and availability
REL02-BP03 Ensure IP subnet allocation accounts for expansion and availability

 Amazon VPC IP address ranges must be large enough to accommodate workload requirements, including factoring in future expansion and allocation of IP addresses to subnets across Availability Zones. This includes load balancers, EC2 instances, and container-based applications. 

 When you plan your network topology, the first step is to define the IP address space itself. Private IP address ranges (following RFC 1918 guidelines) should be allocated for each VPC. Accommodate the following requirements as part of this process: 
+ Allow IP address space for more than one VPC per Region. 
+ Within a VPC, allow space for multiple subnets so that you can cover multiple Availability Zones. 
+ Consider leaving unused CIDR block space within a VPC for future expansion.
+ Ensure that there is IP address space to meet the needs of any transient fleets of Amazon EC2 instances that you might use, such as Spot Fleets for machine learning, Amazon EMR clusters, or Amazon Redshift clusters. Similar consideration should be given to Kubernetes clusters, such as Amazon Elastic Kubernetes Service (Amazon EKS), as each Kubernetes pod is assigned a routable address from the VPC CIDR block by default.
+ Note that the first four IP addresses and the last IP address in each subnet CIDR block are reserved and not available for your use.
+ Note that the initial VPC CIDR block allocated to your VPC cannot be changed or deleted, but you can add additional non-overlapping CIDR blocks to the VPC. Subnet IPv4 CIDRs cannot be changed, however IPv6 CIDRs can.
+ The largest possible VPC CIDR block is a /16, and the smallest is a /28.
+ Consider other connected networks (VPC, on-premises, or other cloud providers) and ensure non-overlapping IP address space. For more information, see [REL02-BP05 Enforce non-overlapping private IP address ranges in all private address spaces where they are connected.](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_planning_network_topology_non_overlap_ip.html)

 **Desired outcome:** A scalable IP subnet can help you accomodate for future growth and avoid unnecessary waste.

 **Common anti-patterns:** 
+ Failing to consider future growth, resulting in CIDR blocks that are too small and requiring reconfiguration, potentially causing downtime.
+ Incorrectly estimating how many IP addresses an elastic load balancer can use.
+ Deploying many high traffic load balancers into the same subnets
+ Using automated scaling mechanisms whilst failing to monitor IP address consumption.
+ Defining excessively large CIDR ranges well beyond future growth expectations, which can lead to difficulty peering with other networks with overlapping address ranges.

 **Benefits of establishing this best practice:** This ensures that you can accommodate the growth of your workloads and continue to provide availability as you scale up. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

Plan your network to accommodate for growth, regulatory compliance, and integration with others. Growth can be underestimated, regulatory compliance can change, and acquisitions or private network connections can be difficult to implement without proper planning. 
+  Select relevant AWS accounts and Regions based on your service requirements, latency, regulatory, and disaster recovery (DR) requirements. 
+  Identify your needs for regional VPC deployments. 
+  Identify the size of the VPCs. 
  +  Determine if you are going to deploy multi-VPC connectivity. 
    +  [What Is a Transit Gateway?](https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html) 
    +  [Single Region Multi-VPC Connectivity](https://aws.amazon.com/answers/networking/aws-single-region-multi-vpc-connectivity/) 
  +  Determine if you need segregated networking for regulatory requirements. 
  + Make VPCs with appropriately-sized CIDR blocks to accommodate your current and future needs.
    + If you have unknown growth projections, you may wish to err on the side of larger CIDR blocks to reduce the potential for future reconfiguration
  + Consider using [IPv6 addressing](https://aws.amazon.com/vpc/ipv6/) for subnets as part of a dual-stack VPC. IPv6 is well suited to being used in private subnets containing fleets of ephemeral instances or containers that would otherwise require large numbers of IPv4 addresses.

## Resources
Resources

 **Related Well-Architected best practices:** 
+  [REL02-BP05 Enforce non-overlapping private IP address ranges in all private address spaces where they are connected](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_planning_network_topology_non_overlap_ip.html) 

 **Related documents:** 
+  [APN Partner: partners that can help plan your networking](https://aws.amazon.com/partners/find/results/?keyword=network) 
+  [AWS Marketplace for Network Infrastructure](https://aws.amazon.com/marketplace/b/2649366011) 
+  [Amazon Virtual Private Cloud Connectivity Options Whitepaper](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/introduction.html) 
+  [Multiple data center HA network connectivity](https://aws.amazon.com/answers/networking/aws-multiple-data-center-ha-network-connectivity/) 
+  [Single Region Multi-VPC Connectivity](https://aws.amazon.com/answers/networking/aws-single-region-multi-vpc-connectivity/) 
+  [What Is Amazon VPC?](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) 
+  [IPv6 on AWS](https://aws.amazon.com/vpc/ipv6) 
+  [IPv6 on reference architectures](https://d1.awsstatic.com/architecture-diagrams/ArchitectureDiagrams/IPv6-reference-architectures-for-AWS-and-hybrid-networks-ra.pdf) 
+  [Amazon Elastic Kubernetes Service launches IPv6 support](https://aws.amazon.com/blogs/containers/amazon-eks-launches-ipv6-support/) 
+ [ Recommendations for your VPC - Classic Load Balancers ](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-backend-instances.html#set-up-ec2)
+ [ Availability Zone subnets - Application Load Balancers ](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/application-load-balancers.html#availability-zones)
+ [ Availability Zones - Network Load Balancers ](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/network-load-balancers.html#availability-zones)

 **Related videos:** 
+  [AWS re:Invent 2018: Advanced VPC Design and New Capabilities for Amazon VPC (NET303)](https://youtu.be/fnxXNZdf6ew) 
+  [AWS re:Invent 2019: AWS Transit Gateway reference architectures for many VPCs (NET406-R1)](https://youtu.be/9Nikqn_02Oc) 
+  [AWS re:Invent 2023: AWS Ready for what's next? Designing networks for growth and flexibility (NET310)](https://www.youtube.com/watch?v=FkWOhTZSfdA) 

# REL02-BP04 Prefer hub-and-spoke topologies over many-to-many mesh
REL02-BP04 Prefer hub-and-spoke topologies over many-to-many mesh

 When connecting multiple private networks, such as Virtual Private Clouds (VPCs) and on-premises networks, opt for a hub-and-spoke topology over a meshed one. Unlike meshed topologies, where each network connects directly to the others and increases the complexity and management overhead, the hub-and-spoke architecture centralizes connections through a single hub. This centralization simplifies the network structure and enhances its operability, scalability, and control. 

 AWS Transit Gateway is a managed, scalable, and highly-available service designed for construction of hub-and-spoke networks on AWS. It serves as the central hub of your network that provides network segmentation, centralized routing, and the simplified connection to both cloud and on-premises environments. The following figure illustrates how you can use AWS Transit Gateway to build your hub-and-spoke topology. 

![\[AWS Transit Gateway connecting various services like VPCs, Direct Connect, and third-party appliances.\]](http://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/images/hub-and-spoke.png)


 **Desired outcome:** You have connected your Virtual Private Clouds (VPCs) and on-premises networks through a central hub. You configure your peering connections through the hub, which acts as a highly scalable cloud router. Routing is simplified because you do not have to work with complex peering relationships. Traffic between networks is encrypted, and you have the ability to isolate networks. 

 **Common anti-patterns:** 
+  You build complex network peering rules. 
+  You provide routes between networks that should not communicate with one another (for example, separate workloads that have no interdependencies). 
+  There is ineffective governance of the hub instance. 

 **Benefits of establishing this best practice:** As the number of connected networks increases, management and expansion of meshed connectivity becomes increasingly challenging. A mesh architecture introduces additional challenges, such as additional infrastructure components, configuration requirements, and deployment considerations. The mesh also introduces additional overhead to manage and monitor the data plane and control plane components. You must think about how to provide high availability of the mesh architecture, how to monitor the mesh health and performance, and how to handle upgrades of the mesh components. 

 A hub-and-spoke model, on the other hand, establishes centralized traffic routing across multiple networks. It provides a simpler approach to management and monitoring of the data plane and control plane components. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 Create a Network Services account if one does not exist. Place the hub in the organization's Network Services account. This approach allows the hub to be centrally managed by network engineers. 

 The hub of the hub-and-spoke model acts as a virtual router for traffic flowing between your Virtual Private Clouds (VPCs) and on-premises networks. This approach reduces network complexity and makes it easier to troubleshoot networking issues. 

 Consider your network design, including the VPCs, AWS Direct Connect, and Site-to-Site VPN connections you want to interconnect. 

 Consider using a separate subnet for each transit gateway VPC attachment. For each subnet, use a small CIDR (for example /28) so that you have more address space for compute resources. Additionally, create one network ACL, and associate it with all of the subnets that are associated with the hub. Keep the network ACL open in both the inbound and outbound directions. 

 Design and implement your routing tables such that routes are provided only between networks that should communicate. Omit routes between networks that should not communicate with one another (for example, between separate workloads that have no inter-dependencies). 

### Implementation steps
Implementation steps

1.  Plan your network. Determine which networks you want to connect, and verify that they don't share overlapping CIDR ranges. 

1.  Create an AWS Transit Gateway and attach your VPCs. 

1.  If needed, create VPN connections or Direct Connect gateways, and associate them with the Transit Gateway. 

1.  Define how traffic is routed between the connected VPCs and other connections through configuration of your Transit Gateway route tables. 

1.  Use Amazon CloudWatch to monitor and adjust configurations as necessary for performance and cost optimization. 

## Resources
Resources

 **Related best practices:** 
+  [REL02-BP03 Ensure IP subnet allocation accounts for expansion and availability](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_planning_network_topology_ip_subnet_allocation.html) 
+  [REL02-BP05 Enforce non-overlapping private IP address ranges in all private address spaces where they are connected](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_planning_network_topology_non_overlap_ip.html) 

 **Related documents:** 
+  [What Is a Transit Gateway?](https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html) 
+  [Transit gateway design best practices](https://docs.aws.amazon.com/vpc/latest/tgw/tgw-best-design-practices.html) 
+  [Building a Scalable and Secure Multi-VPC AWS Network Infrastructure](https://docs.aws.amazon.com/whitepapers/latest/building-scalable-secure-multi-vpc-network-infrastructure/welcome.html) 
+  [Building a global network using AWS Transit Gateway Inter-Region peering](https://aws.amazon.com/blogs/networking-and-content-delivery/building-a-global-network-using-aws-transit-gateway-inter-region-peering/) 
+  [Amazon Virtual Private Cloud Connectivity Options](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/introduction.html) 
+  [APN Partner: partners that can help plan your networking](https://aws.amazon.com/partners/find/results/?keyword=network) 
+  [AWS Marketplace for Network Infrastructure](https://aws.amazon.com/marketplace/b/2649366011) 

 **Related videos:** 
+  [AWS re:Invent 2023 - AWS networking foundations](https://www.youtube.com/watch?v=8nNurTFy-h4) 
+  [AWS re:Invent 2023 - Advanced VPC designs and new capabilities](https://www.youtube.com/watch?v=cRdDCkbE4es) 

 **Related workshops:** 
+  [AWS Transit Gateway Workshop](https://catalog.workshops.aws/trasitgw/en-US) 

# REL02-BP05 Enforce non-overlapping private IP address ranges in all private address spaces where they are connected
REL02-BP05 Enforce non-overlapping private IP address ranges in all private address spaces where they are connected

The IP address ranges of each of your VPCs must not overlap when peered, connected via Transit Gateway, or connected over VPN. Avoid IP address conflicts between a VPC and on-premises environments or with other cloud providers that you use. You must also have a way to allocate private IP address ranges when needed. An IP address management (IPAM) system can help with automating this. 

 **Desired outcome:** 
+  No IP address range conflicts between VPCs, on-premises environments, or other cloud providers. 
+  Proper IP address management allows for easier scaling of network infrastructure to accommodate growth and changes in network requirements. 

 **Common anti-patterns:** 
+  Using the same IP range in your VPC as you have on premises, in your corporate network, or other cloud providers 
+  Not tracking IP ranges of VPCs used to deploy your workloads. 
+  Relying on manual IP address management processes, such as spreadsheets. 
+  Over- or under-sizing CIDR blocks, which results in IP address waste or insufficient address space for your workload. 

 **Benefits of establishing this best practice:** Active planning of your network will ensure that you do not have multiple occurrences of the same IP address in interconnected networks. This prevents routing problems from occurring in parts of the workload that are using the different applications. 

 **Level of risk exposed if this best practice is not established:** Medium 

## Implementation guidance
Implementation guidance

 Make use of an IPAM, such as the [Amazon VPC IP Address Manager](https://docs.aws.amazon.com/vpc/latest/ipam/what-it-is-ipam.html), to monitor and manage your CIDR use. Several IPAMs are also available from the AWS Marketplace. Evaluate your potential usage on AWS, add CIDR ranges to existing VPCs, and create VPCs to allow planned growth in usage. 

### Implementation steps
Implementation steps
+  Capture current CIDR consumption (for example, VPCs and subnets). 
  +  Use service API operations to collect current CIDR consumption. 
  +  Use the [Amazon VPC IP Address Manager to discover resources](https://docs.aws.amazon.com/vpc/latest/ipam/res-disc-work-with-view.html). 
+  Capture your current subnet usage. 
  +  Use service API operations to [collect subnets](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeSubnets.html) per VPC in each Region. 
  +  Use the [Amazon VPC IP Address Manager to discover resources](https://docs.aws.amazon.com/vpc/latest/ipam/res-disc-work-with-view.html). 
+  Record the current usage. 
+  Determine if you created any overlapping IP ranges. 
+  Calculate the spare capacity. 
+  Identify overlapping IP ranges. You can either migrate to a new range of addresses or consider using techniques like [private NAT Gateway](https://docs.aws.amazon.com/whitepapers/latest/building-scalable-secure-multi-vpc-network-infrastructure/private-nat-gateway.html) or [AWS PrivateLink](https://docs.aws.amazon.com/whitepapers/latest/building-scalable-secure-multi-vpc-network-infrastructure/aws-privatelink.html) if you need to connect the overlapping ranges. 

## Resources
Resources

 **Related best practices:** 
+ [ Protecting networks ](https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/protecting-networks.html)

 **Related documents:** 
+  [APN Partner: partners that can help plan your networking](https://aws.amazon.com/partners/find/results/?keyword=network) 
+  [AWS Marketplace for Network Infrastructure](https://aws.amazon.com/marketplace/b/2649366011) 
+  [Amazon Virtual Private Cloud Connectivity Options Whitepaper](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/introduction.html) 
+  [Multiple data center HA network connectivity](https://aws.amazon.com/answers/networking/aws-multiple-data-center-ha-network-connectivity/) 
+ [ Connecting Networks with Overlapping IP Ranges ](https://aws.amazon.com/blogs/networking-and-content-delivery/connecting-networks-with-overlapping-ip-ranges/)
+  [What Is Amazon VPC?](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) 
+  [What is IPAM?](https://docs.aws.amazon.com/vpc/latest/ipam/what-it-is-ipam.html) 

 **Related videos:** 
+ [AWS re:Invent 2023 - Advanced VPC designs and new capabilities ](https://www.youtube.com/watch?v=cRdDCkbE4es)
+  [AWS re:Invent 2019: AWS Transit Gateway reference architectures for many VPCs](https://youtu.be/9Nikqn_02Oc) 
+ [AWS re:Invent 2023 - Ready for what’s next? Designing networks for growth and flexibility ](https://www.youtube.com/watch?v=FkWOhTZSfdA)
+ [AWS re:Invent 2021 - \$1New Launch\$1 Manage your IP addresses at scale on AWS](https://www.youtube.com/watch?v=xtLJgJfhPLg)