

 This whitepaper is for historical reference only. Some content might be outdated and some links might not be available.

# Core isolation concepts
<a name="core-isolation-concepts"></a>

 Part of the challenge of isolation is that there are multiple definitions of tenant isolation. For some, isolation is almost a business construct where they think about entire customers requiring their own environments. For others, isolation is more of an architectural construct that overlays the services and constructs of your multi-tenant environment. The sections below will explore the different types of isolation, and associate specific terminology with the varying isolation constructs. 

**Topics**
+ [Silo isolation](silo-isolation.md)
+ [Pool isolation](pool-isolation.md)
+ [The bridge model](the-bridge-model.md)
+ [Tier-based isolation](tier-based-isolation.md)
+ [Identity and isolation](identity-and-isolation.md)

# Silo isolation
<a name="silo-isolation"></a>

 While SaaS providers are often focused on the value of sharing resources, there are still scenarios where a SaaS provider may choose to have some (or all) of their tenants deployed in a model where each tenant is running a fully siloed stack of resources. Some would say that this full-stack model does not represent a SaaS environment. However, if you’ve surrounded these separate stacks with shared identity, onboarding, metering, metrics, deployment, analytics, and operations, then we’d say this is still a valid variant of SaaS that trades economies of scale and operational efficiency for compliance, business, or domain considerations. With this approach, isolation is an end-to-end construct that spans an entire customer stack. The diagram in Figure 1 provides a conceptual view of this form of isolation. 

![\[Diagram showing a full stack view of isolation with three tenants.\]](http://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/images/full-stack-isolation.jpg)


This diagram highlights the basic footprint of the siloed deployment model. The technologies that are used to run these stacks are mostly irrelevant here. This could be a monolith, it could be serverless, or it could be any mix of the various application architecture models. The key concept here is that we’re going to take whatever stack the tenant has and surround it with some construct to encapsulate all the moving parts of that stack. This becomes our boundary for isolation. As long as you can prevent a tenant from escaping their fully encapsulated environment, you’ve achieved isolation. 

 Generally, this model of isolation is much simpler to enforce. There are often well-defined constructs that will enable you to implement a robust isolation model. While this model presents some real challenges to the cost and agility goals of a SaaS environment, it can be appealing to those that have very strict isolation requirements. 

## Silo model pros and cons
<a name="silo-model-pros-and-cons"></a>

 Each SaaS environment and business domain has its own unique set of requirements that may make silo a fit. However, if you’re leaning in this direction, you’ll definitely want to factor in some of the challenges and overhead associated with the silo model. Below is a list of some of the pros and cons that you need to consider if you are exploring a silo model for your SaaS solution: 

### Pros
<a name="pros"></a>
+  **Supporting challenging compliance models** – Some SaaS providers are selling into regulated environments that impose strict isolation requirements. The silo model lets these ISVs offer some or all of their tenants the option of being deployed in a dedicated model. 
+  **No noisy neighbor concerns** – While all SaaS providers should be attempting to limit the impacts of noisy neighbor conditions, some customers will still express reservations about the potential of having their workloads impacted by the activity of other tenants using the system. Silo addresses this concern by offering a dedicated environment with no potential of noisy neighbor scenarios. 
+  **Tenant cost tracking** – SaaS providers are often highly focused on understanding how each tenant is impacting their infrastructure costs. Calculating a cost per tenant can be challenging in some SaaS models. However, the coarse-grained nature of the silo model provides us with a simple way to capture and associate infrastructure costs with each tenant. 
+  **Limited blast radius** – The silo model generally reduces your exposure when there may be some outage or event that surfaces in your SaaS solution. Since each tenant is running in its own environment, any failures that occur within a given tenant’s environment will likely be constrained to that environment. While one tenant may experience an outage, the error may not cascade through the remaining tenants that are using your system. 

### Cons
<a name="cons"></a>
+  **Scaling issues** – There are limits on the number of accounts that can be provisioned. This limit may exclude you from selecting the account-based model. There are also general concerns about how a rapidly growing number of accounts might undermine the management and operational experience of your SaaS environment. If you have 20 tenants that each run in their own siloed account, for example, that may be manageable. However, if you have a thousand tenants, that would likely begin to impact operational efficiency and agility. 
+  **Cost** – With every tenant running in its own environment, we’re missing much of the cost efficiency that is traditionally associated with SaaS solutions. Even if these environments scale dynamically, you’ll likely have periods of the day when you’ll have idle resources that are going unconsumed. While this is a completely acceptable model, it undermines the ability of your organization to achieve the economies of scale and margin benefits that are essential to the SaaS model. 
+  **Agility** – The move to SaaS is often directly motivated by a desire to innovate at a faster pace. This means adopting a model that enables the organization to respond and react to market dynamics at a rapid pace. A key part of this is being able to unify the customer experience and quickly deploy new features and capability. While there are measures that can be taken with the silo model to try to limit its impact on agility, the highly decentralized nature of the silo model adds complexity that impacts your ability to easily manage, operate, and support your tenants. 
+  **Onboarding automation** – SaaS environments place a premium on automating the introduction of new tenants. Whether these tenants are being onboarded in a self-service model or using an internally managed provisioning process, you’ll still need automated onboarding. And, when you have separate silos for each tenant, this often becomes a much more heavyweight process. The provisioning of a new tenant will require the provisioning of new infrastructure and, potentially, the configuration of new account limits. These added moving parts bring additional dimensions of complexity to the overall onboarding automation, leaving you less time to focus on your customers. 
+  **Decentralized management and monitoring** – Our goal with SaaS is to have a single pane of glass that lets us manage and monitor all tenant activity. This requirement is especially important when you have siloed tenant environments. The challenge here is that you must now aggregate the data from a more decentralized tenant footprint. While there are mechanisms that will enable you to create an aggregate view of your tenants, the effort and energy needed to build and manage this experience is greater in a siloed model. 

# Pool isolation
<a name="pool-isolation"></a>

 It’s pretty easy to see how the silo model of isolation maps very nicely for many SaaS companies. At the same time, many companies that are moving to SaaS are seeking out the efficiency, agility, and cost benefits of being able to have their tenants share some or all of their underlying infrastructure. This shared infrastructure approach, which is referred to as a pool model, adds a level of complexity to the isolation story. The diagram in Figure 2 provides an illustration of the challenge associated with implementing isolation in a pooled model. 

![\[Diagram showing pooled isolation.\]](http://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/images/pooled-isolation.jpg)


In this model, you’ll notice that our tenants are consuming infrastructure that is shared by all tenants. This enables the resources to scale in direct proportion to the actual load being imposed by the tenants. To the right of the diagram, we’ve zoomed into the compute of one of the services, highlighting the fact that tenants 1-N may all be running side-by-side within your shared compute at any given time. You’ll notice that the storage in this example is also shared. Here we’ve represented a table that is indexed by individual tenant identifiers. 

 Now, while this model is a perfectly good fit for SaaS providers, you can see how this complicates the overall isolation story. With resources being shared, it’s unclear what it would mean here to implement isolation. We can’t lean on the typical networking and IAM constructs to create boundaries between tenants. 

 The key here is that, even though this is a more challenging environment to isolate, you cannot use this as a rationale to relax the isolation requirements of your environment. If anything, this shared model increases the chance for cross-tenant access and, as such, it represents an area that requires you to be especially diligent about ensuring that resources are isolated. 
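
To make the shared table described above concrete, here is a minimal sketch of one way to keep every read and write scoped to a single tenant in a pooled store. It is an illustration under stated assumptions, not the mechanism this paper prescribes: an in-memory SQLite database stands in for the shared storage, and the `orders` table, its columns, and the `TenantScopedOrders` class are hypothetical names.

```python
import sqlite3


def create_pooled_table(conn: sqlite3.Connection) -> None:
    # One shared table for all tenants, keyed by tenant identifier first.
    conn.execute(
        "CREATE TABLE orders ("
        " tenant_id TEXT NOT NULL,"
        " order_id TEXT NOT NULL,"
        " total REAL,"
        " PRIMARY KEY (tenant_id, order_id))"
    )


class TenantScopedOrders:
    """Hypothetical data-access wrapper bound to exactly one tenant."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        if not tenant_id:
            raise ValueError("a tenant identifier is required")
        self._conn = conn
        self._tenant_id = tenant_id

    def add(self, order_id: str, total: float) -> None:
        # The tenant identifier comes from the wrapper, never from the caller.
        self._conn.execute(
            "INSERT INTO orders (tenant_id, order_id, total) VALUES (?, ?, ?)",
            (self._tenant_id, order_id, total),
        )

    def list_ids(self) -> list:
        # Every query is filtered by the bound tenant identifier.
        rows = self._conn.execute(
            "SELECT order_id FROM orders WHERE tenant_id = ? ORDER BY order_id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_pooled_table(conn)
tenant_1 = TenantScopedOrders(conn, "tenant-1")
tenant_2 = TenantScopedOrders(conn, "tenant-2")
tenant_1.add("o-100", 25.0)
tenant_2.add("o-200", 40.0)
```

Both tenants' rows live in the same table, yet each wrapper can only see its own. A filter applied in application code is only as strong as the discipline of the code that uses it, which is why pooled environments usually pair it with policies enforced outside the application.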

 As we dig deeper into the pool isolation model, you’ll see how this architectural footprint introduces a unique blend of challenges—each of which requires its own type of isolation constructs to successfully isolate a tenant’s resources. 

## Pool model pros and cons
<a name="pool-model-pros-and-cons"></a>

 While having everything shared enables a lot of efficiency and optimization, it also requires SaaS providers to weigh some of the tradeoffs that come with adopting this model. In many cases, the pros and cons of the pool model end up surfacing as the inverse of pros and cons we covered for the silo model. The following is an outline of the key pros and cons that are typically associated with the pool isolation model. 

### Pros
<a name="pros-1"></a>
+  **Agility** – As you move all tenants into a shared infrastructure model, you get all the natural efficiencies and simplicity that streamline the agility of your SaaS offering. At its core, the pool model is all about enabling SaaS providers to manage, scale, and operate all of their tenants with one unified experience. Centralizing and standardizing the experience is foundational to enabling SaaS providers to easily manage and apply changes to all tenants without having to perform one-off tasks on a tenant-by-tenant basis. This operational efficiency is key to the overall agility footprint of your SaaS environment. 
+  **Cost efficiency** – Many companies are drawn to SaaS for its cost efficiency. A big part of this cost efficiency is commonly associated with the pool model of isolation. In a pooled environment, your system will scale based on the actual load and activity of all of your tenants. If all the tenants are offline, your infrastructure costs should be minimal. The key concept here is that pooled environments can adjust to tenant load dynamically and enable you to better align tenant activity with resource consumption. 
+  **Simplified management and operations** – The pool model of isolation gives you one view into all the tenants in your system. You can manage, update, and deploy all of your tenants through a single experience that touches all the tenants in your system. This makes most aspects of the management and operations footprint simpler. 
+  **Innovation** – The agility that is enabled by the pooled isolation model also tends to be core to enabling SaaS providers to innovate at a faster pace. The more you move away from distributed management and the complexity of the silo model, the more you’re freed up to focus on the features and functions of your product. 

### Cons
<a name="cons-1"></a>
+  **Noisy neighbor** – The more resources are shared, the more chances there are for one tenant to impact the experience of another. Any activity from one tenant that puts heavy load on the system, for example, has the potential to impact other tenants. A good multi-tenant architecture and design will try to limit these impacts, but there’s always some chance of a noisy neighbor condition impacting one or more of your tenants in a pooled isolation model. 
+  **Tenant cost tracking** – In a silo model, it’s much easier to attribute consumption of a resource to a specific tenant. However, in a pooled model, the attribution of resource consumption becomes more challenging. This pushes more work to each SaaS provider as they look for ways to instrument their systems and surface the granular data needed to effectively associate consumption with individual tenants. 
+  **Blast radius** – Having all of your resources shared also introduces some operational risk. In the silo model, when one tenant had a failure, the impact of that failure could likely be limited to that one tenant. However, in a pooled environment, an outage will likely impact all the tenants of your system. This can have a significant impact on the business. This usually requires an even deeper commitment to building a resilient environment that can identify, surface, and gracefully recover from failures. 
+  **Compliance pushback** – While there are measures you can take to isolate your tenants in a pool model, the notion of sharing infrastructure can create situations where customers may be unwilling to run in this model. This is especially true in environments where the compliance or regulatory rules for a domain impose strict constraints on the accessibility and isolation of resources. Even in these cases, though, this may only mean that some portion of the system will need to be siloed (see the bridge model below). 
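
The tenant cost tracking item above mentions instrumenting a pooled system so that consumption can be tied back to individual tenants. The following is a minimal sketch of that idea, assuming that each request can report some unit of consumption (the unit itself is left abstract) and that a shared bill is split in proportion to recorded usage. The `TenantMeter` class and its methods are hypothetical names, and proportional allocation is only one possible attribution policy.

```python
from collections import defaultdict
from typing import Dict


class TenantMeter:
    """Hypothetical in-process meter that attributes shared usage to tenants."""

    def __init__(self) -> None:
        self._usage: Dict[str, float] = defaultdict(float)

    def record(self, tenant_id: str, units: float) -> None:
        # Called by instrumented code paths each time a tenant consumes resources.
        self._usage[tenant_id] += units

    def cost_shares(self, total_cost: float) -> Dict[str, float]:
        # Split a shared infrastructure bill in proportion to recorded usage.
        total_units = sum(self._usage.values())
        if total_units == 0:
            return {}
        return {
            tenant_id: total_cost * units / total_units
            for tenant_id, units in self._usage.items()
        }


meter = TenantMeter()
meter.record("tenant-1", 300.0)
meter.record("tenant-2", 100.0)
meter.record("tenant-1", 100.0)
shares = meter.cost_shares(total_cost=50.0)
```

In this example tenant-1 accounts for 400 of the 500 recorded units, so it is assigned four fifths of the shared cost.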

# The bridge model
<a name="the-bridge-model"></a>

 While silo and pool have very distinct approaches to isolation, the isolation landscape for many SaaS providers is less absolute. As you look at real application problems and you decompose your systems into smaller services, you will often discover that your solution will require a mix of the silo and pool models. This mixed model is what we would refer to as a bridge model of isolation. The diagram in Figure 3 provides an example of how the bridge might be realized in a SaaS solution. 

![\[Diagram showing the bridge isolation model with three tenants.\]](http://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/images/bridge-isolation.jpg)


 This diagram highlights how the bridge model enables you to combine the silo and pool models. Here we have a monolithic architecture with classic web and application tiers. The web tier, for this solution, is deployed in a pool model that is shared by all tenants. While the web tier is shared, the underlying business logic and storage of our application are actually deployed in a silo model where each tenant has its own application tier and storage. 

 Now, imagine we were to break this monolith into microservices. You can imagine that each of the various microservices in our system could leverage combinations of the silo and pool models. We’ll dig into that more as we get into the specifics of applying silo and pool with different AWS constructs. The key takeaway here is that your view of silo and pool will be much more granular for environments that are decomposed into a collection of services that have varying isolation requirements. 
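
As a rough illustration of the monolith example above, the sketch below shows a shared (pooled) web tier forwarding each request to the application tier that is dedicated to the calling tenant. Everything here is an assumption made for illustration: the `TenantStack` type, the `SILOED_STACKS` registry, the `route_request` function, and the example URLs are hypothetical, and a real system would load this mapping from its tenant management service instead of hard-coding it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantStack:
    """Hypothetical description of one tenant's siloed application tier and storage."""
    app_tier_url: str
    database_name: str


# Hypothetical registry: one dedicated stack per tenant behind the shared web tier.
SILOED_STACKS = {
    "tenant-1": TenantStack("https://app.tenant-1.internal.example", "tenant1_db"),
    "tenant-2": TenantStack("https://app.tenant-2.internal.example", "tenant2_db"),
}


def route_request(tenant_id: str, path: str) -> str:
    # The pooled web tier resolves the caller's siloed application tier.
    stack = SILOED_STACKS.get(tenant_id)
    if stack is None:
        raise PermissionError(f"unknown tenant: {tenant_id}")
    return f"{stack.app_tier_url}{path}"


target = route_request("tenant-1", "/orders")
```

The web tier is the only component that ever sees more than one tenant. Once a request is routed, it runs entirely inside that tenant's own stack.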

## Bridge model pros and cons
<a name="bridge-model-pros-and-cons"></a>

 The bridge model is more of a hybrid model that focuses on enabling you to apply the silo or pool model where it makes sense. The idea here is that the values and tenets of silo and pool isolation still apply to each of these areas of the system. As you think about pros and cons of the bridge model, then, you should be thinking about the tradeoffs of silo and pool models for each resource or layer of your architecture. 

# Tier-based isolation
<a name="tier-based-isolation"></a>

 While most of our discussion of isolation focuses on the mechanics of preventing cross-tenant access, there are also scenarios where the tiering of your offering might influence your isolation strategy. In this case, it’s less about how you’re isolating tenants and more about how you might package and offer different flavors of isolation to different tenants with different profiles. Still, this is another consideration that could determine which models of isolation you’ll need to support to address the full spectrum of customers you want to engage. The diagram in Figure 4 provides an example of how isolation might vary across tiers. 

![\[Diagram showing tenant tiering and isolation with multiple tenants.\]](http://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/images/tenant-tiering-and-isolation.png)


Here you’ll see a scenario where we have a mix of silo and pool isolation models that have been offered up as tiers to our tenants. Tenants in the silver tier are running in the pooled environment. While these tenants are running in a shared infrastructure model, they still fully expect that their resources will be protected from any cross-tenant access. The tenant on the right has required you to offer them a completely dedicated (silo) environment. To support this, the SaaS provider has created a premium tier model that enables tenants to run in this dedicated model at what we would assume would be a substantially higher price point. 

 While SaaS providers generally try to limit offering a silo model to their customers, many SaaS businesses have a notion of private pricing where some tenants offer to pay a premium to be deployed in this model. In fact, some SaaS companies will not publish this as an option or identify it as a tier, in order to limit the number of customers that choose this option. If too many of your tenants fall into this model, you’ll begin to fall back to a fully siloed model and inherit many of the challenges that we outlined above. 

 To limit the impact of these one-off environments, SaaS providers will often require these premium customers to run the same version of the product that is deployed to the pooled environment. This enables the ISV to continue to manage and operate both environments through a single pane of glass. Essentially, the silo environment becomes a clone of the pooled environment that happens to be supporting one tenant. 
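
The tiering described above can be sketched as a small placement function: silver tenants land in the shared pooled environment, premium tenants get a dedicated environment, and both run the same product version. This is a sketch under assumptions, not a prescribed design. The `Tier` enum, the environment names, the version string, and `resolve_deployment` are all hypothetical.

```python
from enum import Enum
from typing import Dict


class Tier(Enum):
    SILVER = "silver"
    PREMIUM = "premium"


POOLED_ENVIRONMENT = "pooled-env"  # hypothetical name of the shared environment
PRODUCT_VERSION = "2.4.0"          # hypothetical; one version for every environment


def resolve_deployment(tenant_id: str, tier: Tier) -> Dict[str, str]:
    # Premium tenants get a dedicated (silo) environment; all others share the pool.
    if tier is Tier.PREMIUM:
        environment = f"silo-{tenant_id}"
    else:
        environment = POOLED_ENVIRONMENT
    # The silo is a clone of the pool, so the version is never tenant-specific.
    return {"environment": environment, "version": PRODUCT_VERSION}


silver = resolve_deployment("tenant-1", Tier.SILVER)
premium = resolve_deployment("tenant-7", Tier.PREMIUM)
```

Because the version is never chosen per tenant, a release reaches the pooled and dedicated environments together, which helps keep both manageable through the same operational tooling.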

# Identity and isolation
<a name="identity-and-isolation"></a>

 While the scope of our discussion is limited to isolation, it’s important to look at how identity connects to the isolation model. The reality is, if you are planning to isolate tenants, you must have some way to represent and identify the tenant that is currently accessing the resources of your SaaS environment. In many cases, identity will be used in combination with other constructs to acquire the policies and scoping rules that are at the core of an isolation scheme. How these policies are defined and applied will vary for each of the isolation models and services you’re consuming. Still, the basics of the approach usually follow a pattern similar to what is shown in Figure 5. 

![\[Diagram showing the connection between identity and isolation.\]](http://docs.aws.amazon.com/whitepapers/latest/saas-tenant-isolation-strategies/images/connecting-identity-and-isolation.png)


 This diagram represents a generalization of how identity gets connected to the broader isolation story. Here you’ll notice that, as a user is authenticated, the system will return tenant context back to your application that includes the user’s binding to a tenant as well as the policies that will be used to enforce isolation for that tenant. This context then flows through all of our interactions and is used by the downstream elements of the SaaS environment to scope access to resources (in this case a database). 

 How that scope is acquired and applied will vary based on the isolation model and resources you’re consuming, but this model provides a view of the core concepts. One key area of variation is in how the tenant scoping is determined. This scoping context could be attached to a service when it is deployed or it could be acquired at run-time. We’ll look at both of those models as we get into the specific isolation traits for different architecture models. 
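
The flow described above can be sketched in a few lines: authentication yields a tenant context, and a downstream resource uses that context to scope access. The sketch also shows the two variations mentioned, a scope attached when a service is deployed and a scope supplied at run-time with each request. All names are hypothetical (`TenantContext`, `authenticate`, `ScopedStore`), a plain dictionary stands in for both the identity provider and the database, and the "policy" is reduced to a key prefix purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TenantContext:
    """Hypothetical context returned at authentication: identity plus scoping policy."""
    user_id: str
    tenant_id: str
    key_prefix: str  # stand-in for an isolation policy


def authenticate(user_id: str, directory: Dict[str, str]) -> TenantContext:
    # Assumption: `directory` maps each user to a tenant, like an identity provider.
    if user_id not in directory:
        raise PermissionError(f"unknown user: {user_id}")
    tenant_id = directory[user_id]
    return TenantContext(user_id, tenant_id, key_prefix=f"{tenant_id}/")


class ScopedStore:
    """Hypothetical resource wrapper that enforces the tenant scope on every read."""

    def __init__(self, data: Dict[str, str],
                 deployed_context: Optional[TenantContext] = None) -> None:
        self._data = data
        # Deploy-time scoping: the context is fixed when the service is created.
        self._deployed_context = deployed_context

    def get(self, key: str, request_context: Optional[TenantContext] = None) -> str:
        # Run-time scoping: the context arrives with each request instead.
        context = self._deployed_context
        if context is None:
            context = request_context
        if context is None:
            raise PermissionError("no tenant context available")
        if not key.startswith(context.key_prefix):
            raise PermissionError(f"{context.tenant_id} may not read {key}")
        return self._data[key]


USERS = {"alice": "tenant-1", "bob": "tenant-2"}
DATA = {"tenant-1/report": "r1", "tenant-2/report": "r2"}

alice = authenticate("alice", USERS)
pooled_store = ScopedStore(DATA)  # scope supplied per request
siloed_store = ScopedStore(DATA, deployed_context=authenticate("bob", USERS))
```

With run-time scoping, `pooled_store.get("tenant-1/report", alice)` succeeds while a request for `"tenant-2/report"` with the same context is rejected. With deploy-time scoping, `siloed_store` serves only tenant-2 no matter which context a caller passes.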