

# AGENTREL06-BP02 Establish fallback mechanisms for legacy system degradation
<a name="agentrel06-bp02"></a>

 Legacy systems carry lower availability SLAs than cloud-based services, and their failure modes are often unpredictable. Health-aware fallback paths, caches for reference data, queues for transactions, graceful degradation for real-time, keep agent workflows running through legacy outages. 

 **Desired outcome:** 
+  You have health monitoring on every legacy integration with alarms that trigger fallback activation. 
+  You have cache-based fallbacks for reference data and queue-based fallbacks for transactional operations. 
+  You recover automatically when legacy systems return, with periodic probes deactivating the cutoff. 

 **Common anti-patterns:** 
+  Assuming legacy systems match cloud-based reliability, without implementing fallbacks for their actual SLAs. 
+  Deploying fallbacks that silently return stale or incorrect data without informing agents or users. 
+  Skipping legacy health monitoring, so outages become visible only when agent tasks fail. 

 **Benefits of establishing this best practice:** 
+  Agent functionality stays available during legacy outages through pre-defined fallback paths. 
+  Proactive fallback activation through health monitoring replaces reactive failure detection. 
+  Users and downstream systems see transparent indications of degraded capability. 

 **Level of risk exposed if this best practice is not established:** High 

## Implementation guidance
<a name="implementation-guidance"></a>

 Health monitoring is the prerequisite for any automatic fallback. Without a health signal, the only way to know the legacy system is down is to watch user-visible failures accumulate. Periodic probes through [Amazon Bedrock AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html) and Amazon CloudWatch check endpoint availability and response time. Alarms on those probes trigger fallback activation before the first user-facing failure happens. 

 Fallback shape depends on the operation type. For reference data, product catalogs, configuration values, mostly-static lookups, cache-based fallbacks serve recently retrieved data during outages. The accuracy cost is low because the data doesn't change often. For transactional operations, queue-based fallbacks buffer requests for replay when the system recovers, preserving the intent of each operation without attempting it against an unreachable system. For real-time data that can't be cached or queued, live prices, current inventory, instantaneous system state, graceful degradation means informing the user the information is temporarily unavailable rather than returning a plausible-but-wrong answer. 

 Automatic cutoffs need to unwind themselves, or every outage turns into a permanent downgrade. Use CloudWatch alarms to detect when error rates cross a threshold and trigger automated responses, updating [Amazon Bedrock AgentCore Policy](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy.html) Cedar policies to deny tool access or activating Lambda-based circuit breaker logic. Configure periodic probes that test availability and re-enable access when the system recovers. Monitor fallback activation frequency through AgentCore Observability to identify legacy systems causing disproportionate reliability issues. Those systems are the candidates for modernization investment. 

### Implementation steps
<a name="implementation-steps"></a>

1.  **Implement health monitoring for each legacy integration:** Use [Amazon Bedrock AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html) and Amazon CloudWatch probes to check endpoint availability and response time. 

1.  **Configure alarms that trigger fallback activation:** Alarm on health degradation so fallbacks activate before user-visible failures accumulate. 

1.  **Implement operation-appropriate fallbacks:** Cache-based fallbacks for reference data, queue-based fallbacks for transactional operations, and graceful degradation messages for real-time data. 

1.  **Deploy automatic cutoffs with recovery detection:** Use CloudWatch alarms to trigger [Amazon Bedrock AgentCore Policy](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy.html) updates or circuit breaker logic, and run periodic probes to re-enable access when the system recovers. 

1.  **Monitor fallback activation frequency:** Use AgentCore Observability to identify legacy systems that consistently degrade, so modernization effort can be prioritized. 

## Resources
<a name="resources"></a>

 **Related best practices:** 
+  [AGENTREL04-BP03 Implement fallback mechanisms and graceful degradation for collaborative workflows](agentrel04-bp03.html) 
+  [AGENTREL06-BP01 Develop agent-based integrations with existing or legacy systems](agentrel06-bp01.html) 
+  [AGENTREL06-BP03 Regularly test degraded system performance](agentrel06-bp03.html) 

 **Related documents:** 
+  [Amazon Bedrock AgentCore Observability](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/observability.html) 
+  [Amazon Bedrock AgentCore Policy](https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy.html) 
+  [Build resilient generative AI agents](https://aws.amazon.com/blogs/architecture/build-resilient-generative-ai-agents) 

 **Related services:** 
+  [Amazon Bedrock AgentCore](https://aws.amazon.com/bedrock/agentcore/) 
+  [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/) 