Appendix A: Best practice reference

This appendix provides a complete reference of all best practices defined in the Agentic AI Lens, organized by pillar and focus area.

Operational excellence

AGENTOPS01: How do you establish operational practices for agentic AI systems?

ID	Best practice	Risk
AGENTOPS01-BP01	Establish well-defined agent roles, responsibilities, and success criteria	High
AGENTOPS01-BP02	Design multi-agent handoff procedures with human-in-the-loop escalation	High
AGENTOPS01-BP03	Develop test scenarios that accurately capture failures of dependent components, orchestration protocols, and business processes	High

AGENTOPS02: How do you manage prompt and configuration lifecycle?

ID	Best practice	Risk
AGENTOPS02-BP01	Evolve agent prompts, tool calls, and configurations to reflect evolving business needs	High
AGENTOPS02-BP02	Implement configuration drift detection and remediation	High
AGENTOPS02-BP03	Implement agent behavior versioning and rollback capabilities	High
AGENTOPS02-BP04	Maintain feedback control loops for continuous improvement	Medium

AGENTOPS03: How do you manage agent lifecycle and deployment processes?

ID	Best practice	Risk
AGENTOPS03-BP01	Define an agent lifecycle with clear SME ownership, testing, and governance	High
AGENTOPS03-BP02	Implement CI/CD pipelines tailored to agentic system deployment (AgentOps)	High
AGENTOPS03-BP03	Implement agent-specific scaling policies and capacity planning	Medium
AGENTOPS03-BP04	Implement organizational agent portfolio management and governance at scale	High

AGENTOPS04: How do you establish tool integration and management practices?

ID	Best practice	Risk
AGENTOPS04-BP01	Implement tool registry and catalog management	Medium
AGENTOPS04-BP02	Establish standardized tool integration protocols (MCP, A2A)	High
AGENTOPS04-BP03	Develop fallback behavior and error handling for tool invocations	Medium

AGENTOPS05: How do you implement comprehensive observability and monitoring for agentic systems?

ID	Best practice	Risk
AGENTOPS05-BP01	Establish end-to-end tracing and telemetry for agent operations	High
AGENTOPS05-BP02	Monitor agent behavior patterns and detect anomalies	High
AGENTOPS05-BP03	Implement structured logging and comprehensive audit trails	High
AGENTOPS05-BP04	Define and track KPIs for agent workflows	Medium
AGENTOPS05-BP05	Create workflow-specific dashboards for operational health	Medium

AGENTOPS06: How do you implement testing, evaluation, and validation frameworks?

ID	Best practice	Risk
AGENTOPS06-BP01	Design multi-layered testing frameworks	High
AGENTOPS06-BP02	Evaluate and track ongoing agent performance	High
AGENTOPS06-BP03	Establish SME-driven validation and business approval workflows	High

AGENTOPS07: How do you establish operational recovery and consumption monitoring?

ID	Best practice	Risk
AGENTOPS07-BP01	Implement automated response and recovery mechanisms	High
AGENTOPS07-BP02	Establish operational knowledge management systems	Medium
AGENTOPS07-BP03	Augment change management to accommodate technical improvements and business requirements	Medium
AGENTOPS07-BP04	Implement break-glass operational runbooks	High

Security

AGENTSEC01: How do you secure agentic memory and securely manage state between agents?

ID	Best practice	Risk
AGENTSEC01-BP01	Implement memory isolation and integrity controls	High
AGENTSEC01-BP02	Validate and sanitize memory inputs	High
AGENTSEC01-BP03	Monitor for hallucination propagation	Medium

AGENTSEC02: How do you control and secure agent tool usage?

ID	Best practice	Risk
AGENTSEC02-BP01	Implement tool authorization	High
AGENTSEC02-BP02	Validate tool inputs and outputs	High
AGENTSEC02-BP03	Maintain approved tool registry with security assessments	Medium

AGENTSEC03: How do you manage agent identities, permissions, and prevent privilege escalation?

ID	Best practice	Risk
AGENTSEC03-BP01	Implement strong authentication for agent identities	High
AGENTSEC03-BP02	Separate agent and human user permission	High
AGENTSEC03-BP03	Implement least privilege with dynamic boundaries	High
AGENTSEC03-BP04	Regular permission audits and access reviews	Medium

AGENTSEC04: How do you support agent goal alignment and prevent manipulation?

ID	Best practice	Risk
AGENTSEC04-BP01	Implement guardrails and alignment controls	High
AGENTSEC04-BP02	Human-in-the-loop for critical decisions	High

AGENTSEC05: How do you implement observability and prevent repudiation?

ID	Best practice	Risk
AGENTSEC05-BP01	Implement comprehensive logging and decision artifact storage	High
AGENTSEC05-BP02	Implement distributed tracing for agent interactions	Medium

AGENTSEC06: How do you secure multi-agent orchestration and coordination?

ID	Best practice	Risk
AGENTSEC06-BP01	Encrypt and sign inter-agent messages	High
AGENTSEC06-BP02	Implement workflow orchestration security controls	High
AGENTSEC06-BP03	Establish trust boundaries between agents	High
AGENTSEC06-BP04	Monitor and detect coordination anomalies	Medium

AGENTSEC07: How do you protect human oversight from manipulation and detect rogue agents?

ID	Best practice	Risk
AGENTSEC07-BP01	Implement cognitive load management	Medium
AGENTSEC07-BP02	Clear confidence indicators and manipulation warnings	Medium
AGENTSEC07-BP03	Multiple reviewers for critical operations	Medium
AGENTSEC07-BP04	Behavioral anomaly detection and agent containment	High
AGENTSEC07-BP05	Regular security assessments and red teaming	Medium

AGENTSEC08: How do you validate and secure agent inputs and outputs?

ID	Best practice	Risk
AGENTSEC08-BP01	Multi-layer input validation and prompt injection defense	High
AGENTSEC08-BP02	Output filtering for sensitive information	High

AGENTSEC09: How do you perform vulnerability scanning and penetration testing for agentic AI systems?

ID	Best practice	Risk
AGENTSEC09-BP01	Integrate AI-powered vulnerability scanning across the development lifecycle	High
AGENTSEC09-BP02	Conduct context-aware penetration testing with multi-agent attack simulation	High
AGENTSEC09-BP03	Implement continuous security validation with automated remediation	Medium
AGENTSEC09-BP04	Establish scoped and controlled testing environments for agent security assessments	Medium
AGENTSEC09-BP05	Implement runtime threat detection, security event correlation, and automated remediation for agents	High

Reliability

AGENTREL01: How do I develop reliable agentic systems?

ID	Best practice	Risk
AGENTREL01-BP01	Implement a resilient messaging layer	High
AGENTREL01-BP02	Establish modular, fault-isolated layers	High
AGENTREL01-BP03	Design specialized agents following actor model principles	Medium
AGENTREL01-BP04	Standardize communication protocols	Medium
AGENTREL01-BP05	Implement adaptive provisioning	Medium

AGENTREL02: How do you develop agentic systems that reliably execute tasks with predictable outcomes?

ID	Best practice	Risk
AGENTREL02-BP01	Design agents for specific and atomic tasks	Medium
AGENTREL02-BP02	Limit agent permissions to minimum required access	High
AGENTREL02-BP03	Implement behavioral anomaly detection and monitoring	High
AGENTREL02-BP04	Develop clear instruction protocols for agents	Medium
AGENTREL02-BP05	Establish tiered human oversight and approval workflows	High

AGENTREL03: How do you support agent memory and state remaining reliably accessible throughout the agent lifecycle?

ID	Best practice	Risk
AGENTREL03-BP01	Design an information classification model to identify short-term and long-term memories	Medium
AGENTREL03-BP02	Architect fault-tolerant memory stores with redundancy and failover	High
AGENTREL03-BP03	Implement comprehensive state management and checkpoint-based recovery	High
AGENTREL03-BP04	Implement graceful degradation for memory and state operations	Medium

AGENTREL04: How do you orchestrate multi-agent systems to reliably execute tasks?

ID	Best practice	Risk
AGENTREL04-BP01	Implement the arbiter agent pattern for coordinated multi-agent systems	High
AGENTREL04-BP02	Classify agents with a comprehensive capability taxonomy	Medium
AGENTREL04-BP03	Implement fallback mechanisms and graceful degradation for collaborative workflows	High
AGENTREL04-BP04	Implement resilient control planes for agent coordination	High

AGENTREL05: How do you implement reliable agent cognition that accesses the right data at the right time?

ID	Best practice	Risk
AGENTREL05-BP01	Design modular, fault-tolerant agentic reasoning components	Medium
AGENTREL05-BP02	Facilitate reliable adaptation through evaluation-driven improvement cycles	Medium
AGENTREL05-BP03	Ground agent cognition in real information	High

AGENTREL06: How do agents integrate effectively with existing systems without impacting the reliability of established processes?

ID	Best practice	Risk
AGENTREL06-BP01	Develop agent-based integrations with existing or legacy systems	Medium
AGENTREL06-BP02	Establish fallback mechanisms for legacy system degradation	High
AGENTREL06-BP03	Regularly test degraded system performance	Medium
AGENTREL06-BP04	Implement idempotent task execution patterns	High
AGENTREL06-BP05	Implement dynamic capability toggling	Medium

AGENTREL07: How do fault-tolerant agent systems recover?

ID	Best practice	Risk
AGENTREL07-BP01	Design workflows in stages with incremental recovery	High
AGENTREL07-BP02	Enable automatic recovery from agent execution failures	High
AGENTREL07-BP03	Implement distributed tracing to track system dependencies and facilitate recovery	High

AGENTREL08: How do agents determine when and where graceful degradation is appropriate?

ID	Best practice	Risk
AGENTREL08-BP01	Establish consistent configuration management practices	Medium
AGENTREL08-BP02	Implement agent tracing for telemetry throughout agent processing	High
AGENTREL08-BP03	Architect agent systems with resource isolation and contention mitigation	High
AGENTREL08-BP04	Track agent memory utilization metrics	Medium

Performance efficiency

AGENTPERF01: How do you plan strategically for agent performance and establish measurement practices?

ID	Best practice	Risk
AGENTPERF01-BP01	Define performance-aligned success criteria for agent workloads	High
AGENTPERF01-BP02	Implement comprehensive performance telemetry	High
AGENTPERF01-BP03	Profile end-to-end agent latency and identify optimization targets	High

AGENTPERF02: How do you optimize core agent processing and cognitive pipelines?

ID	Best practice	Risk
AGENTPERF02-BP01	Design efficient reasoning pipelines	High
AGENTPERF02-BP02	Implement task-appropriate model selection strategies	High
AGENTPERF02-BP03	Optimize agent execution paths for reduced latency	High
AGENTPERF02-BP04	Optimize streaming responses and time-to-first-token for agent interactions	High

AGENTPERF03: How do you optimize memory management, context windows, and retrieval-augmented generation?

ID	Best practice	Risk
AGENTPERF03-BP01	Implement tiered memory management systems	High
AGENTPERF03-BP02	Optimize context window utilization and prompt management	High
AGENTPERF03-BP03	Optimize RAG retrieval pipelines for latency and precision	High
AGENTPERF03-BP04	Establish efficient agent caching and data access patterns	Medium
AGENTPERF03-BP05	Implement agentic retrieval patterns for dynamic, agent-driven knowledge access	High

AGENTPERF04: How do you achieve efficient communication and protocol usage across agent interactions?

ID	Best practice	Risk
AGENTPERF04-BP01	Optimize asynchronous message handling patterns	High
AGENTPERF04-BP02	Implement efficient protocol-based agent communications	Medium
AGENTPERF04-BP03	Design high-performing event-driven integration patterns	Medium

AGENTPERF05: How do you optimize workflow orchestration and multi-agent collaboration for performance?

ID	Best practice	Risk
AGENTPERF05-BP01	Design efficient workflow orchestration patterns	High
AGENTPERF05-BP02	Implement optimized multi-agent collaboration models	High
AGENTPERF05-BP03	Optimize multi-stage AI pipeline execution	Medium
AGENTPERF05-BP04	Implement efficient agent delegation and handoff patterns	Medium

AGENTPERF06: How do you optimize tool integrations and framework usage for agent performance?

ID	Best practice	Risk
AGENTPERF06-BP01	Design optimized tool integration strategies	Medium
AGENTPERF06-BP02	Implement efficient tool invocation patterns	Medium
AGENTPERF06-BP03	Optimize meta-tool utilization and tool chaining	Low

AGENTPERF07: How do you manage multitenant performance isolation and optimize resource utilization?

ID	Best practice	Risk
AGENTPERF07-BP01	Design efficient multitenant agent deployment models	High
AGENTPERF07-BP02	Implement tenant-aware performance isolation and throttling	High

Cost optimization

AGENTCOST01: How do you optimize agent reasoning and execution costs?

ID	Best practice	Risk
AGENTCOST01-BP01	Use the reflection pattern to design efficient agent reasoning loops	Medium
AGENTCOST01-BP02	Optimize multi-agent collaboration cost through efficient handoff patterns	High
AGENTCOST01-BP03	Implement cost-effective patterns like hybrid supervisor for multi-agent coordination	Medium
AGENTCOST01-BP04	Design agent hierarchies and delegation patterns that reduce coordination overhead	Medium

AGENTCOST02: How do you optimize agent model invocation and token consumption costs?

ID	Best practice	Risk
AGENTCOST02-BP01	Architect tiered model selection for cost-performance optimization	High
AGENTCOST02-BP02	Cost optimize token consumption through efficient prompt engineering	Medium
AGENTCOST02-BP03	Use intelligent caching to reduce redundant model invocations	High
AGENTCOST02-BP04	Implement model customization for long-term cost reduction	High

AGENTCOST03: How do you manage agent memory and state costs efficiently?

ID	Best practice	Risk
AGENTCOST03-BP01	Design cost-effective retrieval systems with tiered memory	Medium
AGENTCOST03-BP02	Cost optimize through intelligent compression and pruning of context windows	High
AGENTCOST03-BP03	Implement cost-optimized state persistence and lifecycle management	Medium

AGENTCOST04: How do you optimize agent tool invocation?

ID	Best practice	Risk
AGENTCOST04-BP01	Design cost-effective tool selection to minimize unnecessary invocations	High
AGENTCOST04-BP02	Cost optimize tool serving through serverless and resource sharing	High
AGENTCOST04-BP03	Implement intelligent caching and failure handling for tool results	High

AGENTCOST05: How do you implement cost attribution?

ID	Best practice	Risk
AGENTCOST05-BP01	Establish agent-level reasoning cost tracking and attribution	Medium
AGENTCOST05-BP02	Implement distributed cost tracing for multi-agent workflows	High
AGENTCOST05-BP03	Design tenant-aware cost allocation for AaaS pricing models	High
AGENTCOST05-BP04	Create chargeback and ROI reporting	Medium

AGENTCOST06: How do you optimize agent discovery registry and deployment costs?

ID	Best practice	Risk
AGENTCOST06-BP01	Implement lightweight discovery and registry for cost-effective collaboration	Medium
AGENTCOST06-BP02	Cost optimize versioning and deployment through efficient artifact management	Medium
AGENTCOST06-BP03	Design cost-efficient initialization through warm pools and caching	High

AGENTCOST07: How do you establish agent cost governance and continuous optimization?

ID	Best practice	Risk
AGENTCOST07-BP01	Implement automated cost controls with intelligent cutoffs	High
AGENTCOST07-BP02	Establish proactive anomaly detection for agent cost patterns	High
AGENTCOST07-BP03	Create systematic optimization feedback loops for continuous improvement	High

Sustainability

AGENTSUS01: How do you build sustainable and repeatable frameworks for managing compute, memory, and other shareable agent resources?

ID	Best practice	Risk
AGENTSUS01-BP01	Design specialized agents with explicit resource boundaries	High
AGENTSUS01-BP02	Implement reusable workflow patterns	Medium
AGENTSUS01-BP03	Optimize resource utilization through shared services	Medium
AGENTSUS01-BP04	Scale cognitive processing pathways appropriately	High
AGENTSUS01-BP05	Adopt specification-driven tasks for frontier agents and long-running workflows	High

AGENTSUS02: How do I establish sustainable frameworks for agent dependencies?

ID	Best practice	Risk
AGENTSUS02-BP01	Optimize context management and memory utilization	Medium
AGENTSUS02-BP02	Establish efficient agent caching strategies	Medium
AGENTSUS02-BP03	Appropriately scale data, networking, and compute dependencies	Medium
AGENTSUS02-BP04	Measure and optimize the environmental footprint of agent workloads	Medium

AGENTSUS03: How do I establish durable patterns for agent interactions with users and business processes?

ID	Best practice	Risk
AGENTSUS03-BP01	Maintain organizational skills and competencies	High
AGENTSUS03-BP02	Build agents to mirror your organizational skills and competencies	Medium
AGENTSUS03-BP03	Maintain comprehensive specifications for agents and agentic systems	High
AGENTSUS03-BP04	Decommission unused agents and prevent agent sprawl	Medium

Summary

Best practice summary by pillar
Pillar	Questions	Best practices	High risk	Medium risk	Low risk
Operational excellence	7	26	18	8	0
Security	9	30	19	11	0
Reliability	8	33	18	15	0
Performance efficiency	7	24	16	7	1
Cost optimization	7	24	14	10	0
Sustainability	3	13	5	8	0
Total	41	150	90	59	1
