Appendix A: Best practice reference - Agentic AI Lens

This appendix provides a complete reference of all best practices defined in the Agentic AI Lens, organized by pillar and focus area.
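
Each best practice ID in this appendix combines a pillar prefix, a two-digit question number, and a two-digit best practice number (for example, AGENTOPS01-BP01), and each row ends with its risk level. As a minimal sketch, assuming the rows are available as plain text in the form `ID title Risk`, the hypothetical helper below splits a row into its parts. The names `BestPractice`, `ROW`, and `parse_row` are illustrative and not part of the lens.

```python
import re
from typing import NamedTuple


class BestPractice(NamedTuple):
    pillar: str    # pillar prefix without "AGENT", e.g. "OPS" or "SEC"
    question: int  # question number within the pillar
    number: int    # best practice number within the question
    title: str
    risk: str      # "High", "Medium", or "Low"


# Assumed row layout: "<ID> <title> <risk>", as the rows appear in this appendix.
ROW = re.compile(
    r"^AGENT(?P<pillar>[A-Z]+)(?P<question>\d{2})-BP(?P<number>\d{2})\s+"
    r"(?P<title>.+?)\s+(?P<risk>High|Medium|Low)$"
)


def parse_row(line: str) -> BestPractice:
    """Split one reference row into pillar, question, number, title, and risk."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"not a best practice row: {line!r}")
    return BestPractice(
        m["pillar"], int(m["question"]), int(m["number"]), m["title"], m["risk"]
    )


bp = parse_row("AGENTSEC02-BP01 Implement tool authorization High")
print(bp.pillar, bp.question, bp.number, bp.risk)
```

Parsed rows can then be grouped by `pillar` or filtered by `risk` when building a review checklist.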

Operational excellence

AGENTOPS01: How do you establish operational practices for agentic AI systems?

ID Best practice Risk
AGENTOPS01-BP01 Establish well-defined agent roles, responsibilities, and success criteria High
AGENTOPS01-BP02 Design multi-agent handoff procedures with human-in-the-loop escalation High
AGENTOPS01-BP03 Develop test scenarios that accurately capture failures of dependent components, orchestration protocols, and business processes High

AGENTOPS02: How do you manage prompt and configuration lifecycle?

ID Best practice Risk
AGENTOPS02-BP01 Evolve agent prompts, tool calls, and configurations to reflect evolving business needs High
AGENTOPS02-BP02 Implement configuration drift detection and remediation High
AGENTOPS02-BP03 Implement agent behavior versioning and rollback capabilities High
AGENTOPS02-BP04 Maintain feedback control loops for continuous improvement Medium

AGENTOPS03: How do you manage agent lifecycle and deployment processes?

ID Best practice Risk
AGENTOPS03-BP01 Define an agent lifecycle with clear SME ownership, testing, and governance High
AGENTOPS03-BP02 Implement CI/CD pipelines tailored to agentic system deployment (AgentOps) High
AGENTOPS03-BP03 Implement agent-specific scaling policies and capacity planning Medium
AGENTOPS03-BP04 Implement organizational agent portfolio management and governance at scale High

AGENTOPS04: How do you establish tool integration and management practices?

ID Best practice Risk
AGENTOPS04-BP01 Implement tool registry and catalog management Medium
AGENTOPS04-BP02 Establish standardized tool integration protocols (MCP, A2A) High
AGENTOPS04-BP03 Develop fallback behavior and error handling for tool invocations Medium

AGENTOPS05: How do you implement comprehensive observability and monitoring for agentic systems?

ID Best practice Risk
AGENTOPS05-BP01 Establish end-to-end tracing and telemetry for agent operations High
AGENTOPS05-BP02 Monitor agent behavior patterns and detect anomalies High
AGENTOPS05-BP03 Implement structured logging and comprehensive audit trails High
AGENTOPS05-BP04 Define and track KPIs for agent workflows Medium
AGENTOPS05-BP05 Create workflow-specific dashboards for operational health Medium

AGENTOPS06: How do you implement testing, evaluation, and validation frameworks?

ID Best practice Risk
AGENTOPS06-BP01 Design multi-layered testing frameworks High
AGENTOPS06-BP02 Evaluate and track ongoing agent performance High
AGENTOPS06-BP03 Establish SME-driven validation and business approval workflows High

AGENTOPS07: How do you establish operational recovery and consumption monitoring?

ID Best practice Risk
AGENTOPS07-BP01 Implement automated response and recovery mechanisms High
AGENTOPS07-BP02 Establish operational knowledge management systems Medium
AGENTOPS07-BP03 Augment change management to accommodate technical improvements and business requirements Medium
AGENTOPS07-BP04 Implement break-glass operational runbooks High

Security

AGENTSEC01: How do you secure agentic memory and securely manage state between agents?

ID Best practice Risk
AGENTSEC01-BP01 Implement memory isolation and integrity controls High
AGENTSEC01-BP02 Validate and sanitize memory inputs High
AGENTSEC01-BP03 Monitor for hallucination propagation Medium

AGENTSEC02: How do you control and secure agent tool usage?

ID Best practice Risk
AGENTSEC02-BP01 Implement tool authorization High
AGENTSEC02-BP02 Validate tool inputs and outputs High
AGENTSEC02-BP03 Maintain approved tool registry with security assessments Medium

AGENTSEC03: How do you manage agent identities, permissions, and prevent privilege escalation?

ID Best practice Risk
AGENTSEC03-BP01 Implement strong authentication for agent identities High
AGENTSEC03-BP02 Separate agent and human user permissions High
AGENTSEC03-BP03 Implement least privilege with dynamic boundaries High
AGENTSEC03-BP04 Regular permission audits and access reviews Medium

AGENTSEC04: How do you support agent goal alignment and prevent manipulation?

ID Best practice Risk
AGENTSEC04-BP01 Implement guardrails and alignment controls High
AGENTSEC04-BP02 Human-in-the-loop for critical decisions High

AGENTSEC05: How do you implement observability and prevent repudiation?

ID Best practice Risk
AGENTSEC05-BP01 Implement comprehensive logging and decision artifact storage High
AGENTSEC05-BP02 Implement distributed tracing for agent interactions Medium

AGENTSEC06: How do you secure multi-agent orchestration and coordination?

ID Best practice Risk
AGENTSEC06-BP01 Encrypt and sign inter-agent messages High
AGENTSEC06-BP02 Implement workflow orchestration security controls High
AGENTSEC06-BP03 Establish trust boundaries between agents High
AGENTSEC06-BP04 Monitor and detect coordination anomalies Medium

AGENTSEC07: How do you protect human oversight from manipulation and detect rogue agents?

ID Best practice Risk
AGENTSEC07-BP01 Implement cognitive load management Medium
AGENTSEC07-BP02 Clear confidence indicators and manipulation warnings Medium
AGENTSEC07-BP03 Multiple reviewers for critical operations Medium
AGENTSEC07-BP04 Behavioral anomaly detection and agent containment High
AGENTSEC07-BP05 Regular security assessments and red teaming Medium

AGENTSEC08: How do you validate and secure agent inputs and outputs?

ID Best practice Risk
AGENTSEC08-BP01 Multi-layer input validation and prompt injection defense High
AGENTSEC08-BP02 Output filtering for sensitive information High

AGENTSEC09: How do you perform vulnerability scanning and penetration testing for agentic AI systems?

ID Best practice Risk
AGENTSEC09-BP01 Integrate AI-powered vulnerability scanning across the development lifecycle High
AGENTSEC09-BP02 Conduct context-aware penetration testing with multi-agent attack simulation High
AGENTSEC09-BP03 Implement continuous security validation with automated remediation Medium
AGENTSEC09-BP04 Establish scoped and controlled testing environments for agent security assessments Medium
AGENTSEC09-BP05 Implement runtime threat detection, security event correlation, and automated remediation for agents High

Reliability

AGENTREL01: How do you develop reliable agentic systems?

ID Best practice Risk
AGENTREL01-BP01 Implement a resilient messaging layer High
AGENTREL01-BP02 Establish modular, fault-isolated layers High
AGENTREL01-BP03 Design specialized agents following actor model principles Medium
AGENTREL01-BP04 Standardize communication protocols Medium
AGENTREL01-BP05 Implement adaptive provisioning Medium

AGENTREL02: How do you develop agentic systems that reliably execute tasks with predictable outcomes?

ID Best practice Risk
AGENTREL02-BP01 Design agents for specific and atomic tasks Medium
AGENTREL02-BP02 Limit agent permissions to minimum required access High
AGENTREL02-BP03 Implement behavioral anomaly detection and monitoring High
AGENTREL02-BP04 Develop clear instruction protocols for agents Medium
AGENTREL02-BP05 Establish tiered human oversight and approval workflows High

AGENTREL03: How do you keep agent memory and state reliably accessible throughout the agent lifecycle?

ID Best practice Risk
AGENTREL03-BP01 Design an information classification model to identify short-term and long-term memories Medium
AGENTREL03-BP02 Architect fault-tolerant memory stores with redundancy and failover High
AGENTREL03-BP03 Implement comprehensive state management and checkpoint-based recovery High
AGENTREL03-BP04 Implement graceful degradation for memory and state operations Medium

AGENTREL04: How do you orchestrate multi-agent systems to reliably execute tasks?

ID Best practice Risk
AGENTREL04-BP01 Implement the arbiter agent pattern for coordinated multi-agent systems High
AGENTREL04-BP02 Classify agents with a comprehensive capability taxonomy Medium
AGENTREL04-BP03 Implement fallback mechanisms and graceful degradation for collaborative workflows High
AGENTREL04-BP04 Implement resilient control planes for agent coordination High

AGENTREL05: How do you implement reliable agent cognition that accesses the right data at the right time?

ID Best practice Risk
AGENTREL05-BP01 Design modular, fault-tolerant agentic reasoning components Medium
AGENTREL05-BP02 Facilitate reliable adaptation through evaluation-driven improvement cycles Medium
AGENTREL05-BP03 Ground agent cognition in real information High

AGENTREL06: How do agents integrate effectively with existing systems without impacting the reliability of established processes?

ID Best practice Risk
AGENTREL06-BP01 Develop agent-based integrations with existing or legacy systems Medium
AGENTREL06-BP02 Establish fallback mechanisms for legacy system degradation High
AGENTREL06-BP03 Regularly test degraded system performance Medium
AGENTREL06-BP04 Implement idempotent task execution patterns High
AGENTREL06-BP05 Implement dynamic capability toggling Medium

AGENTREL07: How do fault-tolerant agent systems recover?

ID Best practice Risk
AGENTREL07-BP01 Design workflows in stages with incremental recovery High
AGENTREL07-BP02 Enable automatic recovery from agent execution failures High
AGENTREL07-BP03 Implement distributed tracing to track system dependencies and facilitate recovery High

AGENTREL08: How do agents determine when and where graceful degradation is appropriate?

ID Best practice Risk
AGENTREL08-BP01 Establish consistent configuration management practices Medium
AGENTREL08-BP02 Implement agent tracing for telemetry throughout agent processing High
AGENTREL08-BP03 Architect agent systems with resource isolation and contention mitigation High
AGENTREL08-BP04 Track agent memory utilization metrics Medium

Performance efficiency

AGENTPERF01: How do you plan strategically for agent performance and establish measurement practices?

ID Best practice Risk
AGENTPERF01-BP01 Define performance-aligned success criteria for agent workloads High
AGENTPERF01-BP02 Implement comprehensive performance telemetry High
AGENTPERF01-BP03 Profile end-to-end agent latency and identify optimization targets High

AGENTPERF02: How do you optimize core agent processing and cognitive pipelines?

ID Best practice Risk
AGENTPERF02-BP01 Design efficient reasoning pipelines High
AGENTPERF02-BP02 Implement task-appropriate model selection strategies High
AGENTPERF02-BP03 Optimize agent execution paths for reduced latency High
AGENTPERF02-BP04 Optimize streaming responses and time-to-first-token for agent interactions High

AGENTPERF03: How do you optimize memory management, context windows, and retrieval-augmented generation?

ID Best practice Risk
AGENTPERF03-BP01 Implement tiered memory management systems High
AGENTPERF03-BP02 Optimize context window utilization and prompt management High
AGENTPERF03-BP03 Optimize RAG retrieval pipelines for latency and precision High
AGENTPERF03-BP04 Establish efficient agent caching and data access patterns Medium
AGENTPERF03-BP05 Implement agentic retrieval patterns for dynamic, agent-driven knowledge access High

AGENTPERF04: How do you achieve efficient communication and protocol usage across agent interactions?

ID Best practice Risk
AGENTPERF04-BP01 Optimize asynchronous message handling patterns High
AGENTPERF04-BP02 Implement efficient protocol-based agent communications Medium
AGENTPERF04-BP03 Design high-performing event-driven integration patterns Medium

AGENTPERF05: How do you optimize workflow orchestration and multi-agent collaboration for performance?

ID Best practice Risk
AGENTPERF05-BP01 Design efficient workflow orchestration patterns High
AGENTPERF05-BP02 Implement optimized multi-agent collaboration models High
AGENTPERF05-BP03 Optimize multi-stage AI pipeline execution Medium
AGENTPERF05-BP04 Implement efficient agent delegation and handoff patterns Medium

AGENTPERF06: How do you optimize tool integrations and framework usage for agent performance?

ID Best practice Risk
AGENTPERF06-BP01 Design optimized tool integration strategies Medium
AGENTPERF06-BP02 Implement efficient tool invocation patterns Medium
AGENTPERF06-BP03 Optimize meta-tool utilization and tool chaining Low

AGENTPERF07: How do you manage multitenant performance isolation and optimize resource utilization?

ID Best practice Risk
AGENTPERF07-BP01 Design efficient multitenant agent deployment models High
AGENTPERF07-BP02 Implement tenant-aware performance isolation and throttling High

Cost optimization

AGENTCOST01: How do you optimize agent reasoning and execution costs?

ID Best practice Risk
AGENTCOST01-BP01 Use the reflection pattern to design efficient agent reasoning loops Medium
AGENTCOST01-BP02 Optimize multi-agent collaboration cost through efficient handoff patterns High
AGENTCOST01-BP03 Implement cost-effective patterns like hybrid supervisor for multi-agent coordination Medium
AGENTCOST01-BP04 Design agent hierarchies and delegation patterns that reduce coordination overhead Medium

AGENTCOST02: How do you optimize agent model invocation and token consumption costs?

ID Best practice Risk
AGENTCOST02-BP01 Architect tiered model selection for cost-performance optimization High
AGENTCOST02-BP02 Cost optimize token consumption through efficient prompt engineering Medium
AGENTCOST02-BP03 Use intelligent caching to reduce redundant model invocations High
AGENTCOST02-BP04 Implement model customization for long-term cost reduction High

AGENTCOST03: How do you manage agent memory and state costs efficiently?

ID Best practice Risk
AGENTCOST03-BP01 Design cost-effective retrieval systems with tiered memory Medium
AGENTCOST03-BP02 Cost optimize through intelligent compression and pruning of context windows High
AGENTCOST03-BP03 Implement cost-optimized state persistence and lifecycle management Medium

AGENTCOST04: How do you optimize agent tool invocation?

ID Best practice Risk
AGENTCOST04-BP01 Design cost-effective tool selection to minimize unnecessary invocations High
AGENTCOST04-BP02 Cost optimize tool serving through serverless and resource sharing High
AGENTCOST04-BP03 Implement intelligent caching and failure handling for tool results High

AGENTCOST05: How do you implement cost attribution?

ID Best practice Risk
AGENTCOST05-BP01 Establish agent-level reasoning cost tracking and attribution Medium
AGENTCOST05-BP02 Implement distributed cost tracing for multi-agent workflows High
AGENTCOST05-BP03 Design tenant-aware cost allocation for AaaS pricing models High
AGENTCOST05-BP04 Create chargeback and ROI reporting Medium

AGENTCOST06: How do you optimize agent discovery, registry, and deployment costs?

ID Best practice Risk
AGENTCOST06-BP01 Implement lightweight discovery and registry for cost-effective collaboration Medium
AGENTCOST06-BP02 Cost optimize versioning and deployment through efficient artifact management Medium
AGENTCOST06-BP03 Design cost-efficient initialization through warm pools and caching High

AGENTCOST07: How do you establish agent cost governance and continuous optimization?

ID Best practice Risk
AGENTCOST07-BP01 Implement automated cost controls with intelligent cutoffs High
AGENTCOST07-BP02 Establish proactive anomaly detection for agent cost patterns High
AGENTCOST07-BP03 Create systematic optimization feedback loops for continuous improvement High

Sustainability

AGENTSUS01: How do you build sustainable and repeatable frameworks for managing compute, memory, and other shareable agent resources?

ID Best practice Risk
AGENTSUS01-BP01 Design specialized agents with explicit resource boundaries High
AGENTSUS01-BP02 Implement reusable workflow patterns Medium
AGENTSUS01-BP03 Optimize resource utilization through shared services Medium
AGENTSUS01-BP04 Scale cognitive processing pathways appropriately High
AGENTSUS01-BP05 Adopt specification-driven tasks for frontier agents and long-running workflows High

AGENTSUS02: How do you establish sustainable frameworks for agent dependencies?

ID Best practice Risk
AGENTSUS02-BP01 Optimize context management and memory utilization Medium
AGENTSUS02-BP02 Establish efficient agent caching strategies Medium
AGENTSUS02-BP03 Appropriately scale data, networking, and compute dependencies Medium
AGENTSUS02-BP04 Measure and optimize the environmental footprint of agent workloads Medium

AGENTSUS03: How do you establish durable patterns for agent interactions with users and business processes?

ID Best practice Risk
AGENTSUS03-BP01 Maintain organizational skills and competencies High
AGENTSUS03-BP02 Build agents to mirror your organizational skills and competencies Medium
AGENTSUS03-BP03 Maintain comprehensive specifications for agents and agentic systems High
AGENTSUS03-BP04 Decommission unused agents and prevent agent sprawl Medium

Summary

Best practice summary by pillar
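| Pillar | Questions | Best practices | High risk | Medium risk | Low risk |
| --- | --- | --- | --- | --- | --- |
| Operational Excellence | 7 | 26 | 18 | 8 | 0 |
| Security | 9 | 30 | 19 | 11 | 0 |
| Reliability | 8 | 33 | 18 | 15 | 0 |
| Performance Efficiency | 7 | 24 | 16 | 7 | 1 |
| Cost Optimization | 7 | 24 | 14 | 10 | 0 |
| Sustainability | 3 | 13 | 5 | 8 | 0 |
| Total | 41 | 150 | 90 | 59 | 1 |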
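
The totals row can be cross-checked against the per-pillar rows. The sketch below is one way to do that, assuming the per-pillar counts are copied by hand from the summary table. The names `SUMMARY` and `totals` are illustrative and not part of the lens.

```python
# Per-pillar counts copied from the summary table:
# (questions, high risk, medium risk, low risk)
SUMMARY = {
    "Operational Excellence": (7, 18, 8, 0),
    "Security": (9, 19, 11, 0),
    "Reliability": (8, 18, 15, 0),
    "Performance Efficiency": (7, 16, 7, 1),
    "Cost Optimization": (7, 14, 10, 0),
    "Sustainability": (3, 5, 8, 0),
}


def totals(summary):
    """Sum each column; best practices per pillar = high + medium + low."""
    questions = sum(q for q, _, _, _ in summary.values())
    high = sum(h for _, h, _, _ in summary.values())
    medium = sum(m for _, _, m, _ in summary.values())
    low = sum(l for _, _, _, l in summary.values())
    return {
        "questions": questions,
        "best_practices": high + medium + low,
        "high": high,
        "medium": medium,
        "low": low,
    }


print(totals(SUMMARY))
# {'questions': 41, 'best_practices': 150, 'high': 90, 'medium': 59, 'low': 1}
```

The result agrees with the Total row: 41 questions and 150 best practices, of which 90 are high risk, 59 are medium risk, and 1 is low risk.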