Introduction: Why Design Patterns Matter More Than Ever in AI Development
Here's the uncomfortable truth: most AI agent codebases I've reviewed in production are an unmaintainable mess. Teams rush to ship features, hardcode behavior directly into agent loops, and create monolithic classes that do everything from prompt construction to API calls to response parsing. Six months later, when requirements change—and they always do—developers are afraid to touch the code because one modification breaks three other things. The excitement of "AI-powered" features quickly turns into technical debt that slows every subsequent release.
This isn't a problem unique to AI development, but it's amplified by the rapid iteration cycles and experimental nature of working with large language models. Traditional software engineering solved similar problems decades ago with design patterns—proven solutions to recurring problems that make code more flexible, testable, and maintainable. Two patterns in particular, Strategy and Chain of Responsibility, offer elegant solutions to the most common architectural challenges in AI agent systems: managing multiple behavior variations and processing requests through sequential decision points.
Before you dismiss this as "overengineering," let me be clear: I'm not advocating for using design patterns everywhere. If you're building a proof-of-concept or a simple chatbot that does one thing, a 50-line Python script might be perfectly fine. But if you're building production AI agents that need to handle multiple user intents, integrate with various tools, make context-dependent decisions, or scale across a team, these patterns will save you months of refactoring pain. The patterns themselves aren't new—they come from the Gang of Four's "Design Patterns: Elements of Reusable Object-Oriented Software" published in 1994—but their application to AI agents is where things get interesting.
Understanding the Strategy Pattern: Swapping Algorithms at Runtime
The Strategy pattern is deceptively simple: it defines a family of algorithms, encapsulates each one, and makes them interchangeable. In the context of AI agents, this translates to defining different behavior strategies—like different prompting approaches, reasoning methods, or tool selection algorithms—and allowing your agent to switch between them based on context or configuration. The key insight is separating the "what" from the "how": your agent knows what it needs to accomplish, but the strategy determines how it accomplishes it.
Let's be honest about when you actually need this pattern. If your AI agent always uses the same prompting approach and always processes requests the same way, implementing the Strategy pattern is premature optimization. You don't need it. But the moment you find yourself writing conditional logic like "if user_type == 'technical': use_detailed_prompt() elif user_type == 'casual': use_simple_prompt()", you're manually implementing a poor version of the Strategy pattern. Once you have three or more behavior variations that are likely to grow, it's time to structure them properly.
Here's a real example from an AI customer support agent I helped redesign. The original code had a single generate_response() method with nested conditionals checking user subscription level, time of day, and conversation sentiment to decide whether to use a formal tone, casual tone, or empathetic tone. Adding a new tone required modifying that central method, risking regressions. We refactored it using the Strategy pattern:
from abc import ABC, abstractmethod
from typing import Dict, Any
from openai import OpenAI


class ResponseStrategy(ABC):
    """Base strategy for generating responses"""

    @abstractmethod
    def generate_response(self, context: Dict[str, Any], user_message: str) -> str:
        """Generate a response based on the strategy's approach"""
        pass

    @abstractmethod
    def get_system_prompt(self) -> str:
        """Return the system prompt for this strategy"""
        pass


class FormalResponseStrategy(ResponseStrategy):
    """Strategy for formal, professional responses"""

    def __init__(self, client: OpenAI):
        self.client = client

    def get_system_prompt(self) -> str:
        return """You are a professional customer support representative.
        Use formal language, proper grammar, and maintain a respectful tone.
        Address customers as 'Dear Customer' or by their title if known."""

    def generate_response(self, context: Dict[str, Any], user_message: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": self.get_system_prompt()},
                {"role": "user", "content": f"Context: {context}\n\nMessage: {user_message}"}
            ],
            temperature=0.3  # Lower temperature for more consistent formal tone
        )
        return response.choices[0].message.content


class EmpathicResponseStrategy(ResponseStrategy):
    """Strategy for empathetic responses during customer frustration"""

    def __init__(self, client: OpenAI):
        self.client = client

    def get_system_prompt(self) -> str:
        return """You are a compassionate customer support representative.
        Acknowledge customer frustration, validate their feelings, and show genuine empathy.
        Use phrases like 'I understand this must be frustrating' and 'I'm here to help'.
        Focus on emotional connection before solving the problem."""

    def generate_response(self, context: Dict[str, Any], user_message: str) -> str:
        # Add sentiment context to help the model understand the emotional state
        enriched_context = {
            **context,
            "customer_sentiment": "frustrated",
            "priority": "emotional_support_first"
        }
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": self.get_system_prompt()},
                {"role": "user", "content": f"Context: {enriched_context}\n\nMessage: {user_message}"}
            ],
            temperature=0.7  # Higher temperature for more natural, varied empathetic responses
        )
        return response.choices[0].message.content


class CasualResponseStrategy(ResponseStrategy):
    """Strategy for casual, friendly responses"""

    def __init__(self, client: OpenAI):
        self.client = client

    def get_system_prompt(self) -> str:
        return """You are a friendly, approachable customer support representative.
        Use casual language, contractions, and an upbeat tone.
        It's okay to use emojis occasionally and conversational phrases like 'Hey there!' or 'No worries!'"""

    def generate_response(self, context: Dict[str, Any], user_message: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": self.get_system_prompt()},
                {"role": "user", "content": f"Context: {context}\n\nMessage: {user_message}"}
            ],
            temperature=0.8  # Higher temperature for more casual variation
        )
        return response.choices[0].message.content


class CustomerSupportAgent:
    """AI agent that uses different response strategies based on context"""

    def __init__(self, client: OpenAI):
        self.client = client
        # Initialize all available strategies
        self.strategies = {
            "formal": FormalResponseStrategy(client),
            "empathic": EmpathicResponseStrategy(client),
            "casual": CasualResponseStrategy(client)
        }
        self.current_strategy = self.strategies["formal"]  # Default strategy

    def set_strategy(self, strategy_name: str) -> None:
        """Dynamically switch strategies at runtime"""
        if strategy_name in self.strategies:
            self.current_strategy = self.strategies[strategy_name]
        else:
            raise ValueError(f"Unknown strategy: {strategy_name}")

    def respond(self, user_message: str, context: Dict[str, Any]) -> str:
        """Generate response using the current strategy"""
        return self.current_strategy.generate_response(context, user_message)

    def auto_select_strategy(self, context: Dict[str, Any]) -> None:
        """Automatically select strategy based on context"""
        # Business logic for strategy selection
        if context.get("sentiment_score", 0) < -0.5:
            self.set_strategy("empathic")
        elif context.get("user_subscription_tier") == "enterprise":
            self.set_strategy("formal")
        else:
            self.set_strategy("casual")


# Usage example
if __name__ == "__main__":
    client = OpenAI(api_key="your-api-key")
    agent = CustomerSupportAgent(client)

    # Context from your application
    context = {
        "user_id": "12345",
        "user_subscription_tier": "enterprise",
        "conversation_history": [],
        "sentiment_score": 0.2
    }

    # Auto-select strategy based on context
    agent.auto_select_strategy(context)

    # Generate response
    response = agent.respond("I need help with my account settings", context)
    print(f"Agent: {response}")

    # Manually override strategy if needed
    agent.set_strategy("casual")
    response = agent.respond("Thanks for your help!", context)
    print(f"Agent: {response}")
Now adding a new tone strategy is as simple as creating a new class that implements the ResponseStrategy interface. No touching the core agent logic, no risk of breaking existing strategies, and each strategy is independently testable. This is the Strategy pattern's real power: it turns what would be tangled conditional logic into clean, modular components that can be developed, tested, and deployed independently.
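To make that concrete, here's a sketch of what a new strategy plus a unit test could look like. Everything here is hypothetical scaffolding: TechnicalResponseStrategy is an invented fourth tone, and the fake client simply records the API call so the test runs without a network connection or API key.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


# Minimal stand-in for the ResponseStrategy interface above.
class ResponseStrategy(ABC):
    @abstractmethod
    def get_system_prompt(self) -> str: ...

    @abstractmethod
    def generate_response(self, context: Dict[str, Any], user_message: str) -> str: ...


# Hypothetical new strategy: terse, precise answers for technical users.
class TechnicalResponseStrategy(ResponseStrategy):
    def __init__(self, client: Any):
        self.client = client

    def get_system_prompt(self) -> str:
        return ("You are a support engineer. Be precise and terse, "
                "and include exact commands or settings where relevant.")

    def generate_response(self, context: Dict[str, Any], user_message: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": self.get_system_prompt()},
                {"role": "user", "content": f"Context: {context}\n\nMessage: {user_message}"},
            ],
            temperature=0.2,  # Low temperature for consistent technical answers
        )
        return response.choices[0].message.content


# Fake client that records the call it received -- no API key or network needed.
class _FakeCompletions:
    def create(self, **kwargs: Any) -> Any:
        self.last_call = kwargs

        class _Message:
            content = "stubbed reply"

        class _Choice:
            message = _Message()

        class _Response:
            choices = [_Choice()]

        return _Response()


class _FakeClient:
    def __init__(self) -> None:
        self.chat = type("Chat", (), {"completions": _FakeCompletions()})()


client = _FakeClient()
strategy = TechnicalResponseStrategy(client)
reply = strategy.generate_response({"user_id": "42"}, "How do I rotate my API key?")
print(reply)  # -> stubbed reply
print(client.chat.completions.last_call["model"])  # -> gpt-4
```

Registering the new tone is then one extra entry in the agent's strategies dict; nothing else in CustomerSupportAgent changes, and the test exercises prompt construction in complete isolation.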
The Chain of Responsibility Pattern: Building Decision Pipelines
If the Strategy pattern is about choosing how to do something, the Chain of Responsibility pattern is about deciding who should handle a request. It passes a request along a chain of handlers, where each handler either processes the request or passes it to the next handler in the chain. In AI agent systems, this pattern excels at building processing pipelines: input validation, intent classification, authorization checks, rate limiting, and request routing can all be modeled as handlers in a chain.
The brutally honest assessment? Most developers don't need this pattern initially. They need it after their agent has grown complex enough that the main processing method has become a 200-line function with multiple validation steps, conditional processing, and error handling all mixed together. If you're still in the early stages and your processing logic fits comfortably in one method, don't prematurely optimize. But once you're juggling multiple preprocessing steps, or different handlers for different types of requests, the Chain of Responsibility pattern will clean up your code significantly.
The pattern's real strength emerges when different requests need different processing paths. Imagine an AI agent that handles user queries, system commands, and administrative actions. A user query might need sentiment analysis and intent classification before reaching the response generator. A system command might skip those steps but require authentication. An administrative action might need both authentication and audit logging. Without a clear pattern, you end up with spaghetti code full of if-then-else statements checking request types at every step.
Here's how to implement it properly for an AI agent system. I've seen production implementations where teams create elaborate chain configurations that are impossible to debug. The key is keeping each handler focused, making the chain order explicit, and ensuring handlers have a single, clear responsibility:
// Base handler interface
interface RequestHandler {
  setNext(handler: RequestHandler): RequestHandler;
  handle(request: AgentRequest): Promise<AgentResponse | null>;
}

// Abstract base handler implementing the chaining mechanism
abstract class AbstractRequestHandler implements RequestHandler {
  private nextHandler: RequestHandler | null = null;

  public setNext(handler: RequestHandler): RequestHandler {
    this.nextHandler = handler;
    return handler; // Allows chaining: handler1.setNext(handler2).setNext(handler3)
  }

  public async handle(request: AgentRequest): Promise<AgentResponse | null> {
    // Try to handle the request
    const result = await this.process(request);

    // If this handler processed it, return the result
    if (result !== null) {
      return result;
    }

    // Otherwise, pass to the next handler
    if (this.nextHandler) {
      return this.nextHandler.handle(request);
    }

    // End of chain, no handler processed the request
    return null;
  }

  // Subclasses implement their specific processing logic
  protected abstract process(request: AgentRequest): Promise<AgentResponse | null>;
}

// Type definitions
interface AgentRequest {
  type: 'query' | 'command' | 'admin';
  userId: string;
  content: string;
  metadata: Record<string, any>;
}

interface AgentResponse {
  success: boolean;
  content: string;
  handledBy: string;
  metadata?: Record<string, any>;
}

// Concrete handler: Rate limiting
class RateLimitHandler extends AbstractRequestHandler {
  private requestCounts: Map<string, { count: number; resetTime: number }> = new Map();
  private readonly maxRequests = 10;
  private readonly windowMs = 60000; // 1 minute

  protected async process(request: AgentRequest): Promise<AgentResponse | null> {
    const now = Date.now();
    const userLimit = this.requestCounts.get(request.userId);

    // Reset if window expired
    if (userLimit && now > userLimit.resetTime) {
      this.requestCounts.delete(request.userId);
    }

    const current = this.requestCounts.get(request.userId);
    if (!current) {
      this.requestCounts.set(request.userId, {
        count: 1,
        resetTime: now + this.windowMs
      });
      return null; // Continue to next handler
    }

    if (current.count >= this.maxRequests) {
      // Rate limit exceeded - stop the chain
      return {
        success: false,
        content: 'Rate limit exceeded. Please try again later.',
        handledBy: 'RateLimitHandler',
        metadata: { resetTime: current.resetTime }
      };
    }

    current.count++;
    return null; // Continue to next handler
  }
}

// Concrete handler: Authentication for admin requests
class AuthenticationHandler extends AbstractRequestHandler {
  private adminUsers = new Set(['admin1', 'admin2', 'admin3']);

  protected async process(request: AgentRequest): Promise<AgentResponse | null> {
    // Only authenticate admin requests
    if (request.type !== 'admin') {
      return null; // Not our concern, pass to next handler
    }

    if (!this.adminUsers.has(request.userId)) {
      // Authentication failed - stop the chain
      return {
        success: false,
        content: 'Unauthorized. Admin privileges required.',
        handledBy: 'AuthenticationHandler'
      };
    }

    // Authentication passed, continue to next handler
    return null;
  }
}

// Concrete handler: Input validation
class ValidationHandler extends AbstractRequestHandler {
  protected async process(request: AgentRequest): Promise<AgentResponse | null> {
    // Validate content length
    if (!request.content || request.content.trim().length === 0) {
      return {
        success: false,
        content: 'Request content cannot be empty.',
        handledBy: 'ValidationHandler'
      };
    }

    if (request.content.length > 10000) {
      return {
        success: false,
        content: 'Request content exceeds maximum length of 10,000 characters.',
        handledBy: 'ValidationHandler'
      };
    }

    // Check for malicious content patterns
    const maliciousPatterns = ['<script>', 'DROP TABLE', 'DELETE FROM'];
    if (maliciousPatterns.some(pattern => request.content.includes(pattern))) {
      return {
        success: false,
        content: 'Request contains potentially malicious content.',
        handledBy: 'ValidationHandler'
      };
    }

    // Validation passed, continue to next handler
    return null;
  }
}

// Concrete handler: Query processing
class QueryHandler extends AbstractRequestHandler {
  protected async process(request: AgentRequest): Promise<AgentResponse | null> {
    if (request.type !== 'query') {
      return null; // Not a query, pass to next handler
    }

    // This is where you'd call your LLM
    // Simplified example
    const response = await this.processQuery(request.content);

    return {
      success: true,
      content: response,
      handledBy: 'QueryHandler',
      metadata: { processingTime: Date.now() }
    };
  }

  private async processQuery(content: string): Promise<string> {
    // Simulate LLM call
    return `Processed query: ${content}`;
  }
}

// Concrete handler: Command execution
class CommandHandler extends AbstractRequestHandler {
  private readonly validCommands = ['status', 'help', 'version', 'reset'];

  protected async process(request: AgentRequest): Promise<AgentResponse | null> {
    if (request.type !== 'command') {
      return null; // Not a command, pass to next handler
    }

    const command = request.content.split(' ')[0].toLowerCase();

    if (!this.validCommands.includes(command)) {
      return {
        success: false,
        content: `Unknown command: ${command}. Valid commands: ${this.validCommands.join(', ')}`,
        handledBy: 'CommandHandler'
      };
    }

    // Execute command
    const result = await this.executeCommand(command, request);

    return {
      success: true,
      content: result,
      handledBy: 'CommandHandler'
    };
  }

  private async executeCommand(command: string, request: AgentRequest): Promise<string> {
    // Simplified command execution
    switch (command) {
      case 'status':
        return 'System is operational';
      case 'help':
        return 'Available commands: ' + this.validCommands.join(', ');
      case 'version':
        return 'Agent v1.0.0';
      case 'reset':
        return 'Session reset successfully';
      default:
        return 'Command executed';
    }
  }
}

// Admin handler
class AdminHandler extends AbstractRequestHandler {
  protected async process(request: AgentRequest): Promise<AgentResponse | null> {
    if (request.type !== 'admin') {
      return null;
    }

    // Process admin action
    const result = await this.processAdminAction(request);

    return {
      success: true,
      content: result,
      handledBy: 'AdminHandler',
      metadata: { auditLog: true, timestamp: Date.now() }
    };
  }

  private async processAdminAction(request: AgentRequest): Promise<string> {
    // Simulate admin action
    return `Admin action processed: ${request.content}`;
  }
}

// Agent class that uses the chain
class AIAgent {
  private handlerChain: RequestHandler;

  constructor() {
    // Build the handler chain
    // Order matters! Earlier handlers can stop the chain
    const rateLimiter = new RateLimitHandler();
    const validator = new ValidationHandler();
    const authenticator = new AuthenticationHandler();
    const queryHandler = new QueryHandler();
    const commandHandler = new CommandHandler();
    const adminHandler = new AdminHandler();

    // Configure the chain
    rateLimiter
      .setNext(validator)
      .setNext(authenticator)
      .setNext(queryHandler)
      .setNext(commandHandler)
      .setNext(adminHandler);

    this.handlerChain = rateLimiter;
  }

  public async processRequest(request: AgentRequest): Promise<AgentResponse> {
    const response = await this.handlerChain.handle(request);

    if (response === null) {
      return {
        success: false,
        content: 'No handler could process this request.',
        handledBy: 'None'
      };
    }

    return response;
  }
}

// Usage example
async function main() {
  const agent = new AIAgent();

  // Test different request types
  const requests: AgentRequest[] = [
    {
      type: 'query',
      userId: 'user123',
      content: 'What is the weather like today?',
      metadata: {}
    },
    {
      type: 'command',
      userId: 'user123',
      content: 'status',
      metadata: {}
    },
    {
      type: 'admin',
      userId: 'admin1',
      content: 'delete_old_logs',
      metadata: {}
    },
    {
      type: 'admin',
      userId: 'user123', // Not an admin
      content: 'delete_old_logs',
      metadata: {}
    }
  ];

  for (const request of requests) {
    const response = await agent.processRequest(request);
    console.log(`\nRequest Type: ${request.type}`);
    console.log(`Response: ${response.content}`);
    console.log(`Handled By: ${response.handledBy}`);
    console.log(`Success: ${response.success}`);
  }
}

main();
The beauty of this pattern is that handlers are decoupled from each other and from the client code. Want to add audit logging for all requests? Create a LoggingHandler and insert it at the start of the chain. Need to remove rate limiting for testing? Just skip that handler when building the chain. Each handler focuses on one responsibility, making the code easier to test, debug, and reason about. The chain order is explicit and easy to modify, unlike buried conditional logic that requires archaeological excavation to understand.
Real-World Implementation: Building a Multi-Model AI Agent System
Let's talk about a real production challenge I encountered: building an AI agent that needed to route requests to different LLM providers based on cost, capabilities, and latency requirements. Simple queries could use a fast, cheap model like GPT-3.5. Complex reasoning tasks needed GPT-4. Queries requiring up-to-date information needed models with internet access. The initial implementation was a nightmare of nested conditionals that checked request characteristics and then manually selected models and prompting approaches.
We refactored this using both patterns together. The Chain of Responsibility handled request preprocessing (validation, intent classification, complexity analysis), while the Strategy pattern managed model selection and prompting approaches. The result was cleaner, more maintainable, and easier to extend. When OpenAI released GPT-4 Turbo, adding support took less than an hour instead of a day of regression testing.
Here's a simplified but realistic implementation showing how these patterns work together. This isn't a toy example—this architecture handles similar complexity to what you'd encounter in production systems dealing with multiple LLM providers, cost optimization, and varying quality requirements:
from abc import ABC, abstractmethod
from typing import Dict, Any, Optional, List
from dataclasses import dataclass
from enum import Enum
import time


class ModelProvider(Enum):
    """Available LLM providers"""
    OPENAI_GPT35 = "openai-gpt-3.5-turbo"
    OPENAI_GPT4 = "openai-gpt-4"
    ANTHROPIC_CLAUDE = "anthropic-claude-3"
    OPENAI_GPT4_TURBO = "openai-gpt-4-turbo"


@dataclass
class ModelCost:
    """Cost structure for different models"""
    provider: ModelProvider
    cost_per_1k_tokens: float
    avg_latency_ms: int
    max_tokens: int
    capabilities: List[str]


@dataclass
class AgentRequest:
    """Request structure for the AI agent"""
    content: str
    user_id: str
    request_type: str
    complexity_score: Optional[float] = None
    requires_reasoning: bool = False
    requires_internet: bool = False
    max_cost_tolerance: Optional[float] = None
    metadata: Optional[Dict[str, Any]] = None

    def __post_init__(self):
        if self.metadata is None:
            self.metadata = {}


@dataclass
class AgentResponse:
    """Response structure from the AI agent"""
    content: str
    model_used: ModelProvider
    cost: float
    latency_ms: int
    success: bool
    handled_by: str
    metadata: Optional[Dict[str, Any]] = None


# MODEL SELECTION STRATEGIES (Strategy Pattern)

class ModelSelectionStrategy(ABC):
    """Base strategy for selecting which model to use"""

    # Cost information for different models (in a real app, this would be external config)
    MODEL_COSTS = {
        ModelProvider.OPENAI_GPT35: ModelCost(
            provider=ModelProvider.OPENAI_GPT35,
            cost_per_1k_tokens=0.002,
            avg_latency_ms=800,
            max_tokens=4096,
            capabilities=["chat", "completion"]
        ),
        ModelProvider.OPENAI_GPT4: ModelCost(
            provider=ModelProvider.OPENAI_GPT4,
            cost_per_1k_tokens=0.03,
            avg_latency_ms=2500,
            max_tokens=8192,
            capabilities=["chat", "completion", "reasoning", "complex_analysis"]
        ),
        ModelProvider.ANTHROPIC_CLAUDE: ModelCost(
            provider=ModelProvider.ANTHROPIC_CLAUDE,
            cost_per_1k_tokens=0.015,
            avg_latency_ms=1500,
            max_tokens=100000,
            capabilities=["chat", "completion", "long_context", "analysis"]
        ),
        ModelProvider.OPENAI_GPT4_TURBO: ModelCost(
            provider=ModelProvider.OPENAI_GPT4_TURBO,
            cost_per_1k_tokens=0.01,
            avg_latency_ms=1200,
            max_tokens=128000,
            capabilities=["chat", "completion", "reasoning", "internet_access"]
        ),
    }

    @abstractmethod
    def select_model(self, request: AgentRequest) -> ModelProvider:
        """Select the appropriate model based on request characteristics"""
        pass

    @abstractmethod
    def get_prompt_template(self, request: AgentRequest) -> str:
        """Get the prompt template optimized for the selected model"""
        pass


class CostOptimizedStrategy(ModelSelectionStrategy):
    """Strategy that prioritizes cost efficiency"""

    def select_model(self, request: AgentRequest) -> ModelProvider:
        # Always try to use the cheapest model that meets requirements
        if request.requires_internet:
            return ModelProvider.OPENAI_GPT4_TURBO
        if request.requires_reasoning or (request.complexity_score and request.complexity_score > 0.7):
            # Use cheaper Claude for reasoning if no internet needed
            return ModelProvider.ANTHROPIC_CLAUDE
        # Default to cheapest option
        return ModelProvider.OPENAI_GPT35

    def get_prompt_template(self, request: AgentRequest) -> str:
        return f"Answer concisely and efficiently. User query: {request.content}"


class QualityOptimizedStrategy(ModelSelectionStrategy):
    """Strategy that prioritizes response quality over cost"""

    def select_model(self, request: AgentRequest) -> ModelProvider:
        # Use the best model available for the task
        if request.requires_internet:
            return ModelProvider.OPENAI_GPT4_TURBO
        if request.requires_reasoning or (request.complexity_score and request.complexity_score > 0.5):
            return ModelProvider.OPENAI_GPT4
        # Still use GPT-4 for quality even on simpler tasks
        return ModelProvider.OPENAI_GPT4

    def get_prompt_template(self, request: AgentRequest) -> str:
        return f"""Provide a comprehensive, high-quality response.
        Take time to reason through the problem carefully.
        User query: {request.content}"""


class BalancedStrategy(ModelSelectionStrategy):
    """Strategy that balances cost and quality"""

    def select_model(self, request: AgentRequest) -> ModelProvider:
        # Check cost tolerance first
        if request.max_cost_tolerance and request.max_cost_tolerance < 0.01:
            return ModelProvider.OPENAI_GPT35
        if request.requires_internet:
            return ModelProvider.OPENAI_GPT4_TURBO
        # Use complexity score to decide
        if request.complexity_score:
            if request.complexity_score > 0.8:
                return ModelProvider.OPENAI_GPT4
            elif request.complexity_score > 0.5:
                return ModelProvider.OPENAI_GPT4_TURBO
        return ModelProvider.OPENAI_GPT35

    def get_prompt_template(self, request: AgentRequest) -> str:
        return f"""Provide a balanced response that is both efficient and accurate.
        User query: {request.content}"""


# REQUEST HANDLERS (Chain of Responsibility Pattern)

class RequestHandler(ABC):
    """Base handler in the chain of responsibility"""

    def __init__(self):
        self._next_handler: Optional['RequestHandler'] = None

    def set_next(self, handler: 'RequestHandler') -> 'RequestHandler':
        self._next_handler = handler
        return handler

    def handle(self, request: AgentRequest) -> Optional[AgentRequest]:
        """Process request and pass to next handler or return modified request"""
        processed_request = self.process(request)
        if processed_request and self._next_handler:
            return self._next_handler.handle(processed_request)
        return processed_request

    @abstractmethod
    def process(self, request: AgentRequest) -> Optional[AgentRequest]:
        """Process the request in this handler"""
        pass


class ComplexityAnalysisHandler(RequestHandler):
    """Analyzes request complexity to help with model selection"""

    def process(self, request: AgentRequest) -> Optional[AgentRequest]:
        # Simple complexity heuristic based on content
        word_count = len(request.content.split())
        has_code = "```" in request.content or "def " in request.content
        has_multiple_questions = request.content.count("?") > 1

        # Calculate complexity score (0-1)
        complexity = 0.0
        complexity += min(word_count / 100, 0.4)  # Max 0.4 for length
        complexity += 0.3 if has_code else 0
        complexity += 0.2 if has_multiple_questions else 0
        complexity += 0.1 if any(word in request.content.lower() for word in
                                 ["analyze", "explain", "compare", "evaluate"]) else 0

        request.complexity_score = min(complexity, 1.0)
        request.metadata["complexity_analysis"] = {
            "word_count": word_count,
            "has_code": has_code,
            "has_multiple_questions": has_multiple_questions
        }
        return request


class IntentClassificationHandler(RequestHandler):
    """Classifies the intent and determines requirements"""

    def process(self, request: AgentRequest) -> Optional[AgentRequest]:
        content_lower = request.content.lower()

        # Detect if reasoning is required
        reasoning_keywords = ["why", "explain", "analyze", "reasoning", "logic", "prove"]
        request.requires_reasoning = any(keyword in content_lower for keyword in reasoning_keywords)

        # Detect if internet access is required
        internet_keywords = ["current", "latest", "today", "news", "recent", "now", "weather"]
        request.requires_internet = any(keyword in content_lower for keyword in internet_keywords)

        request.metadata["intent_classification"] = {
            "requires_reasoning": request.requires_reasoning,
            "requires_internet": request.requires_internet
        }
        return request


class CostGuardHandler(RequestHandler):
    """Enforces cost constraints"""

    def __init__(self, max_cost_per_request: float = 0.5):
        super().__init__()
        self.max_cost_per_request = max_cost_per_request

    def process(self, request: AgentRequest) -> Optional[AgentRequest]:
        # Set max cost tolerance if not already set
        if request.max_cost_tolerance is None:
            request.max_cost_tolerance = self.max_cost_per_request

        # Could add user-specific cost limits here
        user_tier = request.metadata.get("user_tier", "free")
        if user_tier == "free":
            request.max_cost_tolerance = min(request.max_cost_tolerance, 0.01)
        elif user_tier == "premium":
            request.max_cost_tolerance = min(request.max_cost_tolerance, 0.1)

        request.metadata["cost_limit"] = request.max_cost_tolerance
        return request


# MAIN AI AGENT COMBINING BOTH PATTERNS

class MultiModelAIAgent:
    """AI Agent that uses Chain of Responsibility for preprocessing
    and Strategy pattern for model selection"""

    def __init__(self, strategy: ModelSelectionStrategy):
        self.strategy = strategy
        # Build the handler chain
        self.handler_chain = self._build_handler_chain()

    def _build_handler_chain(self) -> RequestHandler:
        """Construct the chain of responsibility"""
        complexity_analyzer = ComplexityAnalysisHandler()
        intent_classifier = IntentClassificationHandler()
        cost_guard = CostGuardHandler()

        # Chain them together
        complexity_analyzer.set_next(intent_classifier).set_next(cost_guard)
        return complexity_analyzer

    def set_strategy(self, strategy: ModelSelectionStrategy):
        """Change the model selection strategy at runtime"""
        self.strategy = strategy

    def process_request(self, request: AgentRequest) -> AgentResponse:
        """Main method to process requests through the full pipeline"""
        start_time = time.time()

        # Step 1: Run through the handler chain for preprocessing
        processed_request = self.handler_chain.handle(request)
        if processed_request is None:
            return AgentResponse(
                content="Request could not be processed",
                model_used=ModelProvider.OPENAI_GPT35,
                cost=0.0,
                latency_ms=0,
                success=False,
                handled_by="Chain processing failed"
            )

        # Step 2: Use strategy to select model
        selected_model = self.strategy.select_model(processed_request)
        prompt = self.strategy.get_prompt_template(processed_request)

        # Step 3: Execute with selected model (simulated)
        response_content = self._execute_model(selected_model, prompt)

        # Calculate metrics
        latency = int((time.time() - start_time) * 1000)
        estimated_tokens = len(prompt.split()) + len(response_content.split())
        model_cost = ModelSelectionStrategy.MODEL_COSTS[selected_model]
        cost = (estimated_tokens / 1000) * model_cost.cost_per_1k_tokens

        return AgentResponse(
            content=response_content,
            model_used=selected_model,
            cost=cost,
            latency_ms=latency,
            success=True,
            handled_by=self.strategy.__class__.__name__,
            metadata={
                "complexity_score": processed_request.complexity_score,
                "requires_reasoning": processed_request.requires_reasoning,
                "requires_internet": processed_request.requires_internet,
                "estimated_tokens": estimated_tokens
            }
        )

    def _execute_model(self, model: ModelProvider, prompt: str) -> str:
        """Simulate model execution (in a real app, this would call actual LLM APIs)"""
        # This is where you'd integrate with OpenAI, Anthropic, etc.
        return f"[Response from {model.value}] Processed: {prompt[:50]}..."


# USAGE EXAMPLE

def main():
    # Create agent with balanced strategy
    agent = MultiModelAIAgent(strategy=BalancedStrategy())

    # Test different types of requests
    test_requests = [
        AgentRequest(
            content="What's 2+2?",
            user_id="user1",
            request_type="simple_query",
            metadata={"user_tier": "free"}
        ),
        AgentRequest(
            content="Explain the reasoning behind quantum entanglement and how it relates to Einstein's theory of relativity",
            user_id="user2",
            request_type="complex_query",
            metadata={"user_tier": "premium"}
        ),
        AgentRequest(
            content="What's the current weather in San Francisco?",
            user_id="user3",
            request_type="realtime_query",
            metadata={"user_tier": "premium"}
        ),
    ]

    print("=== Using Balanced Strategy ===\n")
    for req in test_requests:
        response = agent.process_request(req)
        print(f"Query: {req.content[:60]}...")
        print(f"Model: {response.model_used.value}")
        print(f"Cost: ${response.cost:.4f}")
        print(f"Latency: {response.latency_ms}ms")
        print(f"Complexity: {response.metadata.get('complexity_score', 0):.2f}")
        print(f"Metadata: {response.metadata}")
        print("-" * 80)
        print()

    # Switch to cost-optimized strategy
    print("\n=== Switching to Cost-Optimized Strategy ===\n")
    agent.set_strategy(CostOptimizedStrategy())
    for req in test_requests:
        response = agent.process_request(req)
        print(f"Query: {req.content[:60]}...")
        print(f"Model: {response.model_used.value}")
        print(f"Cost: ${response.cost:.4f}")
        print("-" * 80)
        print()


if __name__ == "__main__":
    main()
This implementation demonstrates the real power of combining these patterns. The Chain of Responsibility handles all the request preprocessing—complexity analysis, intent classification, cost guards—without any single handler needing to know about the others. The Strategy pattern manages model selection and prompting approaches, making it trivial to switch between cost-optimized, quality-optimized, or balanced approaches based on business needs or A/B testing requirements.
Combining Both Patterns: A Powerful Architecture
When you combine the Strategy and Chain of Responsibility patterns, something interesting happens: you get a highly flexible architecture that separates orthogonal concerns. The chain handles the "what" and "when" of processing—what needs to be done to the request and in what order. The strategies handle the "how"—how to accomplish a specific task once it's determined that task is needed. This separation means you can modify either dimension independently without touching the other.
In the multi-model agent example above, notice how easy it would be to add new preprocessing steps to the chain without touching the model selection strategies, or to add new strategies without modifying the chain handlers. This is the Open/Closed Principle in action: open for extension, closed for modification. In practice, this means fewer merge conflicts when multiple developers work on the codebase, fewer regression bugs when adding features, and faster iteration cycles. I've seen teams cut their feature development time in half after refactoring to this architecture, not because the code is faster to write, but because it's dramatically faster to modify and test.
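To make the orthogonality concrete, here is a minimal, self-contained sketch of the two extension points. The `Handler` and strategy shapes below are simplified stand-ins for the article's classes, not the actual implementation: adding a new chain step (`LanguageHandler`) touches neither the existing handlers nor any strategy.

```python
# Minimal sketch: the chain is one extension axis, strategies are the other.
from abc import ABC, abstractmethod

class Handler(ABC):
    """One link in the preprocessing chain."""
    def __init__(self):
        self.next = None

    def set_next(self, handler: "Handler") -> "Handler":
        self.next = handler
        return handler  # returned so calls can be fluently chained

    def handle(self, request: dict) -> dict:
        request = self.process(request)
        return self.next.handle(request) if self.next else request

    @abstractmethod
    def process(self, request: dict) -> dict: ...

class ComplexityHandler(Handler):
    def process(self, request: dict) -> dict:
        # Toy heuristic: longer content scores as more complex
        request["complexity"] = min(len(request["content"]) / 500, 1.0)
        return request

# A new preprocessing step: no existing handler or strategy is modified.
class LanguageHandler(Handler):
    def process(self, request: dict) -> dict:
        request["language"] = "en"  # placeholder for real detection
        return request

chain = ComplexityHandler()
chain.set_next(LanguageHandler())
result = chain.handle({"content": "What is 2+2?"})
```

Each handler only knows about `self.next`, so reordering or extending the chain is a matter of changing how it is wired together, not editing handler internals.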
Common Pitfalls and How to Avoid Them
Let's address the most common mistakes I see when teams implement these patterns, because theory is easy but production reality is messy. The first and most damaging mistake is premature optimization—applying these patterns when your codebase doesn't need them yet. If your AI agent has one way of processing requests and no foreseeable variations, you don't need the Strategy pattern. If you don't have a multi-step processing pipeline, you don't need the Chain of Responsibility. These patterns add complexity—mental overhead, more files, more abstraction—and that complexity only pays for itself when you need the flexibility.
The second major pitfall is creating overly granular handlers or strategies. I reviewed a codebase where the team had created a separate strategy class for every single prompt variation—they had 47 strategy classes. At that point, you've created an unmaintainable mess just in a different shape. The Strategy pattern is for encapsulating significantly different algorithmic approaches, not for storing configuration. If your strategies differ only in parameter values or template strings, you need configuration, not separate classes. Similarly, handlers in a chain should represent meaningful processing steps, not trivial operations like "check if string is empty" that could be a simple validation function.
The third pitfall is creating chains where handlers have hidden dependencies on each other. Each handler should be independent and operate on the request object without making assumptions about what previous handlers did. If Handler B fails when Handler A isn't before it in the chain, you've created fragile coupling that defeats the entire purpose of the pattern. Use the request object to communicate between handlers—if Handler A adds complexity analysis, it should add that data to the request's metadata dictionary where Handler B can check for it if needed, but Handler B should have sensible defaults if that data isn't present.
Performance is another area where these patterns can bite you if you're not careful. Each level of abstraction adds function calls and object allocations. For an AI agent where the LLM call takes 2-3 seconds, this overhead is negligible. But I've seen implementations where the chain has 15+ handlers doing complex processing before the LLM call, adding noticeable latency. Profile your code. If your processing chain takes more than 100ms before calling the LLM, you're either doing too much preprocessing or doing it inefficiently. Consider whether some handlers can run in parallel, or whether you're making unnecessary external calls.
Finally, testing becomes harder if you don't design for it from the start. Each strategy should be independently testable without needing to instantiate the entire agent. Each handler should be testable in isolation. But you also need integration tests for the full chain and tests for strategy switching. The solution is dependency injection: don't have your strategies or handlers create their own dependencies (like API clients), have them receive dependencies through their constructor. This makes mocking trivial and keeps your tests fast and reliable.
The 80/20 Rule: Focus on What Matters Most
Here's the 20% of knowledge about these patterns that will give you 80% of the benefits: understand when NOT to use them. Most codebases don't need these patterns initially. The time to refactor to these patterns is when you feel pain from the lack of them—when adding a new behavior variation requires changing code in multiple places, when your main processing method has grown to 200+ lines, or when you're afraid to modify code because you don't know what else it might break.
When you do implement these patterns, focus on three things: clear interfaces, single responsibility, and explicit dependencies. If each strategy has a clear interface that defines its contract, if each handler does one thing well, and if all dependencies are passed in rather than created internally, you'll get 80% of the maintainability benefits with 20% of the complexity overhead. Don't worry about having the perfect abstraction from the start—it's fine if your first strategy has some rough edges or your initial chain is simpler than it could be. You can refine it as requirements become clearer.
Key Takeaways: Your Action Plan
If you're building or refactoring an AI agent system, here are the five critical actions that will set you up for success:
- Audit your current code for pattern opportunities. Look for methods with multiple conditional branches that switch behavior based on types or categories. Look for long methods that do multiple sequential operations. These are your candidates for Strategy and Chain of Responsibility refactoring. Don't refactor everything at once—pick the area causing the most pain and start there.
- Start with interfaces, not implementations. Before writing any code, define the interface for your strategy or handler. What's the contract? What does it receive, what must it return, what side effects are allowed? A well-designed interface makes implementing concrete classes trivial. A poorly designed interface leads to awkward implementations and frequent interface changes that break everything.
- Make your chains and strategies configurable. Don't hardcode which strategy to use or the order of handlers in the chain. Use configuration files, environment variables, or a builder pattern to construct them. This lets you A/B test different strategies in production, use different processing chains for different customer tiers, or quickly disable problematic handlers without code changes.
- Implement comprehensive logging at pattern boundaries. Log when a request enters and exits each handler. Log which strategy is selected and why. Log the state of the request object at key points. When something goes wrong in production (and it will), these logs are the difference between spending 10 minutes diagnosing the issue versus 10 hours. The pattern boundaries are natural logging points that give you observability into your system's behavior.
- Build test harnesses that make pattern testing easy. Create fixture requests that exercise different paths through your chain. Create test doubles for expensive operations like LLM calls. Write tests that verify strategy selection logic separately from the strategies themselves. The patterns make testing easier by creating clear boundaries and contracts, but only if you actually write the tests. A well-tested pattern-based system is a joy to modify; an untested one is as scary as a monolith.
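The configurability point from the action plan can be sketched with a registry plus a config dict. Everything here (handler names, strategy names, the config shape) is illustrative; in production the `config` dict would come from a YAML file, environment variables, or an experiment service.

```python
# Config-driven construction: which handlers run, in what order, and which
# strategy answers are all decided by data, not code.
def strip_handler(req):
    req["content"] = req["content"].strip()
    return req

def tag_handler(req):
    req["tagged"] = True
    return req

HANDLERS = {"strip": strip_handler, "tag": tag_handler}
STRATEGIES = {
    "cheap": lambda req: "small-model answer",
    "quality": lambda req: "large-model answer",
}

def build_pipeline(config: dict):
    handlers = [HANDLERS[name] for name in config["chain"]]
    strategy = STRATEGIES[config["strategy"]]

    def run(request: dict) -> str:
        for handler in handlers:
            request = handler(request)
        return strategy(request)

    return run

config = {"chain": ["strip", "tag"], "strategy": "cheap"}
pipeline = build_pipeline(config)
answer = pipeline({"content": "  hello  "})
```

Swapping `"strategy": "cheap"` for `"quality"`, or removing a handler name from `"chain"`, changes production behavior without a code deployment.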
Conclusion: Patterns Are Tools, Not Rules
Let me leave you with this: design patterns are tools in your toolbox, not commandments carved in stone. The Strategy and Chain of Responsibility patterns solve real problems that emerge in real codebases, but they also introduce real complexity. Use them when the benefits justify the costs, not because someone told you they're "best practices." I've seen simple chatbots overengineered into unmaintainable messes because developers were pattern-happy, and I've seen complex production agents that were elegant and maintainable because patterns were applied judiciously.
The AI agent ecosystem is still young and evolving rapidly. LLM capabilities change every few months, new providers emerge, costs fluctuate, and user expectations shift. The codebases that survive this volatility are those built with flexibility and maintainability as first-class concerns. The Strategy and Chain of Responsibility patterns give you that flexibility—the ability to swap behaviors, reorder processing steps, and extend functionality without rewriting everything. They make your code resilient to change, which, in this field, is the most valuable property code can have. Start simple, refactor when you feel pain, and always prioritize clarity over cleverness. Your future self—and your teammates—will thank you.
References and Further Reading
- Gamma, E., Helm, R., Johnson, R., & Vlissides, J. (1994). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley. [The original source for both patterns]
- Martin, R. C. (2017). Clean Architecture: A Craftsman's Guide to Software Structure and Design. Prentice Hall. [Covers the Open/Closed Principle and how patterns support it]
- Fowler, M. (1999). Refactoring: Improving the Design of Existing Code. Addison-Wesley. [Essential reading for knowing when to refactor to patterns]
- OpenAI API Documentation: https://platform.openai.com/docs [For implementing the LLM integration layer]
- Anthropic Claude API Documentation: https://docs.anthropic.com/claude/reference [Alternative LLM provider integration]
- "The Psychology of Design Patterns" - Derek Banas on YouTube [Excellent visual explanations of when patterns are useful vs. overkill]