Paul Serban | Software Engineer

CAP theorem analogies for AI engineers: a complete mapping from distributed systems to AI architecture
Every CAP concept translated into its counterpart in model behaviour, RAG design, and agentic systems

Discover how the CAP theorem translates to AI engineering. Learn to balance consistency, availability, and partition tolerance in LLMs, RAG, and agentic systems.

CQRS in AI systems: why separating reads from writes is the mental model prompt engineers have been missing
How a battle-tested software architecture pattern maps surprisingly cleanly onto inference pipelines, prompt design, and AI system boundaries

Explore how CQRS analogies apply to AI engineering: separating retrieval from mutation logic helps you build more reliable, observable AI systems.

LLM Integrations in Practice: Architecture Patterns, Pitfalls, and Anti-Patterns
How to integrate large language models into real systems without creating fragile, expensive messes

Integrating LLMs into production systems is an engineering problem, not a demo exercise. This post covers proven integration patterns, common mistakes, and what not to build with LLMs.

Project Idea: Customer Support Agent That Remembers Policies Without Leaking Data
A realistic blueprint for memory scoping, redaction, and retrieval with auditability.

Create a production-minded support agent using short-term session memory and long-term policy RAG, with scoped retrieval, PII redaction, and audit logs to prevent data leaks.

Project Idea: Personal Knowledge OS with Long-Term Memory + RAG
Turn notes, PDFs, bookmarks, and emails into a searchable, citeable assistant you control.

Build a real-life AI project that ingests your documents into a RAG index with long-term memory, delivering grounded answers with citations and strong privacy controls.

Quickstart: Build a Memory-Enabled AI Assistant with RAG in a Weekend
A minimal architecture that scales: ingestion, retrieval, conversation state, and observability.

Follow a practical quickstart to create a memory-enabled AI assistant using RAG, including ingestion, indexing, conversation state, caching, and basic monitoring.

RAG 101 for AI Engineers: From Naive Retrieval to Production-Grade Pipelines
Chunking, embeddings, reranking, citations, evaluation, and failure modes explained simply.

A step-by-step guide to building a reliable RAG system, covering chunking, embeddings, retrieval, reranking, context windows, and evaluation tactics for better answers.

RAG in the Real World: Handling Fresh Data, Conflicts, and Source Trust
What breaks in production and how to fix it with metadata, ranking, and policy.

Discover how to operate RAG systems with changing documents, conflicting sources, and varying trust levels using metadata filters, ranking, citations, and governance.

Understanding Amazon Bedrock Fundamentals: A Complete Guide for Developers
Master the core concepts, architecture patterns, and essential components that power Amazon Bedrock

Explore Amazon Bedrock fundamentals including architecture, agent lifecycle management, and core components to build robust AI-driven applications efficiently.

#RAG
