Agentic Chicago • Learn • Series Part 2 of 3
Building a General AI Agent, Part 2: Memory Retrieval Architecture
In Part 1 we covered startup context. Part 2 covers the other half: where deeper knowledge should live and how the agent should retrieve only what is relevant for the task at hand.
Practical memory model
- Layer 1: daily operational notes for raw event history.
- Layer 2: durable curated memory for reusable long-term context.
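The two layers can be sketched as a small file-backed store. This is a minimal illustration, assuming plain files on disk; the class name `TwoLayerMemory` and the layout (`daily/YYYY-MM-DD.md` plus a single `durable.md`) are hypothetical choices, not something the series prescribes.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import List, Optional


class TwoLayerMemory:
    """Hypothetical layout: <root>/daily/YYYY-MM-DD.md and <root>/durable.md."""

    def __init__(self, root: Path) -> None:
        self.daily_dir = root / "daily"
        self.durable_path = root / "durable.md"
        self.daily_dir.mkdir(parents=True, exist_ok=True)

    def log_event(self, text: str, day: Optional[date] = None) -> Path:
        """Layer 1: append a raw event to that day's operational note."""
        day = day or date.today()
        path = self.daily_dir / f"{day.isoformat()}.md"
        with path.open("a", encoding="utf-8") as f:
            f.write(f"- {text}\n")
        return path

    def durable_facts(self) -> List[str]:
        """Read the curated facts currently held in Layer 2."""
        if not self.durable_path.exists():
            return []
        lines = self.durable_path.read_text(encoding="utf-8").splitlines()
        return [line[2:] for line in lines if line.startswith("- ")]

    def promote(self, fact: str) -> bool:
        """Layer 2: store a curated, reusable fact; skip exact duplicates."""
        if fact in self.durable_facts():
            return False
        with self.durable_path.open("a", encoding="utf-8") as f:
            f.write(f"- {fact}\n")
        return True


# Usage: raw events go to the daily layer, distilled facts to the durable one.
root = Path(tempfile.mkdtemp())
memory = TwoLayerMemory(root)
memory.log_event("Deploy failed: missing API_KEY", day=date(2024, 1, 15))
memory.promote("Deploys require API_KEY in the environment")
```

The split keeps noisy event history cheap to append, while the durable file stays small enough to search or load selectively.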
Skill hierarchy is memory architecture too
Startup context should reference only high-level router skills. Each router then loads the domain routers and specialist skills that the current task calls for. This staged loading keeps irrelevant context out of the active task.
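One way to picture staged loading is as a tree walk in which only the nodes on the chosen route contribute instructions. The sketch below assumes skills form a simple tree; the `Skill` type, the `load_path` helper, and the example skill names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Skill:
    name: str
    instructions: str
    children: Dict[str, "Skill"] = field(default_factory=dict)


def load_path(root: Skill, route: List[str]) -> List[str]:
    """Walk from the top-level router down one route and collect only the
    instructions on that path; sibling branches are never loaded."""
    loaded = [root.instructions]
    node = root
    for key in route:
        node = node.children[key]
        loaded.append(node.instructions)
    return loaded


# Hypothetical hierarchy: top router -> domain routers -> specialist skills.
root_router = Skill(
    "root",
    "Route each task to one domain.",
    {
        "finance": Skill(
            "finance",
            "Route finance tasks to a specialist.",
            {"invoicing": Skill("invoicing", "Draft and check invoices.")},
        ),
        "devops": Skill(
            "devops",
            "Route infrastructure tasks to a specialist.",
            {"deploy": Skill("deploy", "Run and verify deployments.")},
        ),
    },
)

context = load_path(root_router, ["finance", "invoicing"])
```

For an invoicing task, `context` holds three instruction strings, and nothing from the devops branch enters the active context.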
Long-term retrieval stack
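Use hybrid retrieval so that neither exact-match nor meaning-based queries fall through:
- SQLite full-text search (FTS5) with BM25 ranking for exact tokens such as identifiers, file paths, and error codes.
- Embedding search for semantic matches, where the query and the stored note use different words.
- Hybrid scoring that merges both ranked lists, combining the precision of keyword search with the recall of embedding search.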
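Important: BM25 is not an embedding method. BM25 ranks documents by term overlap with the query, while embeddings rank by vector similarity, so they solve different retrieval problems.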
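The stack above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a reference implementation: it assumes your Python build ships SQLite with the FTS5 extension, it uses hand-made 3-dimensional vectors in place of a real embedding model, and it picks reciprocal rank fusion (RRF) as the hybrid scoring step, which is one common option rather than anything the article specifies. The document paths and helper names are hypothetical.

```python
import math
import re
import sqlite3

# Hypothetical memory entries, keyed by source path.
DOCS = {
    "notes/deploy.md": "deploy failed with error E1234 missing api key",
    "notes/billing.md": "customer invoice payment overdue reminder sent",
    "notes/auth.md": "login token expired user must sign in again",
}

# Hand-made vectors standing in for a real embedding model's output.
VECTORS = {
    "notes/deploy.md": [0.9, 0.1, 0.0],
    "notes/billing.md": [0.0, 0.9, 0.2],
    "notes/auth.md": [0.1, 0.0, 0.9],
}


def build_index(docs):
    """Create an in-memory FTS5 table holding each note's path and body."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE mem USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO mem(path, body) VALUES (?, ?)", list(docs.items()))
    return conn


def keyword_search(conn, query, limit=5):
    """BM25-ranked full-text search over the note bodies."""
    tokens = re.findall(r"\w+", query)
    if not tokens:
        return []
    # Quote each token so it is treated as a literal term, not query syntax.
    match = " OR ".join(f'"{t}"' for t in tokens)
    # FTS5's bm25() returns lower values for better matches, so ascending
    # order puts the best match first.
    rows = conn.execute(
        "SELECT path FROM mem WHERE mem MATCH ? ORDER BY bm25(mem) LIMIT ?",
        (match, limit),
    )
    return [path for (path,) in rows]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(vectors, query_vec, limit=5):
    """Rank paths by cosine similarity to the query vector."""
    ranked = sorted(vectors, key=lambda p: cosine(vectors[p], query_vec), reverse=True)
    return ranked[:limit]


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per item."""
    scores = {}
    for ranking in rankings:
        for rank, path in enumerate(ranking, start=1):
            scores[path] = scores.get(path, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


conn = build_index(DOCS)
keyword_hits = keyword_search(conn, "E1234")             # exact token match
semantic_hits = vector_search(VECTORS, [0.1, 0.0, 0.9])  # query vector near auth.md
fused = rrf([keyword_hits, semantic_hits])
```

Here the keyword side finds only the note containing the literal token `E1234`, the vector side ranks the auth note first, and the fused list keeps both near the top. In practice the query vector would come from the same embedding model used to index the notes.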
Retrieval flow
- Start with startup context and guardrails.
- Classify task domain and intent.
- Retrieve only the domain-relevant memory slice.
- Ground the output in evidence cited by source path.
- Write back only meaningful deltas.
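The five steps can be strung together as a short pipeline. The sketch below is illustrative only: the keyword-based classifier handles domain but not intent, "meaningful delta" is reduced to "not already stored", and every name in it (`STARTUP_CONTEXT`, `DOMAIN_KEYWORDS`, `classify`, `build_prompt`, `write_back`) is a hypothetical placeholder.

```python
import re
from typing import Dict, List

# Step 1: startup context and guardrails (hypothetical wording).
STARTUP_CONTEXT = "You are a general agent. Cite sources. Never invent facts."

# Hypothetical keyword rules; a real classifier would likely be a model call.
DOMAIN_KEYWORDS = {
    "billing": {"invoice", "payment"},
    "devops": {"deploy", "server"},
}


def classify(task: str) -> str:
    """Step 2: pick a task domain from simple keyword rules."""
    words = set(re.findall(r"\w+", task.lower()))
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if words & keywords:
            return domain
    return "general"


def build_prompt(task: str, memory: Dict[str, List[dict]]) -> str:
    """Steps 3 and 4: pull only the matching slice and tag it with its path."""
    domain = classify(task)
    evidence = [f"[{e['path']}] {e['text']}" for e in memory.get(domain, [])]
    return "\n".join([STARTUP_CONTEXT, f"Domain: {domain}", *evidence, f"Task: {task}"])


def write_back(memory: Dict[str, List[dict]], domain: str, path: str, text: str) -> bool:
    """Step 5: store a fact only if it is not already present (assumed rule)."""
    entries = memory.setdefault(domain, [])
    if any(e["text"] == text for e in entries):
        return False
    entries.append({"path": path, "text": text})
    return True


# Each entry keeps its source path so the output can cite it.
memory = {
    "billing": [{"path": "memory/billing.md", "text": "Invoices are due in 30 days."}],
    "devops": [{"path": "memory/devops.md", "text": "Deploys require API_KEY."}],
}

prompt = build_prompt("When is the invoice due?", memory)
```

The resulting prompt carries the billing evidence with its source path and nothing from the devops slice, which is the point of classifying before retrieving.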
Bare-bones checklist
- Two-layer memory model (daily plus durable).
- Delegated domain knowledge files.
- Hierarchical skills (router to specialist).
- Hybrid retrieval (BM25 plus embeddings).
- Citation and writeback policy.
Want this memory architecture implemented?
If you want this wired end-to-end in your stack, book a strategy session.