Agentic Chicago • Learn • Series Part 2 of 3

Building a General AI Agent, Part 2: Memory Retrieval Architecture

In Part 1 we covered startup context. Part 2 covers the other half: where deeper knowledge should live and how the agent should retrieve only what is relevant for the task at hand.

Practical memory model

Layer 1: daily operational notes for raw event history.
Layer 2: durable curated memory for reusable long-term context.
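
As a rough illustration of the two layers, here is a minimal sketch. The MemoryStore class, its method names, the example date, and the note text are all hypothetical; the only idea taken from the model above is that raw daily notes and curated durable memory are stored separately.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class MemoryStore:
    """Hypothetical two-layer store: raw daily notes plus curated durable memory."""

    daily: dict[date, list[str]] = field(default_factory=dict)  # Layer 1
    durable: dict[str, str] = field(default_factory=dict)       # Layer 2

    def log_event(self, day: date, note: str) -> None:
        # Layer 1 is append-only: record what happened, unedited.
        self.daily.setdefault(day, []).append(note)

    def promote(self, key: str, summary: str) -> None:
        # Layer 2 keeps one curated entry per key; promoting again overwrites it.
        self.durable[key] = summary


# Illustrative data only.
store = MemoryStore()
day = date(2025, 1, 6)
store.log_event(day, "Deploy failed: missing DATABASE_URL in staging")
store.log_event(day, "Fixed by adding DATABASE_URL to staging secrets")
store.promote("staging-deploys", "Staging deploys require DATABASE_URL in secrets.")
```

In this sketch the daily layer keeps both raw events, while the durable layer keeps only the one reusable lesson distilled from them.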

Skill hierarchy is memory architecture too

Startup context should reference only high-level router skills. Each router then loads the domain router it needs, and the domain router loads the specialist skill for the task. This staged loading keeps irrelevant context out of the active task.
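
One way to picture staged loading is a small tree walk. This is a sketch under assumptions: the SKILLS table, the skill names, and the load_path helper are invented for illustration and do not come from any particular agent framework.

```python
# Hypothetical skill tree: routers name their children, leaves are specialists.
SKILLS = {
    "root": {"kind": "router",
             "children": {"finance": "finance-router", "devops": "devops-router"}},
    "finance-router": {"kind": "router", "children": {"invoice": "invoice-skill"}},
    "devops-router": {"kind": "router", "children": {"deploy": "deploy-skill"}},
    "invoice-skill": {"kind": "specialist", "body": "Steps for reconciling invoices"},
    "deploy-skill": {"kind": "specialist", "body": "Steps for deploying to staging"},
}


def load_path(route: list[str]) -> list[str]:
    """Follow route keys from the root and return only the skills loaded on the way."""
    loaded = ["root"]
    node = SKILLS["root"]
    for key in route:
        name = node["children"][key]  # pick one child; siblings stay unloaded
        loaded.append(name)
        node = SKILLS[name]
    return loaded


loaded = load_path(["devops", "deploy"])
print(loaded)
```

For a deploy task, only the root router, the devops router, and the deploy specialist enter context; the finance branch is never read.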

Long-term retrieval stack

Use hybrid retrieval, because neither lexical nor semantic search is reliable on its own:

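SQLite full-text search (FTS5) with BM25 ranking for exact tokens such as identifiers and error codes.
Embedding search for semantic matches where the wording differs.
Hybrid scoring that merges both result lists to balance precision and recall.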

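Important: BM25 is not an embedding method. BM25 ranks documents by term overlap with the query, while embeddings rank by similarity of meaning, so they solve different retrieval problems.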
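
The stack above can be sketched in a few lines. Several things here are assumptions: the DOCS corpus and its paths are made up, the embed function is a toy stand-in (a hand-written word-to-concept map, not a real embedding model), and reciprocal rank fusion is just one common way to do hybrid scoring, not necessarily the one intended here. The sketch also assumes your SQLite build has FTS5 enabled.

```python
import math
import sqlite3
from collections import Counter

# Hypothetical memory files and contents.
DOCS = {
    "notes/deploy.md": "staging deploy failed with error E1234 missing database url",
    "notes/billing.md": "customer invoice totals did not match the ledger",
    "notes/release.md": "shipping a new build to the staging environment",
}

# Toy stand-in for an embedding model: maps words onto hand-picked concepts.
CONCEPTS = {"deploy": "release", "shipping": "release", "build": "release",
            "rollout": "release", "invoice": "billing", "ledger": "billing"}


def bm25_ranking(query: str) -> list[str]:
    """Lexical ranking via SQLite FTS5; bm25() is lower for better matches."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE VIRTUAL TABLE mem USING fts5(path UNINDEXED, body)")
    con.executemany("INSERT INTO mem (path, body) VALUES (?, ?)", list(DOCS.items()))
    terms = " OR ".join(f'"{t}"' for t in query.split())  # match any query token
    rows = con.execute(
        "SELECT path FROM mem WHERE mem MATCH ? ORDER BY bm25(mem)", (terms,)
    ).fetchall()
    con.close()
    return [row[0] for row in rows]


def embed(text: str) -> Counter:
    return Counter(CONCEPTS[w] for w in text.lower().split() if w in CONCEPTS)


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[key] * b[key] for key in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def semantic_ranking(query: str) -> list[str]:
    q = embed(query)
    scored = [(cosine(q, embed(body)), path) for path, body in DOCS.items()]
    return [path for score, path in sorted(scored, reverse=True) if score > 0]


def hybrid(query: str, k: int = 60) -> list[str]:
    """Reciprocal rank fusion: sum 1 / (k + rank) across both rankings."""
    fused: dict[str, float] = {}
    for ranking in (bm25_ranking(query), semantic_ranking(query)):
        for rank, path in enumerate(ranking, start=1):
            fused[path] = fused.get(path, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


print(hybrid("E1234 rollout"))
```

In this toy corpus the exact token E1234 is found only by the BM25 side, while "rollout" appears in no document and is matched only by the concept-based side. Fusing the two rankings puts the deploy note first because both retrievers return it.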

Retrieval flow

Begin with startup context and guardrails.
Classify task domain and intent.
Retrieve only the domain-relevant memory slice.
Ground output in evidence cited by source path.
Write back only meaningful deltas.
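
The flow above can be sketched end to end as follows. Everything named here is hypothetical: the MEMORY entries, the keyword-based classify function (a real agent would more likely use a model or a router skill), and the exact-duplicate check in write_back, which is only a crude proxy for deciding what counts as a meaningful delta.

```python
# Hypothetical memory entries, each tagged with a domain and its source file.
MEMORY = [
    {"domain": "devops", "path": "memory/devops.md",
     "text": "Staging deploys require DATABASE_URL."},
    {"domain": "billing", "path": "memory/billing.md",
     "text": "Invoices are reconciled weekly."},
]

KEYWORDS = {
    "devops": {"deploy", "staging", "build"},
    "billing": {"invoice", "ledger"},
}


def classify(task: str) -> str:
    """Crude keyword classifier standing in for real domain and intent detection."""
    words = set(task.lower().split())
    best = max(KEYWORDS, key=lambda d: len(KEYWORDS[d] & words))
    return best if KEYWORDS[best] & words else "unknown"


def retrieve(domain: str) -> list[dict]:
    # Return only the slice of memory that belongs to the classified domain.
    return [m for m in MEMORY if m["domain"] == domain]


def answer(task: str) -> dict:
    domain = classify(task)
    evidence = retrieve(domain)
    return {
        "domain": domain,
        "claims": [m["text"] for m in evidence],
        "sources": [m["path"] for m in evidence],  # cite where each claim came from
    }


def write_back(known: set[str], candidate: str) -> bool:
    """Store a note only if it is not already known."""
    if candidate in known:
        return False
    known.add(candidate)
    return True


result = answer("why did the staging deploy fail")
known = {m["text"] for m in MEMORY}
```

Here the deploy question pulls in only the devops entry along with its source path, and write_back refuses to store a note the agent already has.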

Bare-bones checklist

Two-layer memory model (daily plus durable).
Delegated domain knowledge files.
Hierarchical skills (router to specialist).
Hybrid retrieval (BM25 plus embeddings).
Citation and writeback policy.

Want this memory architecture implemented?

If you want this wired end-to-end in your stack, book a strategy session.
