Agentic Chicago • Learn • Series Part 2 of 3

Building a General AI Agent, Part 2: Memory Retrieval Architecture

In Part 1 we covered startup context. Part 2 covers the other half: where deeper knowledge should live and how the agent should retrieve only what is relevant for the task at hand.

Practical memory model

  • Layer 1: daily operational notes for raw event history.
  • Layer 2: durable curated memory for reusable long-term context.
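
To make the two layers concrete, here is a minimal sketch in Python. The file layout (a `daily/` folder with one note per date, plus a single `durable.md`) and the class and method names are assumptions for illustration; the article does not prescribe a storage format.

```python
from datetime import date
from pathlib import Path


class TwoLayerMemory:
    """Hypothetical layout: daily/<date>.md for raw notes, durable.md for curated facts."""

    def __init__(self, root: Path) -> None:
        self.daily_dir = root / "daily"
        self.durable_path = root / "durable.md"
        self.daily_dir.mkdir(parents=True, exist_ok=True)

    def log_event(self, text: str, day: date) -> Path:
        """Layer 1: append a raw event to that day's note."""
        path = self.daily_dir / f"{day.isoformat()}.md"
        with path.open("a", encoding="utf-8") as f:
            f.write(f"- {text}\n")
        return path

    def read_durable(self) -> list[str]:
        """Return the curated facts currently stored in layer 2."""
        if not self.durable_path.exists():
            return []
        lines = self.durable_path.read_text(encoding="utf-8").splitlines()
        return [line[2:] for line in lines if line.startswith("- ")]

    def promote(self, fact: str) -> bool:
        """Layer 2: add a curated fact unless it is already recorded."""
        if fact in self.read_durable():
            return False
        with self.durable_path.open("a", encoding="utf-8") as f:
            f.write(f"- {fact}\n")
        return True
```

In this sketch everything goes into the daily note as it happens, and only facts worth reusing are promoted to the durable file, so the durable layer stays small enough to search and cite.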

Skill hierarchy is memory architecture too

Startup context should reference high-level router skills. Those routers then load the domain routers and specialist skills a task needs. This staged loading keeps irrelevant context out of the active task.
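
Staged loading can be sketched as a walk down a skill tree. The `Skill` type, the keyword-based routing, and the skill names below are all hypothetical; a real agent might let the model choose the next skill instead of matching keywords.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    instructions: str                     # text added to context when loaded
    keywords: tuple[str, ...] = ()        # hypothetical routing hint
    children: list["Skill"] = field(default_factory=list)


def load_path(root: Skill, task: str) -> list[Skill]:
    """Walk router -> domain router -> specialist, loading one skill per level."""
    loaded = [root]
    node = root
    words = set(task.lower().split())
    while node.children:
        match = next((c for c in node.children if words & set(c.keywords)), None)
        if match is None:
            break                         # no deeper skill applies; stop here
        loaded.append(match)
        node = match
    return loaded


root = Skill("root-router", "Pick a domain.", children=[
    Skill("finance-router", "Pick a finance skill.", ("invoice", "tax"), [
        Skill("invoice-specialist", "How to draft invoices.", ("invoice",)),
        Skill("tax-specialist", "How to estimate tax.", ("tax",)),
    ]),
    Skill("devops-router", "Pick a devops skill.", ("deploy", "server")),
])

print([s.name for s in load_path(root, "Draft an invoice for March")])
# → ['root-router', 'finance-router', 'invoice-specialist']
```

Only the three skills on the matched path enter context; the tax and devops skills are never loaded for this task.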

Long-term retrieval stack

Use hybrid retrieval, because neither keyword search nor embedding search is reliable on its own:

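  1. SQLite full-text search (FTS5) with BM25 ranking, for exact-token matches.
  2. Embedding search, for semantic matches.
  3. Hybrid scoring that merges both result sets, combining the precision of exact matching with the recall of semantic matching.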
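
The first item in the stack can be sketched with Python's built-in `sqlite3` module, assuming the SQLite build includes the FTS5 extension (most do, but it is a compile-time option). The table name, column names, and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table; `path` is stored for citation but not indexed for search.
conn.execute("CREATE VIRTUAL TABLE memory USING fts5(path UNINDEXED, body)")
conn.executemany(
    "INSERT INTO memory (path, body) VALUES (?, ?)",
    [
        ("daily/2025-01-15.md", "Deploy failed because API_KEY was missing"),
        ("durable/deploy.md", "Deploys need API_KEY set in the environment"),
        ("durable/billing.md", "Invoices are sent on the first of the month"),
    ],
)


def bm25_search(query: str, limit: int = 5) -> list[tuple[str, float]]:
    """Return (path, score) pairs, best match first.

    FTS5's bm25() returns lower values for better matches, so ascending
    order puts the best hit at the top.
    """
    # Quote the query so it is treated as a phrase, not as FTS5 query syntax.
    phrase = '"' + query.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT path, bm25(memory) FROM memory WHERE memory MATCH ? "
        "ORDER BY bm25(memory) LIMIT ?",
        (phrase, limit),
    )
    return rows.fetchall()


for path, score in bm25_search("API_KEY"):
    print(path, round(score, 3))
```

The search for `API_KEY` returns the two deploy notes and skips the billing note, which is the exact-token behavior this layer is meant to provide.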

Important: BM25 is not an embedding method. BM25 ranks by literal term overlap, while embeddings rank by similarity of meaning, so they solve different retrieval problems.
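
Because BM25 scores and embedding similarities live on different scales, one common way to combine them is to fuse ranks instead of raw scores. The sketch below uses reciprocal rank fusion (RRF); the article does not say which hybrid scoring formula to use, so treat RRF and the constant `k = 60` as assumptions. The two hit lists are hypothetical results, not output from a real index.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked lists of document ids; each list is ordered best-first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results for the same query from the two retrievers.
bm25_hits = ["daily/2025-01-15.md", "durable/deploy.md"]
embedding_hits = ["durable/deploy.md", "durable/secrets.md", "daily/2025-01-15.md"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
print([doc_id for doc_id, _ in fused])
# → ['durable/deploy.md', 'daily/2025-01-15.md', 'durable/secrets.md']
```

A document that ranks well in both lists rises to the top, while a document found by only one retriever still appears, just lower down.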

Retrieval flow

  1. Load startup context and guardrails.
  2. Classify the task's domain and intent.
  3. Retrieve only the domain-relevant slice of memory.
  4. Ground the output in evidence cited by source path.
  5. Write back only meaningful deltas.
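
Steps 2 through 5 of the flow above can be sketched as a few small functions. Everything here is a stand-in: the in-memory `MEMORY` list replaces the real retrieval stack, the keyword classifier replaces whatever the agent actually uses to detect domain, and step 1 (startup context and guardrails) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Note:
    path: str      # source path used for citation
    domain: str
    text: str


# Hypothetical in-memory store standing in for the real retrieval stack.
MEMORY = [
    Note("durable/deploy.md", "devops", "Deploys need API_KEY set in the environment"),
    Note("durable/billing.md", "finance", "Invoices are sent on the first of the month"),
]
DOMAIN_KEYWORDS = {"devops": {"deploy", "server"}, "finance": {"invoice", "tax"}}


def classify(task: str) -> str:
    """Step 2: placeholder keyword classifier."""
    words = set(task.lower().split())
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if words & keywords:
            return domain
    return "general"


def retrieve(domain: str) -> list[Note]:
    """Step 3: return only the notes for this domain."""
    return [n for n in MEMORY if n.domain == domain]


def answer(task: str) -> dict:
    """Steps 2-4: classify, retrieve the slice, and attach source paths."""
    domain = classify(task)
    notes = retrieve(domain)
    return {
        "domain": domain,
        "context": [n.text for n in notes],
        "citations": [n.path for n in notes],   # step 4: source-path evidence
    }


def write_back(fact: str, domain: str, path: str) -> bool:
    """Step 5: store a fact only if it adds something new."""
    if any(n.text == fact for n in MEMORY):
        return False
    MEMORY.append(Note(path, domain, fact))
    return True
```

Calling `answer("Why did the deploy fail?")` in this sketch yields the devops note and its path, and nothing from billing. `write_back` refuses an exact duplicate, which is the simplest possible reading of "meaningful delta"; a real policy would likely be stricter.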

Bare-bones checklist

  • Two-layer memory model (daily plus durable).
  • Delegated domain knowledge files.
  • Hierarchical skills (router to specialist).
  • Hybrid retrieval (BM25 plus embeddings).
  • Citation and writeback policy.

Want this memory architecture implemented?

If you want this wired end-to-end in your stack, book a strategy session.

Previous: Part 1 • Next: Part 3