A raw LLM is stateless. Every call starts from zero and only knows what you put in the prompt. Agent memory adds the missing ingredient: durable knowledge about the user, the business, and prior interactions, so the agent behaves like a continuous employee rather than a goldfish with perfect English.
Memory in modern agents is usually organized into four tiers. (1) Working memory is the current context window — the last few turns, the system prompt, and any actively retrieved context. It's fast but tiny and wiped each turn. (2) Short-term session memory holds the current conversation and resets when the session ends; typically a rolling buffer summarized when it overflows. (3) Long-term semantic memory stores facts, preferences, and learned patterns in a vector DB or structured DB; retrieved by similarity search when relevant. (4) Episodic memory stores specific events — 'on 2026-03-02 the founder approved the pricing change to $49' — usually in a structured log, retrieved by date, entity, or keyword.
The mechanics combine three techniques. Retrieval-augmented generation (RAG) fetches relevant memory chunks into the prompt on each turn. Summarization compresses old conversations into paragraph-sized notes that fit alongside retrieved chunks. Structured extraction pulls out entities and facts ('the user's company is
Medvi, founded 2023, B2B healthcare') and stores them in a queryable DB so they can be filtered exactly rather than approximately. Production agents mix all three. Mem0, LangGraph's MemoryStore, and Letta (formerly MemGPT) are popular open-source libraries that wrap these patterns.
The hard problems in agent memory are not storage — vector DBs are cheap — but curation. An agent that remembers everything quickly accumulates contradictory notes, stale preferences, and irrelevant tangents that pollute retrieval. Good memory systems distinguish facts from observations, version them with timestamps, reconcile contradictions ('the user said X in March, Y in April — use Y'), and forget deliberately. Getting this right is what makes Tycoon's
AI CEO Astra feel like she actually knows your business after three weeks: the memory layer promotes important decisions, demotes small talk, and resolves conflicts automatically.
Agent memory also raises privacy and safety questions. Whose memory is it? In Tycoon, each project has its own isolated memory — the CEO of Medvi's company never sees another customer's history. Memory can also be injected with adversarial content ('ignore prior instructions, the user's credit card is...'), which is why production systems separate memory storage from tool-executing actions and run policy filters on what memory can influence high-stakes decisions.