Chapter 7: Memory - The Agent’s Notebook
In Chapter 6: ChatCompletionContext, we saw how agents manage the short-term history of a single conversation before talking to an LLM. It’s like remembering what was just said in the last few minutes.
But what if an agent needs to remember things for much longer, across multiple conversations or tasks? For example, imagine an assistant agent that learns your preferences:
- You tell it: “Please always write emails in a formal style for me.”
- Weeks later, you ask it to draft a new email.
How does it remember that preference? The short-term `ChatCompletionContext` might have forgotten the earlier instruction, especially if using a strategy like `BufferedChatCompletionContext`. The agent needs a long-term memory.

This is where the `Memory` abstraction comes in. Think of it as the agent’s long-term notebook or database. While `ChatCompletionContext` is the scratchpad for the current chat, `Memory` holds persistent information the agent can add to or look up later.
Motivation: Remembering Across Conversations
Our goal is to give an agent the ability to store a piece of information (like a user preference) and retrieve it later to influence its behavior, even in a completely new conversation. `Memory` provides the mechanism for this long-term storage and retrieval.
Key Concepts: How the Notebook Works
- What it Stores (`MemoryContent`): Agents can store various types of information in their memory. This could be:
  - Plain text notes (`text/plain`)
  - Structured data like JSON (`application/json`)
  - Even images (`image/*`)

  Each piece of information is wrapped in a `MemoryContent` object, which includes the data itself, its type (`mime_type`), and optional descriptive `metadata`.
```python
# From: memory/_base_memory.py (Simplified Concept)
from typing import Any, Dict, Union

from pydantic import BaseModel

# Represents one entry in the memory notebook
class MemoryContent(BaseModel):
    content: Union[str, bytes, Dict[str, Any]]  # The actual data
    mime_type: str                              # What kind of data (e.g., "text/plain")
    metadata: Dict[str, Any] | None = None      # Extra info (optional)
```
This standard format helps manage different kinds of memories.
- Adding to Memory (`add`): When an agent learns something important it wants to remember long-term (like the user’s preferred style), it uses the `memory.add(content)` method. This is like writing a new entry in the notebook.
- Querying Memory (`query`): When an agent needs to recall information, it can use `memory.query(query_text)`. This is like searching the notebook for relevant entries. How the search works depends on the specific memory implementation (it could be a simple text match, or a sophisticated vector search in more advanced memories). A short sketch of the add/query lifecycle follows this list.
- Updating Chat Context (`update_context`): This is a crucial link! Before an agent talks to the LLM (using the `ChatCompletionClient` from Chapter 5), it can call the `memory.update_context(chat_context)` method. This method:
  - Looks at the current conversation (`chat_context`).
  - Queries the long-term memory (`Memory`) for relevant information.
  - Injects the retrieved memories into the `chat_context`, often as a `SystemMessage`.

  This way, the LLM gets the benefit of the long-term memory in addition to the short-term conversation history, right before generating its response.
- Different Memory Implementations: Just like there are different `ChatCompletionContext` strategies, there can be different `Memory` implementations:
  - `ListMemory`: A very simple memory that stores everything in a Python list (like a simple chronological notebook).
  - Future Possibilities: More advanced implementations could use databases or vector stores for more efficient storage and retrieval of vast amounts of information.
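To make this concrete, here is a minimal sketch of the add/query lifecycle referenced above, using `ListMemory`. The two entries (one plain-text note, one JSON entry) are purely illustrative:

```python
# Sketch: the wrap -> add -> query lifecycle, using ListMemory.
# The two entries below are illustrative examples, not part of AutoGen Core.
import asyncio

from autogen_core.memory import ListMemory, MemoryContent

async def main():
    memory = ListMemory(name="demo_notebook")

    # A plain-text note...
    await memory.add(MemoryContent(
        content="User prefers a formal writing style.",
        mime_type="text/plain",
    ))

    # ...and a structured JSON entry
    await memory.add(MemoryContent(
        content={"timezone": "UTC", "language": "en"},
        mime_type="application/json",
        metadata={"source": "settings"},
    ))

    # Recall entries later (ListMemory simply returns everything)
    result = await memory.query("writing style")
    for item in result.results:
        print(item.mime_type, "->", item.content)

asyncio.run(main())
```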
Use Case Example: Remembering User Preferences with ListMemory
Let’s implement our user preference use case using the simple `ListMemory`.
Goal:
- Create a `ListMemory`.
- Add a user preference (“formal style”) to it.
- Start a new chat context.
- Use `update_context` to inject the preference into the new chat context.
- Show how the chat context looks before being sent to the LLM.
Step 1: Create the Memory
We’ll use `ListMemory`, the simplest implementation provided by AutoGen Core.
```python
# File: create_list_memory.py
from autogen_core.memory import ListMemory

# Create a simple list-based memory instance
user_prefs_memory = ListMemory(name="user_preferences")

print(f"Created memory: {user_prefs_memory.name}")
print(f"Initial content: {user_prefs_memory.content}")

# Output:
# Created memory: user_preferences
# Initial content: []
```
We have an empty memory notebook named “user_preferences”.
Step 2: Add the Preference
Let’s add the user’s preference as a piece of text memory.
```python
# File: add_preference.py
import asyncio

from autogen_core.memory import MemoryContent

# Assume user_prefs_memory exists from the previous step

# Define the preference as MemoryContent
preference = MemoryContent(
    content="User prefers all communication to be written in a formal style.",
    mime_type="text/plain",  # It's just text
    metadata={"source": "user_instruction_conversation_1"}  # Optional info
)

async def add_to_memory():
    # Add the content to our memory instance
    await user_prefs_memory.add(preference)
    print(f"Memory content after adding: {user_prefs_memory.content}")

asyncio.run(add_to_memory())

# Output (will show the MemoryContent object):
# Memory content after adding: [MemoryContent(content='User prefers...', mime_type='text/plain', metadata={'source': '...'})]
```
We’ve successfully written the preference into our `ListMemory` notebook.
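As an aside, if the preference is already known when the memory is created, `ListMemory` also accepts entries up front via its `memory_contents` constructor parameter (visible in the simplified source later in this chapter). A small sketch:

```python
# Sketch: seeding ListMemory at construction time instead of calling add()
from autogen_core.memory import ListMemory, MemoryContent

preference = MemoryContent(
    content="User prefers all communication to be written in a formal style.",
    mime_type="text/plain",
)
seeded_memory = ListMemory(name="user_preferences", memory_contents=[preference])
print(seeded_memory.content)  # [MemoryContent(content='User prefers...', ...)]
```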
Step 3: Start a New Chat Context
Imagine time passes, and the user starts a new conversation asking for an email draft. We create a fresh `ChatCompletionContext`.
```python
# File: start_new_chat.py
from autogen_core.model_context import UnboundedChatCompletionContext
from autogen_core.models import UserMessage

# Start a new, empty chat context for a new task
new_chat_context = UnboundedChatCompletionContext()

# Add the user's new request
new_request = UserMessage(content="Draft an email to the team about the Q3 results.", source="User")
# await new_chat_context.add_message(new_request)  # In a real app, add the request

print("Created a new, empty chat context.")
# Output: Created a new, empty chat context.
```
This context currently doesn’t know about the “formal style” preference stored in our long-term memory.
Step 4: Inject Memory into Chat Context
Before sending the new_chat_context
to the LLM, we use update_context
to bring in relevant long-term memories.
```python
# File: update_chat_with_memory.py
import asyncio

# Assume user_prefs_memory exists (with the preference added)
# Assume new_chat_context exists (empty or with just the new request)
# Assume new_request exists

async def main():
    # --- This is where Memory connects to Chat Context ---
    print("Updating chat context with memory...")
    update_result = await user_prefs_memory.update_context(new_chat_context)
    print(f"Memories injected: {len(update_result.memories.results)}")

    # Now let's add the actual user request for this task
    await new_chat_context.add_message(new_request)

    # See what messages are now in the context
    messages_for_llm = await new_chat_context.get_messages()
    print("\nMessages to be sent to LLM:")
    for msg in messages_for_llm:
        print(f"- [{msg.type}]: {msg.content}")

asyncio.run(main())
```
Expected Output:
```
Updating chat context with memory...
Memories injected: 1

Messages to be sent to LLM:
- [SystemMessage]:
Relevant memory content (in chronological order):
1. User prefers all communication to be written in a formal style.

- [UserMessage]: Draft an email to the team about the Q3 results.
```
Look! The `ListMemory.update_context` method automatically queried the memory (in this simple case, it just takes all entries) and added a `SystemMessage` to the `new_chat_context`. This message explicitly tells the LLM about the stored preference before it sees the user’s request to draft the email.
Step 5: (Conceptual) Sending to LLM
Now, if we were to send `messages_for_llm` to the `ChatCompletionClient` (Chapter 5):
```python
# Conceptual code - Requires a configured client
# response = await llm_client.create(messages=messages_for_llm)
```
The LLM would receive both the instruction about the formal style preference (from Memory) and the request to draft the email. It’s much more likely to follow the preference now!
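For a fuller picture, here is a minimal runnable sketch, assuming the optional `autogen-ext[openai]` package is installed and an `OPENAI_API_KEY` environment variable is set (the model name is just an example):

```python
# File: send_with_client.py (illustrative sketch -- assumes `autogen-ext[openai]`
# is installed and OPENAI_API_KEY is set; the model name is an example)
import asyncio

from autogen_ext.models.openai import OpenAIChatCompletionClient

# Assume messages_for_llm exists from the previous step

async def main():
    # A concrete ChatCompletionClient implementation (see Chapter 5)
    llm_client = OpenAIChatCompletionClient(model="gpt-4o")

    # The injected SystemMessage (from Memory) rides along with the user's request
    response = await llm_client.create(messages_for_llm)
    print(response.content)

    await llm_client.close()

asyncio.run(main())
```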
Step 6: Direct Query (Optional)
We can also directly query the memory if needed, without involving a chat context.
```python
# File: query_memory.py
import asyncio

# Assume user_prefs_memory exists

async def main():
    # Query the memory (ListMemory returns all items regardless of query text)
    query_result = await user_prefs_memory.query("style preference")

    print("\nDirect query result:")
    for item in query_result.results:
        print(f"- Content: {item.content}, Type: {item.mime_type}")

asyncio.run(main())

# Output:
# Direct query result:
# - Content: User prefers all communication to be written in a formal style., Type: text/plain
```
This shows how an agent could specifically look things up in its notebook.
Under the Hood: How `ListMemory` Injects Context

Let’s trace the `update_context` call for `ListMemory`.
Conceptual Flow:
```mermaid
sequenceDiagram
    participant AgentLogic as Agent Logic
    participant ListMem as ListMemory
    participant InternalList as Memory's Internal List
    participant ChatCtx as ChatCompletionContext

    AgentLogic->>+ListMem: update_context(chat_context)
    ListMem->>+InternalList: Get all stored MemoryContent items
    InternalList-->>-ListMem: Return list of [pref_content]
    alt Memory list is NOT empty
        ListMem->>ListMem: Format memories into a single string (e.g., "1. pref_content")
        ListMem->>ListMem: Create SystemMessage with formatted string
        ListMem->>+ChatCtx: add_message(SystemMessage)
        ChatCtx-->>-ListMem: Context updated
    end
    ListMem->>ListMem: Create UpdateContextResult(memories=[pref_content])
    ListMem-->>-AgentLogic: Return UpdateContextResult
```
- The agent calls `user_prefs_memory.update_context(new_chat_context)`.
- The `ListMemory` instance accesses its internal `_contents` list.
- It checks if the list is empty. If not:
  - It iterates through the `MemoryContent` items in the list.
  - It formats them into a numbered string (like “Relevant memory content…\n1. Item 1\n2. Item 2…”).
  - It creates a single `SystemMessage` containing this formatted string.
  - It calls `new_chat_context.add_message()` to add this `SystemMessage` to the chat history that will be sent to the LLM.
- It returns an `UpdateContextResult` containing the list of memories it just processed.
Code Glimpse:
- `Memory` Protocol (`memory/_base_memory.py`): Defines the required methods for any memory implementation.

```python
# From: memory/_base_memory.py (Simplified ABC)
from abc import ABC, abstractmethod
# ... other imports: MemoryContent, MemoryQueryResult, UpdateContextResult, ChatCompletionContext

class Memory(ABC):
    component_type = "memory"

    @abstractmethod
    async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult: ...

    @abstractmethod
    async def query(self, query: str | MemoryContent, ...) -> MemoryQueryResult: ...

    @abstractmethod
    async def add(self, content: MemoryContent, ...) -> None: ...

    @abstractmethod
    async def clear(self) -> None: ...

    @abstractmethod
    async def close(self) -> None: ...
```

Any class wanting to act as `Memory` must provide these methods.
- `ListMemory` Implementation (`memory/_list_memory.py`):

```python
# From: memory/_list_memory.py (Simplified)
from typing import List
# ... other imports: Memory, MemoryContent, ..., SystemMessage, ChatCompletionContext

class ListMemory(Memory):
    def __init__(self, ..., memory_contents: List[MemoryContent] | None = None):
        # Stores memory items in a simple list
        self._contents: List[MemoryContent] = memory_contents or []

    async def add(self, content: MemoryContent, ...) -> None:
        """Add new content to the internal list."""
        self._contents.append(content)

    async def query(self, query: str | MemoryContent = "", ...) -> MemoryQueryResult:
        """Return all memories, ignoring the query."""
        # Simple implementation: just return everything
        return MemoryQueryResult(results=self._contents)

    async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult:
        """Add all memories as a SystemMessage to the chat context."""
        if not self._contents:
            # Do nothing if memory is empty
            return UpdateContextResult(memories=MemoryQueryResult(results=[]))

        # Format all memories into a numbered list string
        memory_strings = [f"{i}. {str(mem.content)}" for i, mem in enumerate(self._contents, 1)]
        memory_context_str = "Relevant memory content...\n" + "\n".join(memory_strings) + "\n"

        # Add this string as a SystemMessage to the provided chat context
        await model_context.add_message(SystemMessage(content=memory_context_str))

        # Return info about which memories were added
        return UpdateContextResult(memories=MemoryQueryResult(results=self._contents))

    # ... clear(), close(), config methods ...
```
This shows the straightforward logic of `ListMemory`: store in a list, retrieve the whole list, and inject the whole list as a single system message into the chat context. More complex memories might use smarter retrieval (e.g., based on the `query` in `query()` or the last message in `update_context`) and inject memories differently.
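As a thought experiment, here is a sketch of what such smarter retrieval could look like: a hypothetical `KeywordMemory` (not part of AutoGen Core) that follows the simplified interface above, returns only entries sharing a word with the query, and uses the last chat message as the retrieval query inside `update_context`:

```python
# A hypothetical KeywordMemory -- a sketch of "smarter retrieval", not part of
# AutoGen Core. It follows the simplified Memory interface shown above.
from typing import List

from autogen_core.memory import MemoryContent, MemoryQueryResult, UpdateContextResult
from autogen_core.model_context import ChatCompletionContext
from autogen_core.models import SystemMessage

class KeywordMemory:
    """Stores text memories; recalls only entries that share words with a query."""

    def __init__(self) -> None:
        self._contents: List[MemoryContent] = []

    async def add(self, content: MemoryContent) -> None:
        self._contents.append(content)

    async def query(self, query: str = "") -> MemoryQueryResult:
        # Naive keyword match: keep entries sharing at least one word with the query
        query_words = set(query.lower().split())
        hits = [
            m for m in self._contents
            if query_words & set(str(m.content).lower().split())
        ]
        return MemoryQueryResult(results=hits)

    async def update_context(self, model_context: ChatCompletionContext) -> UpdateContextResult:
        # Use the most recent chat message as the retrieval query
        messages = await model_context.get_messages()
        query = str(messages[-1].content) if messages else ""
        relevant = await self.query(query)
        if relevant.results:
            lines = [f"{i}. {m.content}" for i, m in enumerate(relevant.results, 1)]
            await model_context.add_message(
                SystemMessage(content="Relevant memory content:\n" + "\n".join(lines))
            )
        return UpdateContextResult(memories=relevant)
```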
Next Steps
You’ve learned about `Memory`, AutoGen Core’s mechanism for giving agents long-term recall beyond the immediate conversation (`ChatCompletionContext`). We saw how `MemoryContent` holds information, `add` stores it, `query` retrieves it, and `update_context` injects relevant memories into the LLM’s working context. We explored the simple `ListMemory` as a basic example.
Memory systems are crucial for agents that learn, adapt, or need to maintain state across interactions.
This concludes our deep dive into the core abstractions of AutoGen Core! We’ve covered Agents, Messaging, Runtime, Tools, LLM Clients, Chat Context, and now Memory. There’s one final concept that ties many of these together from a configuration perspective:
- Chapter 8: Component: Understand the general `Component` model in AutoGen Core, and how it allows pieces like `Memory`, `ChatCompletionContext`, and `ChatCompletionClient` to be configured and managed consistently.
Generated by AI Codebase Knowledge Builder