Chapter 2: Message / Memory - Remembering the Conversation

In Chapter 1: The LLM - Your Agent’s Brainpower, we learned how our agent uses the LLM class to access its “thinking” capabilities. But just like humans, an agent needs to remember what was said earlier in a conversation to make sense of new requests and respond appropriately.

Imagine asking a friend, “What was the first thing I asked you?” If they have no memory, they can’t answer! Agents face the same problem: they need a way to store the conversation history.

This is where Message and Memory come in.

What Problem Do They Solve?

Think about a simple chat:

  1. You: “What’s the weather like in London?”
  2. Agent: “It’s currently cloudy and 15°C in London.”
  3. You: “What about Paris?”

For the agent to answer your second question (“What about Paris?”), it needs to remember that the topic of the conversation is “weather”. Without remembering the first question, the second question is meaningless.

Message and Memory provide the structure to:

  1. Represent each individual turn (like your question or the agent’s answer) clearly.
  2. Store these turns in order, creating a log of the conversation.

The Key Concepts: Message and Memory

Let’s break these down:

1. Message: A Single Turn in the Chat

A Message object is like a single speech bubble in a chat interface. It represents one specific thing said by someone (or something) at a particular point in the conversation.

Every Message has two main ingredients:

  • role: Who sent this message? This is crucial for the LLM to understand the flow. Common roles are:
    • user: A message from the end-user interacting with the agent. (e.g., “What’s the weather?”)
    • assistant: A message from the agent/LLM. (e.g., “The weather is sunny.”)
    • system: An initial instruction to guide the agent’s overall behavior. (e.g., “You are a helpful weather assistant.”)
    • tool: The output or result from a Tool / ToolCollection that the agent used (e.g., the raw data returned by a weather API tool).
  • content: What was said? This is the actual text of the message. (e.g., “What’s the weather like in London?”)

There are also optional parts for more advanced uses, like tool_calls (when the assistant decides to use a tool) or base64_image (if an image is included in the message), but role and content are the basics.
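
For example, a tool’s result can be captured as a message with the tool role. Here’s a minimal sketch, assuming the optional name field (mentioned in the Code Glimpse later in this chapter) identifies which tool produced the output:

# Hedged sketch: wrapping a tool's raw output in a Message.
# 'name' is an assumed optional field on Message; check app/schema.py.
from app.schema import Message

weather_data = '{"city": "London", "condition": "cloudy", "temp_c": 15}'

tool_msg = Message(
    role="tool",
    content=weather_data,
    name="get_weather",  # assumption: identifies the tool that produced this result
)
print(tool_msg.to_dict())
# Likely output: {'role': 'tool', 'content': '{"city": ...}', 'name': 'get_weather'}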

2. Memory: The Conversation Log

The Memory object is simply a container, like a list or a notebook, that holds a sequence of Message objects.

  • It keeps track of the entire conversation history (or at least the recent parts).
  • It stores messages in the order they occurred.
  • Agents look at the Memory before deciding what to do next, giving them context.

Think of Memory as the agent’s short-term memory for the current interaction.

How Do We Use Them?

Let’s see how you’d typically work with Message and Memory in OpenManus (often, the agent framework handles some of this automatically, but it’s good to understand the pieces).

1. Creating Messages:

The Message class in app/schema.py provides handy shortcuts to create messages with the correct role:

# Import the Message class
from app.schema import Message

# Create a message from the user
user_q = Message.user_message("What's the capital of France?")

# Create a message from the assistant (agent's response)
assistant_a = Message.assistant_message("The capital of France is Paris.")

# Create a system instruction
system_instruction = Message.system_message("You are a helpful geography expert.")

print(f"User Message: Role='{user_q.role}', Content='{user_q.content}'")
print(f"Assistant Message: Role='{assistant_a.role}', Content='{assistant_a.content}'")

Explanation:

  • We import Message from app/schema.py.
  • Message.user_message("...") creates a Message object with role set to user.
  • Message.assistant_message("...") creates one with role set to assistant.
  • Message.system_message("...") creates one with role set to system.
  • Each of these returns a Message object containing the role and the text content you provided.

Example Output:

User Message: Role='user', Content='What's the capital of France?'
Assistant Message: Role='assistant', Content='The capital of France is Paris.'

2. Storing Messages in Memory:

The Memory class (app/schema.py) holds these messages. Agents usually have a memory attribute.

# Import Memory and Message
from app.schema import Message, Memory

# Create a Memory instance
conversation_memory = Memory()

# Add messages to the memory
conversation_memory.add_message(
    Message.system_message("You are a helpful geography expert.")
)
conversation_memory.add_message(
    Message.user_message("What's the capital of France?")
)
conversation_memory.add_message(
    Message.assistant_message("The capital of France is Paris.")
)
conversation_memory.add_message(
    Message.user_message("What about Spain?")
)

# See the messages stored
print(f"Number of messages in memory: {len(conversation_memory.messages)}")
# Print the last message
print(f"Last message: {conversation_memory.messages[-1].to_dict()}")

Explanation:

  • We import Memory and Message.
  • conversation_memory = Memory() creates an empty memory store.
  • conversation_memory.add_message(...) adds a Message object to the end of the internal list.
  • conversation_memory.messages gives you access to the list of Message objects currently stored.
  • message.to_dict() converts a Message object into a simple dictionary format, which is often needed for APIs.

Example Output:

Number of messages in memory: 4
Last message: {'role': 'user', 'content': 'What about Spain?'}

3. Using Memory for Context:

Now, how does the agent use this? Before calling the LLM to figure out the answer to “What about Spain?”, the agent would grab the messages from its Memory.

# (Continuing from previous example)

# Agent prepares to ask the LLM
messages_for_llm = conversation_memory.to_dict_list()

print("Messages being sent to LLM for context:")
for msg in messages_for_llm:
    print(f"- {msg}")

# Simplified: the agent would now pass 'messages_for_llm' to its LLM
# response = await agent.llm.ask(messages=messages_for_llm)
# The LLM would likely respond with the capital of Spain, e.g., "The capital of Spain is Madrid."

Explanation:

  • conversation_memory.to_dict_list() converts all stored Message objects into the list-of-dictionaries format that the llm.ask method expects (as we saw in Chapter 1).
  • By sending this entire history, the LLM sees:
    1. Its instructions (“You are a helpful geography expert.”)
    2. The first question (“What’s the capital of France?”)
    3. Its previous answer (“The capital of France is Paris.”)
    4. The new question (“What about Spain?”)
  • With this context, the LLM can correctly infer that “What about Spain?” means “What is the capital of Spain?”.
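
Example Output:

Messages being sent to LLM for context:
- {'role': 'system', 'content': 'You are a helpful geography expert.'}
- {'role': 'user', 'content': "What's the capital of France?"}
- {'role': 'assistant', 'content': 'The capital of France is Paris.'}
- {'role': 'user', 'content': 'What about Spain?'}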

Under the Hood: How It Works

Memory is conceptually simple. It’s primarily a wrapper around a standard Python list, ensuring messages are stored correctly and providing convenient methods.

Here’s a simplified flow of how an agent uses memory:

sequenceDiagram
    participant User
    participant Agent as BaseAgent (app/agent/base.py)
    participant Mem as Memory (app/schema.py)
    participant LLM as LLM Class (app/llm.py)
    participant LLM_API as Actual LLM API

    User->>+Agent: Sends message ("What about Spain?")
    Agent->>+Mem: update_memory(role="user", content="What about Spain?")
    Mem->>Mem: Adds Message(role='user', ...) to internal list
    Mem-->>-Agent: Memory updated
    Agent->>Agent: Needs to generate response
    Agent->>+Mem: Get all messages (memory.messages)
    Mem-->>-Agent: Returns list of Message objects
    Agent->>Agent: Formats messages to dict list (memory.to_dict_list())
    Agent->>+LLM: ask(messages=formatted_list)
    LLM->>LLM_API: Sends request with history
    LLM_API-->>LLM: Returns response ("The capital is Madrid.")
    LLM-->>-Agent: Returns text response
    Agent->>+Mem: update_memory(role="assistant", content="The capital is Madrid.")
    Mem->>Mem: Adds Message(role='assistant', ...) to internal list
    Mem-->>-Agent: Memory updated
    Agent->>-User: Sends response ("The capital is Madrid.")
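
In the diagram, update_memory is the step that wraps raw text into a Message with the correct role before storing it. Here’s a minimal sketch of that dispatch, written as a standalone helper (the real method lives on BaseAgent in app/agent/base.py and may differ):

# Hedged sketch: mapping a role string to the matching Message factory.
from app.schema import Memory, Message

def update_memory(memory: Memory, role: str, content: str) -> None:
    # Pick the Message classmethod that matches the given role
    factories = {
        "user": Message.user_message,
        "assistant": Message.assistant_message,
        "system": Message.system_message,
    }
    if role not in factories:
        raise ValueError(f"Unsupported role: {role}")
    memory.add_message(factories[role](content))

demo_memory = Memory()
update_memory(demo_memory, "user", "What about Spain?")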

Code Glimpse:

Let’s look at the core parts in app/schema.py:

# Simplified snippet from app/schema.py

from typing import List, Optional
from pydantic import BaseModel, Field

# (Role enum and other definitions are here)

class Message(BaseModel):
    role: str # Simplified: In reality uses ROLE_TYPE Literal
    content: Optional[str] = None
    # ... other optional fields like tool_calls, name, etc.

    def to_dict(self) -> dict:
        # Creates a dictionary representation, skipping None values
        message_dict = {"role": self.role}
        if self.content is not None:
            message_dict["content"] = self.content
        # ... add other fields if they exist ...
        return message_dict

    @classmethod
    def user_message(cls, content: str) -> "Message":
        return cls(role="user", content=content)

    @classmethod
    def assistant_message(cls, content: Optional[str] = None) -> "Message":
        return cls(role="assistant", content=content)

    # ... other classmethods like system_message, tool_message ...

class Memory(BaseModel):
    messages: List[Message] = Field(default_factory=list)
    max_messages: int = 100 # Example limit

    def add_message(self, message: Message) -> None:
        """Add a single message to the list."""
        self.messages.append(message)
        # Optional: Trim old messages if limit exceeded
        if len(self.messages) > self.max_messages:
            self.messages = self.messages[-self.max_messages :]

    def to_dict_list(self) -> List[dict]:
        """Convert all stored messages to dictionaries."""
        return [msg.to_dict() for msg in self.messages]

    # ... other methods like clear(), get_recent_messages() ...

Explanation:

  • The Message class uses Pydantic BaseModel for structure and validation. It clearly defines role and content. The classmethods (user_message, etc.) are just convenient ways to create instances with the role pre-filled. to_dict prepares it for API calls.
  • The Memory class also uses BaseModel. Its main part is messages: List[Message], which holds the conversation history. add_message simply appends to this list (and optionally trims it). to_dict_list iterates through the stored messages and converts each one using its to_dict method.
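
Because max_messages is a regular Pydantic field, the trimming behavior of add_message is easy to observe. A quick sketch based on the simplified class above:

# Demonstrating the trimming in add_message (per the snippet above).
from app.schema import Memory, Message

small_memory = Memory(max_messages=3)  # keep only the 3 most recent messages

for i in range(5):
    small_memory.add_message(Message.user_message(f"Message number {i}"))

print(len(small_memory.messages))        # 3 -- the two oldest were trimmed
print(small_memory.messages[0].content)  # 'Message number 2'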

And here’s how an agent might use its memory attribute (simplified from app/agent/base.py):

# Simplified conceptual snippet inspired by app/agent/base.py

from app.schema import Memory, Message # Simplified imports
from app.llm import LLM

class SimplifiedAgent:
    def __init__(self):
        self.memory = Memory() # Agent holds a Memory instance
        self.llm = LLM() # Agent has access to the LLM

    def add_user_input(self, text: str):
        """Adds user input to memory."""
        user_msg = Message.user_message(text)
        self.memory.add_message(user_msg)
        print(f"Agent Memory Updated with: {user_msg.to_dict()}")

    async def generate_response(self) -> str:
        """Generates a response based on memory."""
        print("Agent consulting memory...")
        messages_for_llm = self.memory.to_dict_list()

        print(f"Sending {len(messages_for_llm)} messages to LLM...")
        # The actual call to the LLM
        response_text = await self.llm.ask(messages=messages_for_llm)

        # Add assistant response to memory
        assistant_msg = Message.assistant_message(response_text)
        self.memory.add_message(assistant_msg)
        print(f"Agent Memory Updated with: {assistant_msg.to_dict()}")

        return response_text

# Example Usage (needs async context)
# agent = SimplifiedAgent()
# agent.add_user_input("What is the capital of France?")
# response = await agent.generate_response() # Gets "Paris"
# agent.add_user_input("What about Spain?")
# response2 = await agent.generate_response() # Gets "Madrid"

Explanation:

  • The agent has self.memory.
  • When input arrives (add_user_input), it creates a Message and adds it using self.memory.add_message.
  • When generating a response (generate_response), it retrieves the history using self.memory.to_dict_list() and passes it to self.llm.ask.
  • It then adds the LLM’s response back into memory as an assistant message.
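
Since generate_response is a coroutine, the commented-out usage above needs an event loop. A minimal sketch of a driver, assuming the LLM configuration from Chapter 1 is in place:

# Hedged sketch: running the SimplifiedAgent end to end with asyncio.
import asyncio

async def main():
    agent = SimplifiedAgent()
    agent.add_user_input("What is the capital of France?")
    print(await agent.generate_response())  # e.g. "The capital of France is Paris."
    agent.add_user_input("What about Spain?")
    print(await agent.generate_response())  # e.g. "The capital of Spain is Madrid."

asyncio.run(main())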

Wrapping Up Chapter 2

You’ve now learned about Message (a single conversational turn with a role and content) and Memory (the ordered list storing these messages). Together, they provide the crucial context agents need to understand conversations and respond coherently. They act as the agent’s short-term memory or chat log.

We have the brain (LLM) and the memory (Message/Memory). Now we need something to orchestrate the process – to receive input, consult memory, use the LLM, potentially use tools, and manage its state. That’s the job of the Agent itself.

Let’s move on to Chapter 3: BaseAgent to see how agents are structured and how they use these core components.

