Chapter 1: Graph / StateGraph - The Blueprint of Your Application
Welcome to the LangGraph tutorial! We’re excited to help you learn how to build powerful, stateful applications with Large Language Models (LLMs).
Imagine you’re building an application, maybe a chatbot, an agent that performs tasks, or something that processes data in multiple steps. As these applications get more complex, just calling an LLM once isn’t enough. You need a way to structure the flow – maybe call an LLM, then a tool, then another LLM based on the result. How do you manage this sequence of steps and the information passed between them?
That’s where Graphs come in!
What Problem Do Graphs Solve?
Think of a complex task like baking a cake. You don’t just throw all the ingredients in the oven. There’s a sequence: mix dry ingredients, mix wet ingredients, combine them, pour into a pan, bake, cool, frost. Each step depends on the previous one.
LangGraph helps you define these steps and the order they should happen in. It provides a way to create a flowchart or a blueprint for your application’s logic.
The core idea is to break down your application into:
Nodes: These are the individual steps or actions (like “mix dry ingredients” or “call the LLM”).
Edges: These are the connections or transitions between the steps, defining the order (after mixing dry ingredients, mix wet ingredients).
LangGraph provides different types of graphs, but the most common and useful one for building stateful applications is the StateGraph.
Core Concepts: Graph, StateGraph, and MessageGraph
Let’s look at the main types of graphs you’ll encounter:
Graph (The Basic Blueprint)
This is the most fundamental type. You define nodes (steps) and edges (connections).
It’s like a basic flowchart diagram.
You explicitly define how information passes from one node to the next.
While foundational, you’ll often use the more specialized StateGraph for convenience.
# This is a conceptual example - we usually use StateGraph
from langgraph.graph import Graph

# Define simple functions or Runnables as nodes
def step_one(input_data):
    print("Running Step 1")
    return input_data * 2

def step_two(processed_data):
    print("Running Step 2")
    return processed_data + 5

# Create a basic graph
basic_graph_builder = Graph()

# Add nodes
basic_graph_builder.add_node("A", step_one)
basic_graph_builder.add_node("B", step_two)

# Add edges (connections)
basic_graph_builder.add_edge("A", "B")  # Run B after A
basic_graph_builder.set_entry_point("A")  # Start at A
# basic_graph_builder.set_finish_point("B") # Not needed for this simple Graph type
StateGraph (The Collaborative Whiteboard)
This is the workhorse for most LangGraph applications. It’s a specialized Graph.
Key Idea: Nodes communicate implicitly by reading from and writing to a shared State object.
Analogy: Imagine a central whiteboard (the State). Each node (person) can read what’s on the whiteboard, do some work, and then update the whiteboard with new information or changes.
You define the structure of this shared state first (e.g., what keys it holds).
Each node receives the current state and returns a dictionary containing only the parts of the state it wants to update. LangGraph handles merging these updates into the main state.
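The "return only what you change" rule can be sketched in plain Python. This is a toy model of the whiteboard, not LangGraph's implementation: `apply_update` and `summarize` are hypothetical names, and the sketch assumes the simplest merge rule, where a returned key overwrites the old value.

```python
# Toy model of the "whiteboard": a dict that nodes partially update.
# This mimics the behaviour described above; it is not LangGraph's implementation.

def apply_update(state: dict, update: dict) -> dict:
    """Return a new state where keys in `update` overwrite those in `state`."""
    return {**state, **update}

def summarize(state: dict) -> dict:
    # The node reads what it needs and returns only the key it changes.
    return {"summary": state["text"][:5]}

state = {"text": "hello world", "summary": ""}
state = apply_update(state, summarize(state))
print(state)
```

The `text` key survives untouched even though the node never mentioned it, which is the point of returning partial updates.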
MessageGraph (The Chatbot Specialist)
This is a further specialization of StateGraph, designed specifically for building chatbots or conversational agents.
It automatically manages a messages list within its state.
Nodes typically take the current list of messages and return new messages to be added.
It uses a special function (add_messages) to append messages while handling potential duplicates or updates based on message IDs. This makes building chat flows much simpler.
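To illustrate what "append while handling updates based on message IDs" could look like, here is a toy merge function in plain Python. `merge_messages` is a hypothetical helper written for this sketch and messages are modelled as simple dicts; it is not the real add_messages, whose details may differ.

```python
# Toy ID-based merge, written to illustrate the behaviour described above.
# `merge_messages` is a hypothetical helper, not LangGraph's add_messages.

def merge_messages(existing: list, new: list) -> list:
    merged = list(existing)
    position = {msg["id"]: i for i, msg in enumerate(merged)}
    for msg in new:
        if msg["id"] in position:
            # Same ID: replace the earlier message in place.
            merged[position[msg["id"]]] = msg
        else:
            # New ID: append to the end of the conversation.
            position[msg["id"]] = len(merged)
            merged.append(msg)
    return merged

history = [{"id": "1", "content": "Hi"}, {"id": "2", "content": "Helo"}]
incoming = [{"id": "2", "content": "Hello"}, {"id": "3", "content": "How are you?"}]
history = merge_messages(history, incoming)
print([m["content"] for m in history])
```

The message with ID "2" is corrected in place rather than duplicated, while the message with the unseen ID "3" is appended.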
For the rest of this chapter, we’ll focus on StateGraph as it introduces the core concepts most clearly.
Building a Simple StateGraph
Let’s build a tiny application that takes a number, adds 1 to it, and then multiplies it by 2.
Step 1: Define the State
First, we define the “whiteboard” – the structure of the data our graph will work with. We use Python’s TypedDict for this.
from typing import TypedDict

class MyState(TypedDict):
    # Our state will hold a single number called 'value'
    value: int
This tells our StateGraph that the shared information will always contain an integer named value.
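It can help to see what a TypedDict is at runtime. The short sketch below uses only the standard library; the `unchecked` variable is just for illustration. It shows that the schema is a description of the keys rather than a runtime guard in Python itself (whether LangGraph adds checks on top is not covered here).

```python
from typing import TypedDict

class MyState(TypedDict):
    value: int

# At runtime a TypedDict instance is an ordinary dict.
state = MyState(value=5)
print(type(state) is dict)

# The declared keys and types are available as annotations.
print(MyState.__annotations__)

# Python itself does not enforce the annotation when the dict is created;
# a static type checker would flag this line, but it runs without error.
unchecked = MyState(value="five")
print(unchecked)
```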
Step 2: Define the Nodes
Nodes are functions (or LangChain Runnables) that perform the work. They take the current State as input and return a dictionary containing the updates to the state.
# Node 1: Adds 1 to the value
def add_one(state: MyState) -> dict:
    print("--- Running Adder Node ---")
    current_value = state['value']
    new_value = current_value + 1
    print(f"Input value: {current_value}, Output value: {new_value}")
    # Return *only* the key we want to update
    return {"value": new_value}

# Node 2: Multiplies the value by 2
def multiply_by_two(state: MyState) -> dict:
    print("--- Running Multiplier Node ---")
    current_value = state['value']
    new_value = current_value * 2
    print(f"Input value: {current_value}, Output value: {new_value}")
    # Return the update
    return {"value": new_value}
Notice how each function takes state and returns a dict specifying which part of the state ("value") should be updated and with what new value.
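Because nodes are ordinary functions, they can be exercised on their own before any graph exists. The sketch below uses stripped-down copies of the two nodes (print statements omitted) and feeds one node's update straight into the next, which works here only because the state has a single key.

```python
from typing import TypedDict

class MyState(TypedDict):
    value: int

def add_one(state: MyState) -> dict:
    return {"value": state["value"] + 1}

def multiply_by_two(state: MyState) -> dict:
    return {"value": state["value"] * 2}

# Nodes are plain functions, so they can be checked without a graph.
after_add = add_one({"value": 5})
after_multiply = multiply_by_two(after_add)
print(after_add, after_multiply)
```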
Step 3: Create the Graph and Add Nodes/Edges
Now we assemble our blueprint using StateGraph.
from langgraph.graph import StateGraph, END, START

# Create a StateGraph instance linked to our state definition
workflow = StateGraph(MyState)

# Add the nodes to the graph
workflow.add_node("adder", add_one)
workflow.add_node("multiplier", multiply_by_two)

# Set the entry point --> where does the flow start?
workflow.set_entry_point("adder")

# Add edges --> how do the nodes connect?
workflow.add_edge("adder", "multiplier")  # After adder, run multiplier

# Set the finish point --> where does the flow end?
# We use the special identifier END
workflow.add_edge("multiplier", END)
StateGraph(MyState): Creates the graph, telling it to use our MyState structure.
add_node("name", function): Registers our functions as steps in the graph with unique names.
set_entry_point("adder"): Specifies that the adder node should run first. This implicitly creates an edge from a special START point to adder.
add_edge("adder", "multiplier"): Creates a connection. After adder finishes, multiplier will run.
add_edge("multiplier", END): Specifies that after multiplier finishes, the graph execution should stop. END is a special marker for the graph’s conclusion.
Step 4: Compile the Graph
Before we can run it, we need to compile the graph. This finalizes the structure and makes it executable.
# Compile the workflow into an executable object
app = workflow.compile()
Step 5: Run It!
Now we can invoke our compiled graph (app) with some initial state.
# Define the initial state
initial_state = {"value": 5}

# Run the graph
final_state = app.invoke(initial_state)

# Print the final result
print("\n--- Final State ---")
print(final_state)
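The graph executes the nodes in the defined order (adder, then multiplier), automatically passing the updated state between them. Starting from {"value": 5}, the adder produces 6 and the multiplier produces 12, so the final state is {"value": 12}.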
How Does StateGraph Work Under the Hood?
You defined the nodes and edges, but what actually happens when you call invoke()?
Initialization: LangGraph takes your initial input ({"value": 5}) and puts it onto the “whiteboard” (the internal state).
Execution Engine: A powerful internal component called the Pregel Execution Engine takes over. It looks at the current state and the graph structure.
Following Edges: It starts at the START node and follows the edge to the entry point (adder).
Node Execution: It runs the adder function, passing it the current state ({"value": 5}).
State Update: The adder function returns {"value": 6}. The Pregel engine uses special mechanisms called Channels to update the value associated with the "value" key on the “whiteboard”. The state is now {"value": 6}.
Next Step: The engine sees the edge from adder to multiplier.
Node Execution: It runs the multiplier function, passing it the updated state ({"value": 6}).
State Update: multiplier returns {"value": 12}. The engine updates the state again via the Channels. The state is now {"value": 12}.
Following Edges: The engine sees the edge from multiplier to END.
Finish: Reaching END signals the execution is complete. The final state ({"value": 12}) is returned.
Don’t worry too much about the details of Pregel or Channels yet – we’ll cover them in later chapters. The key takeaway is that StateGraph manages the state and orchestrates the execution based on your defined nodes and edges.
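The walkthrough above can be modelled in a few lines of plain Python. This is a toy loop for intuition only, not the Pregel engine: `toy_invoke`, the `edges` lookup table, and the string values chosen for START and END are all assumptions of this sketch.

```python
# Toy re-creation of the walkthrough; NOT the Pregel engine.
START, END = "__start__", "__end__"  # placeholder markers for this sketch

def add_one(state):
    return {"value": state["value"] + 1}

def multiply_by_two(state):
    return {"value": state["value"] * 2}

nodes = {"adder": add_one, "multiplier": multiply_by_two}
edges = {START: "adder", "adder": "multiplier", "multiplier": END}

def toy_invoke(initial_state):
    state = dict(initial_state)          # put the input on the whiteboard
    trace = []
    current = edges[START]               # follow START to the entry point
    while current != END:
        update = nodes[current](state)   # run the node on the current state
        state.update(update)             # merge its partial update
        trace.append((current, dict(state)))
        current = edges[current]         # follow the outgoing edge
    return state, trace

final_state, trace = toy_invoke({"value": 5})
print(final_state)
```

The `trace` list records the state after each node, which matches the two "State Update" steps in the walkthrough.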
A Peek at the Code (graph/state.py, graph/graph.py)
Let’s briefly look at the code snippets provided to see how these concepts map to the implementation:
StateGraph.__init__ (graph/state.py):
# Simplified view
class StateGraph(Graph):
    def __init__(self, state_schema: Optional[Type[Any]] = None, ...):
        super().__init__()
        # ... stores the state_schema ...
        self.schema = state_schema
        # ... analyzes the schema to understand state keys and how to update them ...
        self._add_schema(state_schema)
        # ... sets up internal dictionaries for channels, nodes etc. ...
This code initializes the graph, crucially storing the state_schema you provide. It analyzes this schema to figure out the “keys” on your whiteboard (like "value") and sets up the internal structures (Channels) needed to manage updates to each key.
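As a rough illustration of "analyzing the schema", the sketch below derives one slot per state key from a TypedDict using the standard library. `ToyChannel` and `channels_from_schema` are hypothetical names invented here; the real channel classes and the real `_add_schema` logic are more involved.

```python
from typing import TypedDict, get_type_hints

class MyState(TypedDict):
    value: int

class ToyChannel:
    """Hypothetical stand-in for a channel: holds the latest value for one key."""
    def __init__(self, expected_type):
        self.expected_type = expected_type
        self.value = None

    def update(self, new_value):
        self.value = new_value

def channels_from_schema(schema) -> dict:
    # One channel per key declared on the schema.
    return {key: ToyChannel(tp) for key, tp in get_type_hints(schema).items()}

channels = channels_from_schema(MyState)
channels["value"].update(6)
print(list(channels), channels["value"].value)
```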
StateGraph.add_node (graph/state.py):
# Simplified view
def add_node(self, node: str, action: RunnableLike, ...):
    # ... basic checks for name conflicts, reserved names (START, END) ...
    if node in self.channels:  # Cannot use a state key name as a node name
        raise ValueError(...)
    # ... wrap the provided action (function/runnable) ...
    runnable = coerce_to_runnable(action, ...)
    # ... store the node details (runnable, input type etc.) ...
    self.nodes[node] = StateNodeSpec(runnable, ..., input=input or self.schema, ...)
    return self
When you add a node, it stores the associated function (action) and links it to the provided node name. It also figures out what input schema the node expects (usually the main graph state schema).
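The checks mentioned in the simplified view can be mimicked with a small stand-alone class. `ToyBuilder`, the `RESERVED` set, and the error messages are made up for this sketch and only approximate the kinds of checks described; they are not the library's code or its exact wording.

```python
RESERVED = {"__start__", "__end__"}  # placeholder reserved names for this sketch

class ToyBuilder:
    def __init__(self, state_keys):
        self.channels = set(state_keys)  # keys of the shared state
        self.nodes = {}

    def add_node(self, name, action):
        if name in RESERVED:
            raise ValueError(f"'{name}' is a reserved name")
        if name in self.nodes:
            raise ValueError(f"Node '{name}' already exists")
        if name in self.channels:
            raise ValueError(f"'{name}' is already a state key")
        self.nodes[name] = action
        return self  # returning self allows chained calls

builder = ToyBuilder(state_keys=["value"])
builder.add_node("adder", lambda s: {"value": s["value"] + 1})

try:
    builder.add_node("value", lambda s: s)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Rejecting a node named after a state key avoids ambiguity between "the step called value" and "the data called value".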
Graph.add_edge (graph/graph.py):
# Simplified view from the base Graph class
def add_edge(self, start_key: str, end_key: str):
    # ... checks for invalid edges (e.g., starting from END) ...
    # ... basic validation ...
    # Stores the connection as a simple pair
    self.edges.add((start_key, end_key))
    return self
Adding an edge is relatively simple – it just records the (start_key, end_key) pair in a set, representing the connection.
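Storing edges as a set of pairs has two practical consequences that a small sketch makes visible: duplicates collapse, and finding the next node is a scan over the pairs. `ToyEdges`, its `successors` helper, and the END string are assumptions for illustration, not part of the library.

```python
END = "__end__"  # placeholder marker for this sketch

class ToyEdges:
    def __init__(self):
        self.edges = set()

    def add_edge(self, start_key, end_key):
        if start_key == END:
            raise ValueError("END cannot be the start of an edge")
        self.edges.add((start_key, end_key))
        return self

    def successors(self, node):
        # Every node reachable in one step from `node`, in sorted order.
        return sorted(end for start, end in self.edges if start == node)

graph = ToyEdges()
graph.add_edge("adder", "multiplier").add_edge("multiplier", END)
graph.add_edge("adder", "multiplier")  # duplicate pair; the set keeps one copy
print(len(graph.edges), graph.successors("adder"))
```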
StateGraph.compile (graph/state.py):
# Simplified view
def compile(self, ...):
    # ... validation checks ...
    self.validate(...)
    # ... create the CompiledStateGraph instance ...
    compiled = CompiledStateGraph(builder=self, ...)
    # ... add nodes, edges, branches to the compiled version ...
    for key, node in self.nodes.items():
        compiled.attach_node(key, node)
    for start, end in self.edges:
        compiled.attach_edge(start, end)
    # ... more setup for branches, entry/exit points ...
    # ... finalize and return the compiled graph ...
    return compiled.validate()
Compilation takes your defined nodes and edges and builds the final, executable CompiledStateGraph. It sets up the internal machinery (Pregel, Channels) based on your blueprint.
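One plausible job for the validation step is catching edges that point at nodes you never added. The sketch below shows that single check in isolation; `toy_validate` and its error message are hypothetical, and the real `validate` may check more or different things.

```python
START, END = "__start__", "__end__"  # placeholder markers for this sketch

def toy_validate(nodes: dict, edges: set) -> None:
    """Raise if an edge refers to a node that was never added."""
    known = set(nodes) | {START, END}
    for start, end in edges:
        for name in (start, end):
            if name not in known:
                raise ValueError(f"Unknown node '{name}' in edge ({start}, {end})")

nodes = {"adder": None, "multiplier": None}
good_edges = {(START, "adder"), ("adder", "multiplier"), ("multiplier", END)}
toy_validate(nodes, good_edges)  # passes silently

try:
    toy_validate(nodes, {("adder", "missing")})
    problem = None
except ValueError as exc:
    problem = str(exc)
print(problem)
```

Failing at compile time rather than midway through a run is the main benefit of a separate compile step.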
Conclusion
You’ve learned the fundamental concept in LangGraph: the Graph.
Graphs define the structure and flow of your application using Nodes (steps) and Edges (connections).
StateGraph is the most common type, where nodes communicate implicitly by reading and updating a shared State object (like a whiteboard).
MessageGraph is a specialized StateGraph for easily building chatbots.
You define the state structure, write node functions that update parts of the state, connect them with edges, and compile the graph to make it runnable.
Now that you understand how to define the overall structure of your application using StateGraph, the next step is to dive deeper into what constitutes a Node.