Chapter 2: Signatures - Defining the Task
In Chapter 1: Modules and Programs, we learned that Modules are like Lego bricks that perform specific tasks, often using Language Models (LM). We saw how Programs combine these modules.
But how does a Module, especially one using an LM like dspy.Predict, know exactly what job to do?
Imagine you ask a chef (our LM) to cook something. Just saying “cook” isn’t enough! You need to tell them:
- What ingredients to use (the inputs).
- What dish to make (the outputs).
- The recipe or instructions (how to make it).
This is precisely what a Signature does in DSPy!
A Signature acts like a clear recipe or contract for a DSPy Module. It defines:
- Input Fields: What information the module needs to start its work.
- Output Fields: What information the module is expected to produce.
- Instructions: Natural language guidance (like a recipe!) telling the underlying LM how to transform the inputs into the outputs.
Think of it as specifying the ‘shape’ and ‘purpose’ of a module, making sure everyone (you, DSPy, and the LM) understands the task.
Why Do We Need Signatures?
Without a clear definition, how would a module like dspy.Predict know what to ask the LM?
Let’s say we want a module to translate English text to French. We need to tell it:
- It needs an english_sentence as input.
- It should produce a french_sentence as output.
- The task is to translate the input sentence into French.
A Signature bundles all this information together neatly.
Defining a Signature: The Recipe Card
The most common way to define a Signature is by creating a Python class that inherits from dspy.Signature.
Let’s create our English-to-French translation signature:
import dspy

class TranslateToFrench(dspy.Signature):
    """Translates English text to French."""  # <-- These are the Instructions!

    # Define the Input Field the module expects
    english_sentence = dspy.InputField(desc="The original sentence in English")

    # Define the Output Field the module should produce
    french_sentence = dspy.OutputField(desc="The translated sentence in French")
Let’s break this down:
- class TranslateToFrench(dspy.Signature): We declare a new class named TranslateToFrench that inherits from dspy.Signature. This tells DSPy it’s a signature definition.
- """Translates English text to French.""": This is the docstring. It’s crucial! DSPy uses this docstring as the natural language Instructions for the LM. It tells the LM the goal of the task.
- english_sentence = dspy.InputField(...): We define an input field named english_sentence. dspy.InputField marks this as required input. The desc provides a helpful description (good for documentation and potentially useful for the LM later).
- french_sentence = dspy.OutputField(...): We define an output field named french_sentence. dspy.OutputField marks this as the expected output. The desc describes what this field should contain.
That’s it! We’ve created a reusable “recipe card” that clearly defines our translation task.
How Modules Use Signatures
Now, how does a Module like dspy.Predict use this TranslateToFrench signature?
dspy.Predict is a pre-built module designed to take a signature and use an LM to generate the output fields based on the input fields and instructions.
Here’s how you might use our signature with dspy.Predict (we’ll cover dspy.Predict in detail in Chapter 4):
# Assume 'lm' is a configured Language Model client (more in Chapter 5)
# lm = dspy.OpenAI(model='gpt-3.5-turbo')
# dspy.settings.configure(lm=lm)
# Create an instance of dspy.Predict, giving it our Signature
translator = dspy.Predict(TranslateToFrench)
# Call the predictor with the required input field
english = "Hello, how are you?"
result = translator(english_sentence=english)
# The result object will contain the output field defined in the signature
print(f"English: {english}")
# Assuming the LM is configured and works correctly, this might print:
# French: Bonjour, comment ça va?
print(f"French: {result.french_sentence}")
In this (slightly simplified) example:
- translator = dspy.Predict(TranslateToFrench): We create a Predict module. Crucially, we pass our TranslateToFrench class itself to it. dspy.Predict now knows the input/output fields and the instructions from the signature.
- result = translator(english_sentence=english): When we call the translator, we provide the input data using the exact name defined in our signature (english_sentence).
- result.french_sentence: dspy.Predict uses the LM, guided by the signature’s instructions and fields, to generate the output. It then returns an object where you can access the generated French text using the output field name (french_sentence).
The Signature acts as the bridge, ensuring the Predict module knows its job specification.
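Because the call must use the exact input field names, it can help to see what "matching names" means in plain Python. The sketch below uses only the standard library; check_inputs is a hypothetical helper invented for this illustration and is not part of DSPy. How DSPy itself reacts to a mismatched name is not covered here.

```python
# Hypothetical helper (not a DSPy API): compares the keyword arguments of a
# call against the input field names a signature declares.
def check_inputs(expected_fields, **kwargs):
    """Return (missing, unexpected) field names, each sorted alphabetically."""
    expected = set(expected_fields)
    provided = set(kwargs)
    return sorted(expected - provided), sorted(provided - expected)


# The signature declares 'english_sentence', but the caller wrote 'english'.
missing, unexpected = check_inputs(["english_sentence"], english="Hello, how are you?")
print(missing)     # ['english_sentence']
print(unexpected)  # ['english']
```

A typo in a keyword name shows up as one missing field and one unexpected field, which is why the example above passes english_sentence=english rather than a shorter name.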
How It Works Under the Hood (A Peek)
You don’t need to memorize this, but understanding the flow helps! When a module like dspy.Predict uses a Signature:
- Inspection: The module looks at the Signature class (TranslateToFrench in our case).
- Extract Info: It identifies the InputFields (english_sentence), OutputFields (french_sentence), and the Instructions (the docstring: "Translates English text to French.").
- Prompt Formatting: When you call the module (e.g., translator(english_sentence="Hello")), it uses this information to build a prompt for the LM. This prompt typically includes:
  - The Instructions.
  - Clearly labeled Input Fields and their values.
  - Clearly labeled Output Fields (often just the names, indicating what the LM should generate).
- LM Call: The formatted prompt is sent to the configured LM.
- Parsing Output: The LM’s response is received. DSPy tries to parse this response to extract the values for the defined OutputFields (like french_sentence).
- Return Result: A structured result object containing the parsed outputs is returned.
Let’s visualize this flow:
sequenceDiagram
participant User
participant PredictModule as dspy.Predict(TranslateToFrench)
participant Signature as TranslateToFrench
participant LM as Language Model
User->>PredictModule: Call with english_sentence="Hello"
PredictModule->>Signature: Get Instructions, Input/Output Fields
Signature-->>PredictModule: Return structure ("Translates...", "english_sentence", "french_sentence")
PredictModule->>LM: Send formatted prompt (e.g., "Translate...\nEnglish: Hello\nFrench:")
LM-->>PredictModule: Return generated text (e.g., "Bonjour")
PredictModule->>Signature: Parse LM output into 'french_sentence' field
Signature-->>PredictModule: Return structured output {french_sentence: "Bonjour"}
PredictModule-->>User: Return structured output (Prediction object)
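The flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not DSPy's real prompt format or parser: SignatureSpec, build_prompt, parse_output, and fake_lm are all hypothetical names made up for this example, and the "LM" is a stub that returns a canned answer.

```python
# Hypothetical sketch of the inspect -> format -> call -> parse flow.
# None of these names come from DSPy; the real prompt layout differs.
from dataclasses import dataclass


@dataclass
class SignatureSpec:
    """Plain-Python stand-in for the information a signature carries."""
    instructions: str
    input_fields: list[str]
    output_fields: list[str]


def build_prompt(spec: SignatureSpec, **inputs: str) -> str:
    """Instructions first, then labeled inputs, then empty output labels."""
    lines = [spec.instructions, ""]
    for name in spec.input_fields:
        lines.append(f"{name}: {inputs[name]}")
    for name in spec.output_fields:
        lines.append(f"{name}:")
    return "\n".join(lines)


def parse_output(spec: SignatureSpec, completion: str) -> dict[str, str]:
    """Map the completion onto the output field (single-output case only)."""
    if len(spec.output_fields) != 1:
        raise NotImplementedError("this sketch handles exactly one output field")
    return {spec.output_fields[0]: completion.strip()}


def fake_lm(prompt: str) -> str:
    """Stand-in for a real LM call; always returns the same text."""
    return " Bonjour"


spec = SignatureSpec(
    instructions="Translates English text to French.",
    input_fields=["english_sentence"],
    output_fields=["french_sentence"],
)
prompt = build_prompt(spec, english_sentence="Hello")
result = parse_output(spec, fake_lm(prompt))
print(prompt)
print(result)  # {'french_sentence': 'Bonjour'}
```

The point of the sketch is the division of labor: the signature supplies the names and the instructions, while the module handles formatting, the LM call, and parsing.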
The core logic for defining signatures resides in:
- dspy/signatures/signature.py: Defines the base Signature class and the logic for handling instructions and fields.
- dspy/signatures/field.py: Defines InputField and OutputField.
Modules like dspy.Predict (in dspy/predict/predict.py) contain the code to read these Signatures and interact with LMs accordingly.
# Simplified view inside dspy/signatures/signature.py
import inspect

from pydantic import BaseModel
from pydantic.fields import FieldInfo
# ... other imports ...

class SignatureMeta(type(BaseModel)):
    # Metaclass magic to handle fields and docstring
    def __new__(mcs, name, bases, namespace, **kwargs):
        # ... logic to find fields, handle docstring ...
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)
        cls.__doc__ = cls.__doc__ or _default_instructions(cls)  # Default instructions if none provided
        # ... logic to validate fields ...
        return cls

    @property
    def instructions(cls) -> str:
        # Retrieves the docstring as instructions
        return inspect.cleandoc(getattr(cls, "__doc__", ""))

    @property
    def input_fields(cls) -> dict[str, FieldInfo]:
        # Finds fields marked as input
        return cls._get_fields_with_type("input")

    @property
    def output_fields(cls) -> dict[str, FieldInfo]:
        # Finds fields marked as output
        return cls._get_fields_with_type("output")

class Signature(BaseModel, metaclass=SignatureMeta):
    # The base class you inherit from
    pass
# Simplified view inside dspy/signatures/field.py
import pydantic
# ... helper definitions such as move_kwargs omitted ...

def InputField(**kwargs):
    # Creates a Pydantic field marked as input for DSPy
    return pydantic.Field(**move_kwargs(**kwargs, __dspy_field_type="input"))

def OutputField(**kwargs):
    # Creates a Pydantic field marked as output for DSPy
    return pydantic.Field(**move_kwargs(**kwargs, __dspy_field_type="output"))
The key takeaway is that the Signature class structure (using InputField, OutputField, and the docstring) provides a standardized way for modules to understand the task specification.
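To make the "metaclass magic" less mysterious, here is a toy version built on the standard library alone. It is a sketch under stated assumptions, not DSPy's implementation: ToyField, ToyInput, ToyOutput, ToySignatureMeta, and ToySignature are hypothetical names, there is no Pydantic involved, and fields inherited from parent classes are deliberately ignored to keep it short.

```python
# Toy illustration (not DSPy code): a metaclass that exposes the docstring as
# `instructions` and sorts marked class attributes into inputs and outputs.
import inspect


class ToyField:
    """Hypothetical field marker; DSPy's real fields are Pydantic fields."""

    def __init__(self, kind: str, desc: str = ""):
        self.kind = kind  # "input" or "output"
        self.desc = desc


def ToyInput(desc: str = "") -> ToyField:
    return ToyField("input", desc)


def ToyOutput(desc: str = "") -> ToyField:
    return ToyField("output", desc)


class ToySignatureMeta(type):
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        # Collect only the fields declared directly on this class.
        cls._fields = {k: v for k, v in namespace.items() if isinstance(v, ToyField)}
        return cls

    @property
    def instructions(cls) -> str:
        # The class docstring doubles as the task instructions.
        return inspect.cleandoc(cls.__doc__ or "")

    @property
    def input_fields(cls) -> dict:
        return {k: v for k, v in cls._fields.items() if v.kind == "input"}

    @property
    def output_fields(cls) -> dict:
        return {k: v for k, v in cls._fields.items() if v.kind == "output"}


class ToySignature(metaclass=ToySignatureMeta):
    pass


class TranslateToFrench(ToySignature):
    """Translates English text to French."""

    english_sentence = ToyInput(desc="The original sentence in English")
    french_sentence = ToyOutput(desc="The translated sentence in French")


print(TranslateToFrench.instructions)        # Translates English text to French.
print(list(TranslateToFrench.input_fields))  # ['english_sentence']
print(list(TranslateToFrench.output_fields)) # ['french_sentence']
```

Because the properties live on the metaclass, they are read from the class itself (TranslateToFrench.instructions) without creating an instance, which mirrors how the module is handed the class rather than an object in the Predict example.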
Conclusion
You’ve now learned about Signatures, the essential component for defining what a DSPy module should do!
- A Signature specifies the Inputs, Outputs, and Instructions for a task.
- It acts like a contract or recipe card for modules, especially those using LMs.
- You typically define them by subclassing dspy.Signature, using InputField, OutputField, and a descriptive docstring for instructions.
- Modules like dspy.Predict use Signatures to understand the task and generate appropriate prompts for the LM.
Signatures bring clarity and structure to LM interactions. But how do we provide concrete examples to help the LM learn or perform better? That’s where Examples come in!
Next: Chapter 3: Example
Generated by AI Codebase Knowledge Builder