chapter five

5 Optimizing prompt examples

This chapter covers

How DSPy optimizes programs
Optimizing the set of demonstrations included in a prompt
DSPy’s bootstrapping process
The LabeledFewShot, BootstrapFewShot, BootstrapFewShotWithRandomSearch, and KNN optimizers

As we’ve discussed throughout the book, optimizing prompts is one of the most important parts of prompt programming, so let’s now look at how this is done in DSPy. If you’re familiar with training machine learning models, DSPy optimization works in much the same way: it is a data-driven process. When training a machine learning model, we may evaluate many different ways to create a model and then select the one that works best, that is, the model that maximizes (or minimizes) the relevant metric on the validation set. This general approach has been a major influence on prompt programming and is the basis of DSPy’s optimization methods.

With prompt programming, we seek the prompt that maximizes the specified metric function on our validation data set. As with machine learning, this is an automated, data-driven approach: DSPy generates many candidate prompts and methodically tests each one by running it through the LM on the validation set. Because the space of possible prompts is effectively infinite, we can’t evaluate them all, so we generate a reasonable number of candidates and carefully evaluate each one. We’ll go over how this is done with DSPy in this chapter and the next.
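To make the idea concrete, here is a minimal sketch of the generate-and-evaluate loop in plain Python. It is not DSPy code and does not use DSPy’s API: the validation set, the candidate prompts, the `exact_match` metric, and the `fake_lm` function (a deterministic stand-in for a real LM call) are all hypothetical names invented for this illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical validation set: (question, expected answer) pairs.
VALIDATION_SET: List[Tuple[str, str]] = [
    ("2 + 2", "4"),
    ("3 * 3", "9"),
    ("10 - 7", "3"),
]

# Lookup table used by the stand-in LM below (an assumption for this sketch).
ANSWERS = {question: answer for question, answer in VALIDATION_SET}

# Hypothetical candidate prompts. A real optimizer would generate these.
CANDIDATE_PROMPTS: List[str] = [
    "Answer the question.",
    "Answer the question. Reply with only the number.",
    "Think carefully, then answer the question.",
]


def fake_lm(prompt: str, question: str) -> str:
    """Deterministic stand-in for an LM call, so the sketch runs offline.

    It answers tersely only when the prompt asks for 'only the number';
    otherwise it wraps the answer in a sentence.
    """
    answer = ANSWERS[question]
    if "only the number" in prompt.lower():
        return answer
    return f"The answer is {answer}."


def exact_match(expected: str, predicted: str) -> float:
    """Metric: 1.0 if the prediction equals the expected answer, else 0.0."""
    return 1.0 if expected.strip() == predicted.strip() else 0.0


def score_prompt(
    prompt: str,
    lm: Callable[[str, str], str],
    metric: Callable[[str, str], float],
    dataset: List[Tuple[str, str]],
) -> float:
    """Average the metric over every example in the validation set."""
    scores = [metric(expected, lm(prompt, question)) for question, expected in dataset]
    return sum(scores) / len(scores)


def select_best_prompt(
    candidates: List[str],
    lm: Callable[[str, str], str],
    metric: Callable[[str, str], float],
    dataset: List[Tuple[str, str]],
) -> Tuple[str, float]:
    """Score each candidate prompt and return the best one with its score."""
    scored = [(prompt, score_prompt(prompt, lm, metric, dataset)) for prompt in candidates]
    return max(scored, key=lambda pair: pair[1])


best_prompt, best_score = select_best_prompt(
    CANDIDATE_PROMPTS, fake_lm, exact_match, VALIDATION_SET
)
print(f"Best prompt: {best_prompt!r} (score {best_score:.2f})")
```

The structure is the point here, not the toy data: a finite list of candidates, one metric, one validation set, and a selection step that keeps whichever candidate scores highest. Swapping the stand-in for a real LM call would leave the loop unchanged.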

5.1 General approaches to optimization

5.2 The LabeledFewShot optimizer

5.2.1 Executing the LabeledFewShot optimizer

5.2.2 Executing the LabeledFewShot optimizer repeatedly in a loop

5.2.3 Working with ChainOfThought

5.3 BootstrapFewShot

5.3.1 Working with ChainOfThought

5.3.2 Specifying a teacher

5.4 BootstrapFewShotWithRandomSearch

5.5 KNN: Finding examples dynamically

5.6 Evaluation results

5.6.1 gpt-4o-mini

5.6.2 gpt-4.1-nano

5.7 Summary