9 Optimizing cost and quality
This chapter covers
- Model choice and tuning
- Prompt engineering
- Fine-tuning models
Analyzing data with large language models is a great way to burn money quickly. If you’ve been using GPT-4 (or a similarly large model) for a while, you’ve probably noticed how fast the fees pile up, forcing you to recharge your account regularly. But do we always need to use the largest (and most expensive) model? Can’t we make smaller models perform almost as well? How can we get the most bang for our buck?
This chapter is about saving money when using language models on large data sets. Fortunately, we have quite a few options for doing so. First, we have lots of choices when it comes to large language models. Selecting a model that is as small (or, rather, as cheap) as possible while still performing well on our analysis task can go a long way toward balancing our budget. Second, models typically expose various tuning parameters, which control everything from the overall text generation strategy to the way specific tokens are prioritized or penalized. Optimizing those settings can turn small models into GPT-4 alternatives for certain tasks. Third, we can use prompt engineering to tweak the way we ask the model our questions, sometimes leading to surprisingly different results! Finally, we can fine-tune a model on examples of our specific task, specializing a cheaper model for the job at hand.
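To preview what these levers look like in code, here is a minimal sketch using OpenAI's Python client. The model name, the prompt, and the token ID passed to logit_bias are illustrative placeholders, not recommendations; we'll look at how to choose them deliberately throughout this chapter.

```python
from openai import OpenAI

# Assumes the OPENAI_API_KEY environment variable is set.
client = OpenAI()

response = client.chat.completions.create(
    # Lever 1: model choice. Pick a smaller, cheaper model than GPT-4.
    model="gpt-3.5-turbo",
    # Lever 2: tuning parameters. Temperature 0 makes generation greedy
    # (deterministic), which usually suits analysis tasks.
    temperature=0,
    # logit_bias de-prioritizes specific tokens (values from -100 to 100).
    # The token ID "1843" is purely illustrative; actual IDs depend on the
    # model's tokenizer.
    logit_bias={"1843": -100},
    # Lever 3: prompt engineering, i.e., how we phrase the question.
    messages=[
        {
            "role": "user",
            "content": "Classify the sentiment of this review as positive "
                       "or negative: 'The product broke after one day.'",
        }
    ],
)
print(response.choices[0].message.content)
```

Setting the temperature to 0 trades output diversity for reproducibility, which is usually what we want when extracting structured answers from data.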