7 Fine-tuning to follow instructions

 

This chapter covers

  • The instruction fine-tuning process of LLMs
  • Preparing a dataset for supervised instruction fine-tuning
  • Organizing instruction data in training batches
  • Loading a pretrained LLM and fine-tuning it to follow human instructions
  • Extracting LLM-generated instruction responses for evaluation
  • Evaluating an instruction-fine-tuned LLM

Previously, we implemented the LLM architecture, carried out pretraining, and imported pretrained weights from external sources into our model. Then, we focused on fine-tuning our LLM for a specific classification task: distinguishing between spam and non-spam text messages. Now we’ll implement the process for fine-tuning an LLM to follow human instructions, as illustrated in figure 7.1. Instruction fine-tuning is one of the main techniques behind developing LLMs for chatbot applications, personal assistants, and other conversational tasks.

Figure 7.1 The three main stages of coding an LLM. This chapter focuses on step 9 of stage 3: fine-tuning a pretrained LLM to follow human instructions.

Figure 7.1 shows two main ways of fine-tuning an LLM: fine-tuning for classification (step 8) and fine-tuning an LLM to follow instructions (step 9). We implemented step 8 in chapter 6. Now we will fine-tune an LLM using an instruction dataset.

7.1 Introduction to instruction fine-tuning
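Instruction fine-tuning (also called supervised instruction fine-tuning) trains a pretrained LLM on pairs of human-written instructions and desired responses, teaching it to answer requests rather than merely continue text. As a concrete illustration, a single dataset entry in the widely used Alpaca-style format might look like the following sketch (the field names are a common convention, not a fixed standard, and the example entry itself is hypothetical):

    # A hypothetical instruction dataset entry. Alpaca-style datasets use
    # three fields; the optional "input" field may be an empty string.
    sample_entry = {
        "instruction": "Rewrite the sentence in passive voice.",
        "input": "The chef cooked the meal.",
        "output": "The meal was cooked by the chef.",
    }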

 
 
 
 

7.2 Preparing a dataset for supervised instruction fine-tuning
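One common way to prepare such entries for supervised fine-tuning is to render each one into a single prompt string that the model sees during training. The sketch below uses the Alpaca prompt template; the helper name format_input and the exact template wording are illustrative assumptions rather than verbatim code from this book:

    def format_input(entry):
        # Render an instruction entry into an Alpaca-style prompt.
        instruction_text = (
            "Below is an instruction that describes a task. "
            "Write a response that appropriately completes the request."
            f"\n\n### Instruction:\n{entry['instruction']}"
        )
        # The optional input field becomes an extra prompt section.
        input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
        return instruction_text + input_text

    # For training, the desired answer is appended under a response header
    # (shown here with the hypothetical sample_entry from section 7.1):
    full_text = format_input(sample_entry) + (
        f"\n\n### Response:\n{sample_entry['output']}"
    )

After formatting, the dataset is typically split into training, validation, and test portions, just as with the classification data in chapter 6.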

 
 
 

7.3 Organizing data into training batches
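Unlike the fixed-length classification batches of chapter 6, instruction examples vary widely in length, so batching usually calls for a custom collate function: pad each sequence to the longest one in the batch, build the targets by shifting the inputs one position to the right, and mask the extra padding with an ignore index so it does not contribute to the loss. A minimal sketch, assuming GPT-2's <|endoftext|> token (ID 50256) doubles as the padding token and using -100, the index that PyTorch's cross entropy skips by default:

    import torch

    def custom_collate_fn(batch, pad_token_id=50256, ignore_index=-100,
                          device="cpu"):
        # batch is a list of token-ID lists of varying lengths.
        batch_max_length = max(len(item) + 1 for item in batch)
        inputs_lst, targets_lst = [], []

        for item in batch:
            new_item = item + [pad_token_id]
            # Pad every sequence to the longest one in the batch.
            padded = new_item + [pad_token_id] * (batch_max_length - len(new_item))
            inputs = torch.tensor(padded[:-1])   # inputs: all but the last token
            targets = torch.tensor(padded[1:])   # targets: shifted right by one

            # Replace all but the first pad token in the targets with
            # ignore_index so padding doesn't contribute to the loss.
            mask = targets == pad_token_id
            indices = torch.nonzero(mask).squeeze()
            if indices.numel() > 1:
                targets[indices[1:]] = ignore_index

            inputs_lst.append(inputs)
            targets_lst.append(targets)

        return (torch.stack(inputs_lst).to(device),
                torch.stack(targets_lst).to(device))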

 
 

7.4 Creating data loaders for an instruction dataset
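With a collate function in hand, the formatted texts can be wrapped in a standard PyTorch Dataset and served through a DataLoader. A sketch, assuming the format_input helper and custom_collate_fn from the previous sections, the tiktoken GPT-2 tokenizer, and a list of entries named train_data (the class and variable names are illustrative):

    import tiktoken
    from functools import partial
    from torch.utils.data import Dataset, DataLoader

    class InstructionDataset(Dataset):
        def __init__(self, data, tokenizer):
            # Pre-tokenize the full prompt + response text for each entry.
            self.encoded_texts = [
                tokenizer.encode(
                    format_input(entry)
                    + f"\n\n### Response:\n{entry['output']}"
                )
                for entry in data
            ]

        def __getitem__(self, index):
            return self.encoded_texts[index]

        def __len__(self):
            return len(self.encoded_texts)

    tokenizer = tiktoken.get_encoding("gpt2")
    train_loader = DataLoader(
        InstructionDataset(train_data, tokenizer),
        batch_size=8,
        collate_fn=partial(custom_collate_fn, device="cpu"),
        shuffle=True,
        drop_last=True,
    )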

 
 
 

7.5 Loading a pretrained LLM
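The starting point for fine-tuning is a pretrained GPT-2 model with its weights loaded in, as done in earlier chapters. As a stand-in for the book's from-scratch loader, the sketch below pulls the 355M-parameter GPT-2 medium weights through the Hugging Face transformers library; this is a swapped-in convenience, not the loader developed in this book:

    # A stand-in using the Hugging Face transformers library, not the
    # from-scratch weight loader developed in earlier chapters.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    hf_tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
    model = GPT2LMHeadModel.from_pretrained("gpt2-medium")  # 355M parameters
    model.eval()

    # Baseline check: before instruction fine-tuning, the model tends to
    # continue the prompt text rather than follow the instruction.
    ids = hf_tokenizer(format_input(sample_entry), return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(ids, max_new_tokens=35)
    print(hf_tokenizer.decode(out[0]))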

 
 
 
 

7.6 Fine-tuning the LLM on instruction data
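The fine-tuning loop itself is ordinary next-token language-model training: compute the cross entropy between the model's logits and the shifted targets, skipping the positions masked with -100. A minimal sketch, assuming the train_loader and Hugging Face model from the previous sections; the learning rate, weight decay, and epoch count are illustrative values, not tuned recommendations:

    import torch

    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.1)
    num_epochs = 2

    model.train()
    for epoch in range(num_epochs):
        for input_batch, target_batch in train_loader:
            optimizer.zero_grad()
            logits = model(input_batch).logits  # (batch, seq_len, vocab_size)
            # Flatten so each token position becomes one classification
            # example; positions labeled -100 are skipped automatically.
            loss = torch.nn.functional.cross_entropy(
                logits.flatten(0, 1), target_batch.flatten()
            )
            loss.backward()
            optimizer.step()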

 
 
 

7.7 Extracting and saving responses
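Once training finishes, a common workflow is to generate a response for every test-set instruction and store it alongside the entry for later scoring. A sketch, again assuming the helpers above plus a held-out list named test_data; the model_response key and the output filename are arbitrary choices:

    import json
    import torch

    model.eval()
    for entry in test_data:
        ids = hf_tokenizer(format_input(entry), return_tensors="pt").input_ids
        with torch.no_grad():
            out = model.generate(ids, max_new_tokens=256)
        generated = hf_tokenizer.decode(out[0][ids.shape[1]:])
        # Keep only the text after the response header, if one is emitted.
        entry["model_response"] = generated.split("### Response:")[-1].strip()

    with open("instruction-data-with-response.json", "w") as f:
        json.dump(test_data, f, indent=4)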

 

7.8 Evaluating the fine-tuned LLM
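Scoring free-form text is harder than scoring class labels, so one practical approach is to ask a larger, already instruction-tuned model to grade each generated response against the reference answer. The sketch below assumes a locally running Ollama server (its REST API listens on localhost port 11434) serving a model such as llama3; the 0-to-100 scoring prompt is an illustrative convention, not a fixed metric:

    import json
    import urllib.request

    def query_model(prompt, model="llama3",
                    url="http://localhost:11434/api/chat"):
        # Send a single-turn, non-streaming chat request to local Ollama.
        payload = json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        }).encode("utf-8")
        request = urllib.request.Request(
            url, data=payload, method="POST",
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())["message"]["content"]

    scores = []
    for entry in test_data:
        prompt = (
            f"Given the input `{format_input(entry)}` "
            f"and the correct output `{entry['output']}`, "
            f"score the model response `{entry['model_response']}` "
            "on a scale from 0 to 100. Respond with the integer only."
        )
        reply = query_model(prompt)
        if reply.strip().isdigit():
            scores.append(int(reply.strip()))

    print(f"Average score: {sum(scores) / len(scores):.2f}")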

 
 

7.9 Conclusions

 
 

7.9.1 What’s next?

 
 
 

7.9.2 Staying up to date in a fast-moving field

 
 

7.9.3 Final words

 
 

Summary

 