5 Training large language models: How to generate the generator


This chapter covers

  • Setting up a training environment and common libraries
  • Applying basic and advanced training techniques
  • Tips and tricks to get the most out of training

Be water, my friend.
—Bruce Lee

Are you ready to have some fun?! What do you mean the last four chapters weren’t fun? Well, I promise this one will be. We’ve leveled up a lot and gained a ton of context that will prove invaluable now as we get our hands dirty. By training an LLM, we can create bots that can do amazing things and have unique personalities. Indeed, we can create new friends and play with them. In the last chapter, we showed you how to create a training dataset from your Slack messages. Now we will show you how to take that dataset and train a persona of yourself. Finally, you will no longer have to talk to that one annoying coworker, and, just like Gilfoyle, you can have your own AI Gilfoyle (https://youtu.be/IWIusSdn1e4).

5.1 Multi-GPU environments

5.1.1 Setting up

5.1.2 Libraries

5.2 Basic training techniques

5.2.1 From scratch

5.2.2 Transfer learning (finetuning)

5.2.3 Prompting

5.3 Advanced training techniques

5.3.1 Prompt tuning

5.3.2 Finetuning with knowledge distillation

5.3.3 Reinforcement learning from human feedback

5.3.4 Mixture of experts

5.3.5 LoRA and PEFT

5.4 Training tips and tricks

5.4.1 Training data size notes

5.4.2 Efficient training

5.4.3 Local minima traps

5.4.4 Hyperparameter tuning tips

5.4.5 A note on operating systems

5.4.6 Activation function advice

Summary