8 Training and evaluating large language models
This chapter covers
- Deep dive into hyperparameters
- Hyperparameter optimization with Ray
- Effective strategies for experiment tracking
- Parameter-efficient fine-tuning
- Various quantization techniques
Large language models have transformed how we approach tasks ranging from translation to content generation. However, their size brings unique challenges that require efficient strategies for training, tuning, and evaluation.
This chapter offers a practical overview of the most effective tools and techniques for improving the efficiency and manageability of large models throughout development and deployment. We begin by exploring hyperparameters and their impact on model performance, followed by optimization strategies such as pruning, distillation, quantization, and sharding.
To support large-scale experimentation, we turn to Ray and Weights & Biases, two tools widely adopted in modern machine learning workflows. Ray provides a scalable framework for distributed training and hyperparameter optimization, with native integration with major cloud providers such as AWS and GCP. Weights & Biases complements it with tools for experiment tracking, model monitoring, and result visualization. Used together, they enable more structured and efficient development cycles.
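To make this concrete before we dive into the details, here is a minimal sketch of how the two tools fit together: a Ray Tune sweep over a toy objective, with every trial logged to Weights & Biases. It assumes Ray 2.x (installed via `pip install "ray[tune]" wandb`) and a configured W&B API key; the search space, project name, and synthetic loss are placeholders for the fine-tuning runs we build up later in the chapter, and exact import paths may differ slightly between Ray versions.

```python
from ray import tune
from ray.air import RunConfig, session
from ray.air.integrations.wandb import WandbLoggerCallback


def train_fn(config):
    # Stand-in for a real fine-tuning loop: a synthetic "loss" that depends on
    # the sampled hyperparameters, so Tune has something to minimize.
    loss = (config["lr"] - 3e-5) ** 2 + 0.01 / config["batch_size"]
    session.report({"loss": loss})


tuner = tune.Tuner(
    train_fn,
    param_space={
        "lr": tune.loguniform(1e-6, 1e-3),       # learning rate, log scale
        "batch_size": tune.choice([8, 16, 32]),  # per-device batch size
    },
    tune_config=tune.TuneConfig(metric="loss", mode="min", num_samples=20),
    run_config=RunConfig(
        name="llm-hparam-sweep",
        callbacks=[WandbLoggerCallback(project="llm-finetuning")],
    ),
)

results = tuner.fit()
print("Best hyperparameters:", results.get_best_result().config)
```

Ray schedules and parallelizes the trials, while the W&B callback streams each trial's metrics and configuration to a shared project dashboard, so comparing runs requires no extra bookkeeping on our side.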