welcome

Thank you for purchasing the MEAP for Evaluation and Alignment: The Seminal Papers.

When I first led teams building AI applications, I kept running into the same questions from team members and stakeholders: How do we know whether an answer is correct? How do we measure hallucination? How do we ensure our AI system behaves the way we want? The answers were scattered across dense research papers written for academics, not practitioners.

This book makes that foundational knowledge accessible. Each chapter walks through a pivotal research paper, explaining its historical context, core innovation, and practical implications. You'll trace the evolution from early n-gram metrics such as BLEU through modern approaches such as LLM-as-a-Judge and Constitutional AI. More importantly, you'll understand why each technique was developed and where its limitations lie.

By the end, you'll be equipped to design your own evaluation metrics and alignment strategies, not just apply existing ones. To get the most from this book, you should have experience programming in Python. No prior exposure to NLP metrics, reinforcement learning, or alignment research is required.

Please post any questions or comments in the liveBook Discussion forum. Your feedback is essential in developing the best book possible.

— Han Lee