6 Evaluation and Metrics for Generative Models
This chapter covers
- Qualitative and quantitative evaluation methods, including visual inspection, user studies, and automated metrics like Inception Score and Fréchet Inception Distance
- Model-specific evaluation techniques for VAEs, GANs, and diffusion models, addressing their unique characteristics and potential failure modes
- Task-specific evaluation metrics and their application in real-world scenarios such as medical image synthesis and urban planning
- Challenges and limitations in current evaluation practices, including issues of bias, computational complexity, and the lack of ground truth in generative tasks
This chapter surveys evaluation techniques and metrics for generative AI models in computer vision. We outline the key approaches, from established methods to recent ones, and for each technique we discuss its strengths, limitations, and typical use cases. By the end of this chapter, you will have a clear overview of the evaluation landscape and be able to select appropriate metrics for generative models in both research and practical applications.