chapter five

Chapter 5. What is an ML pipeline, and how does it affect an AI project?

This chapter covers

Understanding an ML pipeline
Understanding why an ML pipeline ossifies and how to address that
Understanding the evolution of ML or AI algorithms in larger systems
Balancing attention between business questions, data, and AI algorithms

In previous chapters, you learned how to select your initial AI project and how to tie technology metrics to business results. Now it’s time to understand how to guide the development of the AI project’s software architecture. This chapter teaches you to recognize the ways in which an AI system behaves differently from other software systems. To implement an AI project effectively, you need to understand its technical artifacts and its life cycle. I’ll start by explaining the most important artifact of an AI project: the ML pipeline.

The ML pipeline describes how data flows through the system, which high-level transformations are applied to it, which ML and AI algorithms are used, and how the results are presented to the user of your AI system. If you don’t pay attention to the pipeline’s architecture, your project will be saddled with an ML pipeline that emerges from early proof-of-concept (POC) decisions. That’s a problem, because an ML pipeline quickly ossifies and becomes difficult and costly to change radically.
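To make the idea concrete, here is a minimal sketch of a pipeline as an ordered list of stages, each consuming the output of the previous one. The stage names (`collect`, `clean`, `predict`, `present`), the sample readings, and the averaging "model" are all hypothetical placeholders chosen for illustration; a real pipeline would have its own stages and a trained model in place of the average.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Stage:
    """One step in the pipeline: a name plus a function applied to the data."""
    name: str
    run: Callable[[Any], Any]


def run_pipeline(stages: List[Stage], data: Any) -> Any:
    """Pass the data through every stage, in order."""
    for stage in stages:
        data = stage.run(data)
    return data


# Hypothetical stages, assumed for illustration only.
def collect(_: Any) -> list:
    # Stand-in for reading from a database, sensor feed, or file.
    return [3.0, 4.0, None, 5.0]


def clean(readings: list) -> list:
    # High-level transformation: drop missing values.
    return [r for r in readings if r is not None]


def predict(readings: list) -> float:
    # Stand-in for an ML algorithm: here, just the average.
    return sum(readings) / len(readings)


def present(prediction: float) -> str:
    # How the result reaches the user.
    return f"Predicted value: {prediction:.1f}"


PIPELINE = [
    Stage("collect", collect),
    Stage("clean", clean),
    Stage("predict", predict),
    Stage("present", present),
]

result = run_pipeline(PIPELINE, None)
print(result)
```

Note that each stage depends on the shape of the previous stage's output. In this toy example, swapping out `predict` is easy, but changing what `collect` returns would ripple through every later stage, which hints at why a pipeline's structure becomes hard to change once more code is built around it.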

5.1. How is an AI project different?

5.2. Why we need to analyze the ML pipeline

5.3. What’s the role of AI methods?

5.4. Balancing data, AI methods, and infrastructure

5.5. Exercises

Summary