This chapter covers
- Building and launching images for Kubeflow notebooks
- Using Kubeflow notebooks for data analysis
- Passing data in Kubeflow Pipelines
- Writing Kubeflow components that pass data
- Developing the data preparation pipeline for object detection
The landscape of machine learning (ML) is ever-evolving, with new developments surfacing every other week. During the era when deep learning first rose to prominence, innovations such as new versions of You Only Look Once (YOLO) and ResNet became the talk of the town. Nowadays (at least at the time we wrote this), large language models (LLMs) and visual language models (VLMs) have taken center stage for their performance and wide range of applications.
While new architectures and techniques constantly capture the limelight, the success of these techniques often lies with arguably the least sexy but most important part of ML: data preparation. “Garbage in, garbage out” isn’t just a line that grumpy ML engineers mutter. Rather, it captures the fundamental truth that the quality and integrity of your input data ultimately shape the reliability and efficacy of your ML model and its results.