3 Introducing Metaflow
This chapter covers
- Defining a workflow in Metaflow that accepts input data and produces useful outputs.
- Optimizing performance of workflows with parallel computation on a single instance.
- Analyzing results of workflows in notebooks.
- Developing a simple end-to-end application in Metaflow.
You are probably eager to roll up your sleeves and start hacking on some actual data science code now that we have a productivity-boosting development environment set up. In this chapter, you will learn the basics of developing data science applications using Metaflow, a framework that shows how the different layers of the infrastructure stack can work together seamlessly.
The development environment, which we discussed in the previous chapter, determines how the data scientist develops applications: by writing code in an editor, evaluating it in a terminal, and analyzing results in a notebook. On top of this toolchain, the data scientist uses Metaflow to determine what code gets written and why, which is the topic of this chapter. The following chapters then cover the infrastructure that determines where and when the workflows get executed.