Chapter 2. The data science process

This chapter covers

Understanding the flow of a data science process
Discussing the steps in a data science process

The goal of this chapter is to give an overview of the data science process without diving into big data yet. You’ll learn how to work with big data sets, streaming data, and text data in subsequent chapters.

2.1. Overview of the data science process

Following a structured approach to data science helps you maximize your chances of success in a data science project at the lowest cost. It also makes it possible to take up a project as a team, with each team member focusing on what they do best. Take care, however: this approach may not suit every type of project, nor is it necessarily the only way to do good data science.

The typical data science process consists of six steps through which you’ll iterate, as shown in figure 2.1.

Figure 2.1. The six steps of the data science process

Figure 2.1 summarizes the data science process and shows the main steps and actions you’ll take during a project. The following list is a short introduction; each of the steps will be discussed in greater depth throughout this chapter.

1. The first step of this process is setting a research goal. The main purpose here is making sure all the stakeholders understand the what, how, and why of the project. In every serious project this will result in a project charter.
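To make the idea of iterating through six steps concrete, here is a minimal Python sketch. It is an illustration only, not part of the process itself: the step names are paraphrased from this chapter's section titles, and the `Step` enum, the `next_step` function, and its `redo_from` parameter are hypothetical names introduced for this example.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """The six steps, paraphrased from this chapter's section titles."""
    SET_RESEARCH_GOAL = 1
    RETRIEVE_DATA = 2
    PREPARE_DATA = 3       # cleansing, integrating, and transforming data
    EXPLORE_DATA = 4       # exploratory data analysis
    BUILD_MODELS = 5
    PRESENT_FINDINGS = 6   # presenting findings, building applications


def next_step(current: Step, redo_from: Optional[Step] = None) -> Optional[Step]:
    """Return the step to run after `current`, or None when the process ends.

    `redo_from` is a hypothetical signal for iteration: an earlier (or the
    same) step to return to, for example when exploration shows that more
    data must be retrieved.
    """
    if redo_from is not None:
        if redo_from > current:
            raise ValueError("can only go back to an earlier or the same step")
        return redo_from
    if current is Step.PRESENT_FINDINGS:
        return None  # last step completed
    return Step(current + 1)


if __name__ == "__main__":
    # Walk forward, but jump back once from exploration to data retrieval.
    step: Optional[Step] = Step.SET_RESEARCH_GOAL
    went_back = False
    while step is not None:
        print(step.name)
        if step is Step.EXPLORE_DATA and not went_back:
            went_back = True
            step = next_step(step, redo_from=Step.RETRIEVE_DATA)
        else:
            step = next_step(step)
```

The sketch treats the process as a forward sequence with optional jumps backward, which is one simple way to read "steps through which you'll iterate"; a real project would decide when to go back based on what each step uncovers.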

