Introducing Data Science: Big data, machine learning, and more, using Python tools
About this Book


I can only show you the door. You’re the one that has to walk through it.

Morpheus, The Matrix

Welcome to the book! When reading the table of contents, you probably noticed the diversity of the topics we’re about to cover. The goal of Introducing Data Science is to provide you with a little bit of everything, enough to get you started. Data science is a very wide field, so wide that a book ten times the size of this one couldn’t cover it all. For each chapter, we picked a different aspect we find interesting. Some hard decisions had to be made to keep this book from collapsing your bookshelf!

We hope it serves as an entry point—your doorway into the exciting world of data science.


Chapters 1 and 2 offer the general theoretical background and framework necessary to understand the rest of this book:

  • Chapter 1 is an introduction to data science and big data, ending with a practical example of Hadoop.
  • Chapter 2 is all about the data science process, covering the steps present in almost every data science project.

In chapters 3 through 5, we apply machine learning to increasingly large data sets:

