Preface

In 2012, an article in the Harvard Business Review named the role of data scientist “the sexiest job of the 21st century.” With 87 years left in the century, it’s fair to say they might yet change their minds. Nevertheless, at the moment, data scientists are getting a lot of attention, and as a result, books about data science are proliferating. There would be no sense in adding another book to the pile if it merely repeated or repackaged text that is easily found elsewhere. But while surveying new data science literature, I found that most authors would rather explain how to use all the latest tools and technologies than discuss the nuanced problem-solving nature of the data science process. Armed with several books and the latest knowledge of algorithms and data stores, many aspiring data scientists were still asking the question: Where do I start?

And so, here is another book on data science. This one, however, attempts to lead you through the data science process as a path with many forks and potentially unknown destinations. The book warns you of what may be ahead, tells you how to prepare for it, and suggests how to react to surprises. It discusses what tools might be the most useful, and why, but the main objective is always to navigate the path—the data science process—intelligently, efficiently, and successfully, to arrive at practical solutions to real-life data-centric problems.