This chapter covers
- Non-statistical software that can help you do statistics more efficiently
- Some popular, widely used concepts related to analytic software
- Basic guidelines for using supplementary software
Figure 9.1 shows where we are in the data science process: optimizing a product with supplementary software. The software tools covered in chapter 8 can be very versatile, but there I focused mainly on the statistical capabilities of each. Software can do much more than statistics. In particular, many tools are designed to store, manage, and move data efficiently, and some can make almost every aspect of calculation and analysis faster and easier to manage. In this chapter I’ll introduce some of the most popular and most beneficial software for making your life and work as a data scientist easier.
Figure 9.1. An important aspect of the build phase of the data science process: using supplementary software to optimize the product

I discussed the concept of a database in chapter 3 as one form of data source. Databases are common, and your chances of running across one during a project are fairly high, particularly if you’ll be using data that others use often. But rather than merely running into one as a matter of course, you might find it worthwhile to set up a database yourself to aid you in your project.