10 Testing in Python

 

This chapter covers

  • Importance of testing in data science
  • Automated testing
  • PyTest testing framework

In Chapter 9 we covered two key methods for productionalizing your code – creating an offline pipeline, and wrapping code inside of an Application Programming Interface (API). Another key aspect of putting code into production is to create tests to make sure that your code works properly as intended. For example, suppose you train a model locally on your machine, but when the model is in production, it could be living on a remote server. It is important to perform checks to make sure the model outputs the same predictions for the same inputs regardless of where the model file itself lives. Figure 10.1 shows an example of this task with an emphasis on two main points:

  • When building a model or prototype pipeline, your code may be located on a development server or a local environment
  • Putting your code in production often means making sure your code runs successfully in a production environment on a different server than the one you use for development purposes. To ensure your code continues to run successfully when moved from development to production, it is critical to execute tests to check for potential errors or points of failure

10.1 Overview of testing for data science

10.1.1 Common types of tests

10.2 The unittest library

10.3 PyTest

10.3.1 Running subsets of tests using PyTest

10.4 Test coverage

10.5 Practice on your own

10.6 Summary