8 Getting started with deep learning with tabular data

This chapter covers

  • An introduction to deep learning with tabular data stacks—low-level frameworks and high-level APIs for deep learning
  • The PyTorch with fastai stack
  • The PyTorch with TabNet stack
  • The PyTorch with Lightning Flash stack
  • The stacks we didn’t exercise and why we didn’t exercise them
  • A comparison of the pros and cons of deep learning with tabular data stacks

Up to this point, we have focused on classical machine learning tools and algorithms to analyze tabular data. Ranging from traditional regression algorithms to more sophisticated gradient boosting techniques, these approaches offer advantages in simplicity, transparency, and efficacy. That said, deep learning tools have become much easier to access and use, and they now offer a powerful alternative for handling tabular data.

In this chapter, we will review a set of deep learning stacks (low-level framework, high-level API, and deep learning for tabular data library) and use three of these stacks (PyTorch with fastai, PyTorch with TabNet, and PyTorch with Lightning Flash) to solve the Airbnb NYC problem. We’ll work the same problem three times, once with each stack. The goal is both to illustrate the general form of the deep learning approach and to highlight the unique characteristics of the three tools we’ve selected.

8.1 The deep learning with tabular data stack

8.2 PyTorch with fastai

8.2.1 Reviewing the key code aspects of the fastai solution

8.2.2 Comparing the fastai solution with the Keras solution

8.3 PyTorch with TabNet

8.3.1 Key code aspects of the TabNet solution

8.3.2 Comparing the TabNet solution with the Keras solution

8.4 PyTorch with Lightning Flash

8.4.1 The key code aspects of the Lightning Flash solution

8.4.2 Comparing the Lightning Flash solution with the Keras solution

8.5 Overall comparison of the stacks

8.6 The stacks we didn’t explore

Summary