chapter four

4 Real-World Data Representation Using Tensors

 

This chapter covers:

  • Representing different types of real-world data as PyTorch tensors
  • Working with a range of data types, including spreadsheet, time series, text, image, and medical imaging data
  • Loading data from a file
  • Converting data to tensors
  • Shaping tensors so they can be used as inputs for neural network models

In the previous chapter, you learned that tensors are the building blocks for data in PyTorch. Neural networks take tensors as inputs and produce tensors as outputs. In fact, all operations within a neural network and during optimization are operations between tensors, and all parameters (e.g., weights and biases) in a neural network are tensors. Having a good sense of how to perform operations on tensors and index them effectively is central to using tools like PyTorch successfully. Now that you know the basics of tensors, your dexterity with them will grow as you make your way through the book.
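To make the point about inputs, outputs, and parameters concrete, here is a minimal sketch. It assumes a single `nn.Linear` layer stands in for a neural network; the layer sizes and the random input batch are illustrative choices, not taken from the chapter.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random values reproducible

# Assumption for illustration: a tiny "network" made of one linear layer
# mapping 3 input features to 2 outputs.
model = nn.Linear(in_features=3, out_features=2)

# A batch of 4 samples with 3 features each (random, hypothetical values).
inputs = torch.randn(4, 3)

# The forward pass takes a tensor and returns a tensor.
outputs = model(inputs)
print(type(inputs).__name__, type(outputs).__name__)
print(tuple(outputs.shape))

# The weights and biases are tensors as well (nn.Parameter subclasses torch.Tensor).
for name, param in model.named_parameters():
    print(name, isinstance(param, torch.Tensor), tuple(param.shape))
```

The same pattern holds for larger models: whatever the architecture, data enters as a tensor, leaves as a tensor, and every learnable quantity in between is stored as a tensor.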

There’s a question that we can already address at this point: how do we take a piece of data, such as a video or a passage of text, and represent it as a tensor in a way that is appropriate for training a deep learning model?
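As a first taste of what an answer can look like, the sketch below turns two small, made-up pieces of data into tensors: a table of numbers and a tiny RGB image. All values, sizes, and variable names are hypothetical and chosen only for illustration.

```python
import torch

# Hypothetical table: 3 rows (samples) by 2 columns (features).
rows = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]]
table = torch.tensor(rows, dtype=torch.float32)
print(tuple(table.shape))

# Hypothetical 2x2 RGB image stored as height x width x channels,
# with 8-bit integer intensities in the range 0 to 255.
pixels = [[[255, 0, 0], [0, 255, 0]],
          [[0, 0, 255], [255, 255, 255]]]
image_hwc = torch.tensor(pixels, dtype=torch.uint8)

# PyTorch convolution layers expect channels first (C x H x W) and
# floating-point values, so reorder the dimensions and rescale to [0, 1].
image_chw = image_hwc.permute(2, 0, 1).float() / 255.0

# Add a leading batch dimension: N x C x H x W.
batch = image_chw.unsqueeze(0)
print(tuple(batch.shape))
```

The two steps shown here, converting raw values to a tensor of a suitable numeric type and then arranging its dimensions the way a model expects, recur for every kind of data.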

4.1  Images

4.2  Volumetric Data

4.3  Tabular Data

4.4  Time Series

4.5  Text

4.5.1  Text embeddings

4.5.2  Text embeddings as a blueprint

4.6  Conclusion

4.7  Exercises

4.8  Summary