This chapter covers
- Launching and using the pyspark shell for interactive development
- Reading and ingesting data into a data frame
- Exploring data using the DataFrame structure
- Selecting columns using the select() method
- Reshaping single-nested data into distinct records using explode()
- Applying simple functions to your columns to modify the data they contain
- Filtering rows using the where() method