7 Categorical data: Advanced methods

This chapter covers

  • Combining continuous and categorical data in an analysis
  • Converting continuous data to categorical when appropriate
  • Analyzing categorical data with advanced methods, such as statistical tests

In this chapter, we continue exploring the value of categorical data. In chapter 6, we worked with survey data, which was mostly categorical, and answered some of our stakeholders’ questions using methods suited to categorical data. As a reminder, the data and example solution files for this project are available at https://davidasboth.com/book-code.

This chapter dives into more advanced methods: performing statistical tests with categorical data and combining continuous and categorical data. We will first recap the project brief from the previous chapter and summarize the work done so far before continuing with the analysis.
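To make the two techniques named above concrete before we return to the project, here is a minimal sketch using only the Python standard library. It bins a continuous value (years of coding experience) into categories, then computes a Pearson chi-square statistic for the resulting table of counts. Everything in it is an assumption for illustration: the bin edges, the labels, the function names `bin_years` and `chi_square`, and the handful of made-up respondents. None of it comes from the survey data or from the chapter's solution files.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges (in years) and the labels for the resulting categories.
EDGES = [5, 10]
LABELS = ["0-4", "5-9", "10+"]


def bin_years(years):
    """Map a continuous number of years to a category label."""
    return LABELS[bisect_right(EDGES, years)]


def chi_square(observed):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count we would expect if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof


# Made-up respondents: (years of experience, uses AI tools?)
respondents = [
    (1, "yes"), (3, "yes"), (4, "no"),
    (6, "yes"), (8, "no"), (9, "no"),
    (12, "no"), (15, "yes"), (20, "no"),
]

# Cross-tabulate the binned experience against the yes/no answer.
counts = Counter((bin_years(years), answer) for years, answer in respondents)
table = [[counts[(label, answer)] for answer in ("yes", "no")] for label in LABELS]

stat, dof = chi_square(table)
print(table, stat, dof)
```

With so few rows, the expected counts are far too small for the test to be trustworthy; the sketch only shows the mechanics. In practice, a library routine that also reports a p-value would usually replace the hand-written `chi_square`.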

7.1 Project 5 revisited: Analyzing survey data to determine developer attitudes to AI tools

To recap, we are analyzing the Stack Overflow Developer Survey to determine how coders are using AI tools. Our stakeholders are interested in testing two of their hypotheses:

  • New and experienced coders are using these tools differently.
  • People’s opinions on the usefulness and trustworthiness of current AI tools depend on their experience, job role, and what specifically they use the tools for.

7.1.1 Data dictionary

7.1.2 Desired outcomes

7.1.3 Summary of the project so far

7.2 Using advanced methods to answer further questions about categorical data

7.2.1 Binning continuous values to discrete categories

7.2.2 Using statistical tests for categorical data

7.2.3 Answering a new question from start to finish

7.2.4 Project results

7.3 Closing thoughts on categorical data

7.3.1 Skills for working with categorical data for any project

Summary