7 Working with the people annotating your data

 

This chapter covers

  • Understanding in-house, contracted, and pay-per-task annotation workforces
  • Motivating different workforces using three key principles
  • Evaluating workforces when compensation is nonmonetary
  • Evaluating your annotation volume requirements
  • Understanding the training and/or expertise that annotators need for a given task

In the first two parts of this book, you learned how to select the right data for human review. The chapters in this part cover how to optimize that human interaction, starting with how to find and manage the right people to provide human feedback. Machine learning models often require thousands (and sometimes millions) of instances of human feedback to produce the training data they need to be accurate.

The type of workforce you need depends on your task, scale, and urgency. If you have a simple task, such as identifying whether a social media post expresses positive or negative sentiment, and you need millions of human annotations as soon as possible, your ideal workforce doesn't need specialized skills, but it should be able to scale to thousands of people working in parallel, each employed for only a short amount of time. A back-of-envelope throughput estimate, sketched below, shows why.
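The following sketch is only illustrative: the function name and all numbers (seconds per label, hours per annotator per day, workforce sizes) are hypothetical assumptions, not figures from this chapter. It shows the arithmetic behind the scale argument: a task needing millions of quick, unskilled annotations favors a large pay-per-task workforce over a small in-house team.

```python
def annotation_timeline(total_annotations, seconds_per_annotation,
                        num_annotators, hours_per_annotator_per_day=2):
    """Rough estimate of the working days needed to label a dataset.

    All parameters are hypothetical planning inputs, not measured values.
    """
    annotations_per_day = (num_annotators
                           * hours_per_annotator_per_day
                           * 3600 / seconds_per_annotation)
    return total_annotations / annotations_per_day

# 1,000,000 sentiment labels at ~10 seconds each:
print(annotation_timeline(1_000_000, 10, 5))      # ~278 days for 5 in-house annotators
print(annotation_timeline(1_000_000, 10, 2_000))  # <1 day for 2,000 pay-per-task workers
```

The point of the calculation is not the exact numbers but the shape of the trade-off: when per-item work is short and unskilled, wall-clock time is dominated by how many people you can put on the task in parallel.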

7.1 Introduction to annotation

7.1.1 Three principles of good data annotation

7.1.2 Annotating data and reviewing model predictions

7.1.3 Annotations from machine learning-assisted humans

7.2 In-house experts

7.2.1 Salary for in-house workers

7.2.2 Security for in-house workers

7.2.3 Ownership for in-house workers

7.2.4 Tip: Always run in-house annotation sessions

7.3 Outsourced workers

7.4 Crowdsourced workers

7.5 Other workforces