7 Working with the People Annotating Your Data


This chapter covers

  • Understanding the characteristics of in-house, contracted, and pay-per-task annotation workforces.
  • Motivating different workforces using three key principles.
  • Evaluating workforces when compensation is non-monetary.
  • Evaluating your annotation volume requirements.
  • Understanding the training and/or expertise that annotators need for a given task.

In the previous chapters of the book, you learned how to select the right data for human review. The following chapters cover how to optimize that human interaction. Machine Learning models often require thousands (and sometimes millions) of instances of human feedback to obtain the training data they need to be accurate.

The type of workforce you need depends on your task, scale, and urgency. If you have a simple task, such as identifying whether a social media post has positive or negative sentiment, and you need millions of human annotations as soon as possible, then your ideal workforce doesn't need specialized skills. Ideally, though, that workforce can scale to thousands of people working in parallel, with each person employed for a short amount of time.
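To make the scale concrete, here is a minimal back-of-the-envelope sketch for estimating annotation volume requirements. All of the numbers (seconds per label, judgments per item, stint length) are hypothetical assumptions for illustration, not figures from this chapter:

```python
import math

def annotator_hours(num_items: int, labels_per_item: int,
                    seconds_per_label: float) -> float:
    """Total hours of annotation work for the whole job."""
    return num_items * labels_per_item * seconds_per_label / 3600

def workers_needed(total_hours: float, hours_per_worker: float) -> int:
    """Parallel workers needed if each contributes only a short stint."""
    return math.ceil(total_hours / hours_per_worker)

# Hypothetical example: 1,000,000 posts, 3 judgments each (for quality
# control), 10 seconds per judgment, and workers doing 2-hour stints.
total = annotator_hours(1_000_000, 3, 10)   # ≈ 8,333 hours of work
print(workers_needed(total, 2))             # → 4167 workers in parallel
```

Even with a trivial 10-second task, finishing a million items quickly requires thousands of people working in parallel, which is why the choice of workforce matters as much as the choice of task design.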

7.1    Introduction to annotation

7.1.1   Three principles for good data annotation

7.1.2   Annotating data and reviewing model predictions

7.1.3   Annotations from Machine Learning assisted humans

7.2    In-house experts 

7.2.1   Salary for in-house workers

7.2.2   Security for in-house workers

7.2.3   Ownership for in-house workers

7.2.4   Tip: Always run in-house annotation sessions

7.3    Outsourced workers

7.3.1   Salary for outsourced workers

7.3.2   Security for outsourced workers

7.3.3   Ownership for outsourced workers

7.3.4   Tip: Talk to your outsourced workers

7.4    Crowdsourced workers

7.4.1   Salary for crowdsourced workers

7.4.2   Security for crowdsourced workers

7.4.3   Ownership for crowdsourced workers

7.5    Other workforces