10 Annotation Quality for Different Machine Learning Tasks
This chapter covers:
- Adapting annotation quality control methods from labeling to continuous tasks
- Managing annotation quality for computer vision tasks like object detection and semantic segmentation
- Managing annotation quality for natural language processing tasks like sequence labeling and text generation
- Understanding annotation quality for other machine learning tasks in speech, video, and information retrieval
Most machine learning tasks are more complicated than labeling an entire image or document. Imagine that you need to generate subtitles for movies in a creative way. Creating transcriptions of spoken and signed language is a language generation task. If you want to emphasize angry language with bold text, that is an additional sequence labeling task. Imagine also that you want to display the transcriptions like the “speech bubbles” of text found in comics. You could use object detection to make sure that each speech bubble comes from the right person, and semantic segmentation to ensure that the speech bubble is placed over the background of the scene instead of over people or important objects. You might also want to predict how a given person would rate the film as part of a recommendation system, or feed the content into a search engine that can find matches for abstract phrases like “motivational speeches”.