Chapter 1. Getting started taming text
Figure 1.1. A simple workflow for answering questions posed to a QA system
Chapter 2. Foundations of taming text
Figure 2.1. Sample parsing of a sentence using the OpenNLP parser
Chapter 3. Searching
Figure 3.1. Snippet of search results for the query “LCD TV” from Amazon.com. Image captured 9/2/2012.
Figure 3.2. Facets for the search term “LCD TV” after choosing the Electronics facet. Image captured 9/2/2012.
Figure 3.3. Search results for “LCD TV” narrowed down by several facets. Image captured 9/2/2012.
Figure 3.4. The inverted index data structure maps terms to the documents they occur in, enabling fast lookup of query terms in a search engine. The left side represents a sampling of the vocabulary in the documents and the right side represents the documents. The inverted index tracks where terms occur in documents.
Figure 3.5. http://search.yahoo.com presents a simple user interface to search users.
Figure 3.6. http://search.yahoo.com has an advanced search entry screen (under the More link) to allow users to fine-tune their results.
Figure 3.7. Google Canada’s Advanced Search UI automatically builds complex phrase and Boolean queries without requiring the user to know reserved words like AND, OR, and NOT or how to quote phrases.
Figure 3.8. An example of the vector space model for a document containing two words: hockey and cycling
Figure 3.9. Two documents represented as vectors in a 10-dimensional vector space