List of Listings

Chapter 2. Searching

Listing 2.1. Reading, indexing, and searching the default list of web pages

Figure 2.1. An example of retrieving, parsing, analyzing, indexing, and searching a set of web pages with a few lines of code

Listing 2.2. The LuceneIndexBuilder creates a Lucene index

Listing 2.3. MySearcher: retrieving search results based on Lucene indexing

Listing 2.4. Reading, indexing, and searching web pages that contain spam

Figure 2.4. A single deceptive web page significantly altered the ranking of the results for the query “Armstrong.”

Listing 2.5. Calculating the PageRank vector

Figure 2.6. The calculation of the PageRank vector for the small network of the business news web pages

Listing 2.6. Evaluating the matrix H based on the links between web pages

Listing 2.7. Applying the power method for the calculation of PageRank

Listing 2.8. Evaluation of the error between two consecutive PageRank vectors

Listing 2.9. Combining the Lucene and PageRank scores for ranking web pages

Figure 2.7. Combining the Lucene scores and the PageRank scores allows you to eliminate spam.

Listing 2.10. Combining the Lucene scores and the PageRank scores

Listing 2.11. Accounting for user clicks in the search results

Listing 2.12. Evaluating the relevance of a URL with the NaiveBayes classifier

Listing 2.13. Lucene indexing, PageRank values, and user click probabilities

Figure 2.8. Combining Lucene, PageRank, and user clicks to produce high-relevance search results for dmitry.
