Chapter 8. Essential Lucene extensions

This chapter covers

  • Highlighting hits in your search results
  • Correcting the spelling of search text
  • Viewing index details using Luke
  • Using additional query, analyzer, and filter implementations

You’ve built an index, but can you browse or query it without writing code? Absolutely! In this chapter, we’ll show you Luke, a useful tool that does just that. Do you need analysis beyond what the built-in analyzers provide? Several specialized analyzers for many languages are available in Lucene’s contrib modules. How about providing term highlighting in search results? We’ve got two choices for that! We’ll also show you how to offer suggestions for misspelled words.

This chapter examines the essential, most commonly used Lucene extensions, most of which are housed in the contrib subdirectory within Lucene’s source code. Deliberate care was taken with the design of Lucene to keep the core source code cohesive yet extensible. We’re taking the same care in this book by keeping an intentional separation between what’s in the core of Lucene and the extension packages that have been developed to augment it.

There are so many interesting packages that we’ve divided our coverage into two chapters. In this chapter we’ll cover the more frequently used packages, and in the next chapter we’ll describe the long tail of less popular, yet still interesting, packages. The benchmark module is so useful that we dedicate a separate appendix (appendix C) to it.

8.1. Luke, the Lucene Index Toolbox

8.2. Analyzers, tokenizers, and TokenFilters

8.3. Highlighting query terms

8.4. FastVectorHighlighter

8.5. Spell checking

8.6. Fun and interesting Query extensions

8.7. Building contrib modules

8.8. Summary