Chapter 9. Further Lucene extensions

 

This chapter covers

  • Searching indexes remotely using RMI
  • Chaining multiple filters into one
  • Storing an index in Berkeley DB
  • Sorting and filtering according to geographic distance

In the previous chapter we explored a number of commonly used extensions to Lucene. In this chapter we’ll round out that coverage by detailing some of the less popular yet still interesting and useful extensions.

ChainedFilter lets you logically chain multiple filters together into one Filter. The Berkeley DB package enables storing a Lucene index within a Berkeley DB database. There are two options for storing an index entirely in memory, both of which provide far faster search performance than RAMDirectory. We’ll show three alternative QueryParser implementations: one based on XML, another designed to produce SpanQuery instances (something the core QueryParser can’t do), and a new, highly modular query parser. Spatial Lucene enables sorting and filtering based on geographic distance. You can perform remote searching (over RMI) using the contrib/remote module.

This chapter completes our coverage of Lucene’s contrib modules, but remember that Lucene’s sources move fast, so new packages are likely to be available by the time you read this. If in doubt, check Lucene’s source code repository for the full listing of what’s available.

Let’s begin with chaining filters.

9.1. Chaining filters

9.2. Storing an index in Berkeley DB

9.3. Synonyms from WordNet

9.4. Fast memory-based indices

9.5. XML QueryParser: Beyond “one box” search interfaces

9.6. Surround query language

9.7. Spatial Lucene

9.8. Searching multiple indexes remotely

9.9. Flexible QueryParser

9.10. Odds and ends

9.11. Summary