Chapter 7. Writing a Lucene query

 

This chapter covers

  • Parsing and the QueryParser syntax
  • The QueryParser and user-friendly query entry
  • Tokenization and analyzers
  • Lucene’s base Query classes

You’ve purchased something online. Maybe it’s a book, or clothing from a department store with an online presence. Your order is a week overdue, so you go back to the website to check its status, but you’ve misplaced the order number. You call the contact number and are told, “I’m sorry, I can only look up your order if you can give me your Order ID.” Uh-oh. Does this sound familiar?

These all-or-nothing database-style searches are quickly being overtaken by the search techniques we discuss in this book. These far more flexible methods can, for example, find a document whose title contains Wright Brothers and whose body contains bicycle. Just about any way you can think of to search for something can be expressed as a query.
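As a rough illustration, the Wright Brothers example could be written in the classic Lucene QueryParser syntax as shown here. The field names title and body are assumptions taken from the wording of the example, not from any real index:

```
title:"Wright Brothers" AND body:bicycle
```

The double quotes ask for the two words as a phrase, and each field: prefix restricts its clause to that field. How the terms are finally matched still depends on the analyzer in use.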

Information indexing is a standard, rigid process, but querying that gathered information can be performed in myriad ways. The querying process and the building of those queries are the subject of this chapter.

We’ll begin by studying the QueryParser: how it parses expressions, how it allows for user-friendly queries, and what syntax it generates from those queries. Understanding this syntax is important when you run into problems. What could possibly be wrong if you query for a person’s last name, such as Smith-Jones, and obtain no results when you’re positive the name exists in the index?
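One plausible cause of the Smith-Jones puzzle is a mismatch between how the text was tokenized at indexing time and how the query is interpreted. The following is a minimal plain-Java sketch of that idea, not Lucene code: the class TokenSketch and its methods are hypothetical, and the split-on-punctuation-and-lowercase rule is an assumed stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TokenSketch {

    // Assumed, simplified analysis step: split on anything that is not a
    // letter or digit, then lowercase. Real analyzers differ in detail.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Looks up the raw, unanalyzed string among the indexed tokens.
    static boolean matchesRawTerm(List<String> indexedTokens, String rawTerm) {
        return indexedTokens.contains(rawTerm);
    }

    // Analyzes the query text the same way as the indexed text and
    // requires every resulting token to be present.
    static boolean matchesAnalyzed(List<String> indexedTokens, String queryText) {
        return indexedTokens.containsAll(analyze(queryText));
    }

    public static void main(String[] args) {
        List<String> indexed = analyze("Smith-Jones");
        System.out.println(indexed);
        System.out.println(matchesRawTerm(indexed, "Smith-Jones"));
        System.out.println(matchesAnalyzed(indexed, "Smith-Jones"));
    }
}
```

In this sketch the index holds the tokens smith and jones, so a lookup for the literal string Smith-Jones finds nothing, while a query analyzed in the same way as the indexed text matches. Whether this is what happens in a given application depends on the analyzer and on how the query is built.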

7.1. Understanding Lucene’s query syntax

7.2. Tokenization and fields

7.3. Building custom queries programmatically

7.4. Summary
