Chapter 1. Meet Lucene

 

This chapter covers

  • Learning about Lucene
  • Understanding the typical search application architecture
  • Using the basic indexing API
  • Working with the search API

Lucene is a powerful Java search library that lets you easily add search to any application. In recent years Lucene has become exceptionally popular and is now the most widely used information retrieval library: it powers the search features behind many websites and desktop applications. Although Lucene is written in Java, its popularity and the determination of zealous developers have produced a number of ports and integrations for other programming languages (C/C++, C#, Ruby, Perl, Python, and PHP, among others).

One of the key factors behind Lucene’s popularity is its simplicity, but don’t let that fool you: under the hood, sophisticated, state-of-the-art information retrieval techniques are quietly at work. The careful exposure of its indexing and searching API is a sign of well-designed software. You don’t need in-depth knowledge of how Lucene’s indexing and retrieval work in order to start using it. Moreover, Lucene’s straightforward API requires only a handful of classes to get started. Finally, for those of you tired of bloatware, Lucene’s core JAR is refreshingly tiny (only 1 MB), and it has no dependencies!

1.1. Dealing with information explosion

1.2. What is Lucene?

1.3. Lucene and the components of a search application

1.4. Lucene in action: a sample application

1.5. Understanding the core indexing classes

1.6. Understanding the core searching classes

1.7. Summary
