Chapter 13. SolrCloud

This chapter covers

  • Scaling Solr with the SolrCloud architecture
  • Managing configuration information with ZooKeeper
  • Distributing indexing and queries
  • Administering the SolrCloud system
  • Shard splitting and custom hashing

In this chapter, you’ll learn how to design, configure, and operate a large-scale Solr cluster using a set of features known collectively as SolrCloud. The chapter may be challenging because it introduces concepts and terminology that may be new to you. By the end, you should have a solid understanding of how to manage Solr clusters and a good feel for what it takes to set up and run a robust, large-scale distributed search engine.

There’s more theory in this chapter than hands-on activity. Enabling SolrCloud mode is quite simple, and existing client code for indexing and queries should not need to change, because a SolrCloud cluster looks like any other Solr server to client applications.

To illustrate the core concepts in SolrCloud, we use the example of a search engine behind a log aggregation and analytics service called logmill. This fictitious application aggregates log messages from many systems into a centralized search engine to support monitoring, data visualization, and analytics of application activity.

13.1. Getting started with SolrCloud

13.2. Core concepts

13.3. Distributed indexing

13.4. Distributed search

13.5. Collections API

13.6. Basic system-administration tasks

13.7. Advanced topics

13.8. Summary