Chapter 8. Scaling GIS on HBase
In this chapter, we’ll use HBase to tackle a new domain: Geographic Information Systems (GIS). GIS is an interesting area of exploration because it poses two significant challenges: delivering low latency at scale and modeling spatial locality. We’ll use the lens of GIS to demonstrate how to adapt HBase to meet both. To do so, you’ll need to use domain-specific knowledge to your advantage.
Geographic systems are frequently used as the foundation of an online, interactive user experience. Consider a location-based service, such as Foursquare, Yelp, or Urbanspoon. These services strive to provide relevant information about hundreds of millions of locations all over the globe. Users of these applications depend on them to find, for instance, the nearest coffee shop in an unfamiliar neighborhood. They don’t want a MapReduce job standing between them and their latte. We’ve already discussed HBase as a platform for online data access, so the first challenge, latency at scale, seems a reasonable match for HBase. Still, as you’ve seen in previous chapters, HBase can provide low request latency only when your schema is designed to take advantage of how the data is physically stored. This brings you conveniently to the second challenge: spatial locality.