Chapter 2. Getting started

This chapter covers

Connecting to HBase and defining tables
The basic commands for interacting with HBase
Physical and logical data models of HBase
Queries over compound rowkeys

The goal of the next couple of chapters is to teach you how to use HBase. First and foremost, you’ll become comfortable with the features HBase provides you as an application developer. You’ll get a handle on the logical data model HBase presents, the various modes of interacting with HBase, and the details of how to use its APIs. Our other goal is to teach you HBase schema design. HBase has a different physical data model from the relational data systems you’re likely used to. We’ll teach you the basics of that physical model so that you can take advantage of it when designing schemas optimized for your applications.

To accomplish all these goals, you’ll build an application from scratch. Allow us to introduce TwitBase, a simplified clone of the social network Twitter, implemented entirely in HBase. We won’t cover all the features of Twitter, and TwitBase isn’t intended to be a production-ready system. Instead, think of it as an early Twitter prototype. The key difference between TwitBase and the early versions of Twitter is that TwitBase is designed with scale in mind and is therefore backed by a data store that can support that scale.

This chapter covers

2.1. Starting from scratch

2.2. Data manipulation

2.3. Data coordinates

2.4. Putting it all together

2.5. Data models

2.6. Table scans

2.7. Atomic operations

2.8. ACID semantics

2.9. Summary