Letter to the HBase Community

Before we examine the current situation, please allow me to flash back a few years and look at the beginnings of HBase.

In 2007, when I needed a large, scalable data store at literally no cost, because the project’s budget would not allow for one, only a few choices were available. You could use one of the free databases, such as MySQL or PostgreSQL, or a pure key/value store like Berkeley DB. Or you could develop something on your own and open up the playing field, which of course only a few of us were bold enough to attempt, at least in those days.

These solutions might have worked, but one of the major concerns was scalability. It wasn’t well developed in the existing systems and was often an afterthought. I had to store billions of documents, maintain a search index on them, and allow random updates to the data, while keeping index updates short. This led me to the third choice available that year: Hadoop and HBase.

Both had a strong pedigree, and they came out of Google, a Valhalla of the best talent that could be gathered when it comes to scalable systems. My belief was that if these systems could serve an audience as big as the world, their underlying foundations must be solid. Thus, I proposed to build my project with HBase (and Lucene, as a side note).

Preface

Acknowledgments

About this Book

Roadmap

Intended audience

Code conventions

Code downloads

Author Online

About the Authors