Appendix C. Binary data and GridFS

 

For storing images, thumbnails, audio, and other binary files, many applications rely on the file system alone. Although file systems provide fast access to files, file system storage can also lead to organizational chaos. Consider that most file systems limit the number of files per directory. If you have millions of files to keep track of, you need to devise a strategy for organizing them into multiple directories. Another difficulty involves metadata. Because the file metadata is usually still stored in a database while the files themselves live on disk, taking a consistent backup of the files and their metadata can be incredibly complicated.
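One common way to spread files across multiple directories is to derive the directory names from a hash of the filename. The following Python sketch is only an illustration of that idea, not a recommendation from this appendix; the function name sharded_path, the root directory, and the two-level, two-character layout are all assumptions made for the example.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root: str, filename: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a filename to a nested directory derived from its SHA-256 hash.

    Hypothetical helper: with levels=2 and width=2, files are spread over
    up to 256 * 256 subdirectories, which keeps each directory small.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Take the first `levels` slices of `width` hex characters as directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


if __name__ == "__main__":
    # The same filename always maps to the same location.
    print(sharded_path("/data/files", "cat.jpg"))
```

Note that a scheme like this only solves the directory-size problem. The metadata still lives elsewhere, so the backup difficulty described above remains.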

For certain use cases, it may make sense to store files in the database itself because it simplifies file organization and backup. In MongoDB, you can use the BSON binary type to store any kind of binary data. This data type corresponds to the RDBMS BLOB (binary large object) type, and it’s the basis for two flavors of binary object storage provided by MongoDB.

The first uses one document per file and is best for smaller binary objects. If you need to catalog a large number of thumbnails or MD5 digests, then using single-document binary storage can make life much easier. On the other hand, you might want to store large images or audio files. In this case, GridFS, a MongoDB API for storing binary objects of any size, would be a better choice. In the next two sections, you’ll see complete examples of both storage techniques.
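To make the difference between the two approaches concrete, here is a rough Python sketch that models both layouts with plain dictionaries. It doesn't talk to a database or use any driver API. The field names files_id, n, data, length, and chunkSize follow the layout GridFS uses for its file and chunk documents, but the helper functions, the use of the filename as an identifier, and the tiny chunk size are assumptions chosen to keep the example readable. In practice a driver's GridFS API handles the chunking for you.

```python
def single_document(filename: str, data: bytes) -> dict:
    """Model one-document-per-file storage: payload and metadata together."""
    return {"filename": filename, "length": len(data), "data": data}


def chunked_documents(filename: str, data: bytes, chunk_size: int = 4):
    """Model GridFS-style storage: one metadata document plus N chunk documents.

    chunk_size is deliberately tiny here; real deployments use chunks
    measured in hundreds of kilobytes.
    """
    file_id = filename  # hypothetical: a real driver would generate an ObjectId
    file_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return file_doc, chunks


def reassemble(chunks) -> bytes:
    """Rebuild the original bytes by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


if __name__ == "__main__":
    payload = b"0123456789"
    file_doc, chunks = chunked_documents("a.bin", payload)
    assert reassemble(chunks) == payload
    print(file_doc["length"], len(chunks))
```

The single-document form is simpler to query and update, but the whole payload has to fit in one document. The chunked form removes that ceiling at the cost of an extra collection and a reassembly step.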

C.1. Simple binary storage

C.2. GridFS
