appendix C Working with Avro, Protobuf, and JSON Schema
C.1 Apache Avro
An Avro schema is composed of two main types: primitive and complex. The supported primitive types are null, boolean, int, long, float, double, bytes, and string. Complex types are records, enums, arrays, maps, unions, and fixed. For a complete description of primitive and complex types, see the Avro documentation at http://mng.bz/Y7Oo and http://mng.bz/GZ2M, respectively.
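To make the split between primitive and complex types concrete, here is a minimal Python sketch that classifies an already-parsed schema fragment. The helper name classify and the sample fragments are assumptions made for illustration; they are not part of any Avro library.

```python
# Type names as listed in the paragraph above.
PRIMITIVE_TYPES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
# Unions are left out of this set because they are written as JSON arrays, not by name.
COMPLEX_TYPES = {"record", "enum", "array", "map", "fixed"}


def classify(schema):
    """Classify a parsed Avro schema fragment (hypothetical helper)."""
    if isinstance(schema, list):
        # A JSON array in type position denotes a union.
        return "complex (union)"
    if isinstance(schema, dict):
        kind = schema.get("type")
        if kind in COMPLEX_TYPES:
            return f"complex ({kind})"
        if kind in PRIMITIVE_TYPES:
            # {"type": "string"} is an accepted long form of "string".
            return f"primitive ({kind})"
        return "unknown"
    if isinstance(schema, str) and schema in PRIMITIVE_TYPES:
        return f"primitive ({schema})"
    # Any other string may be a reference to a named type defined elsewhere.
    return "unknown"


print(classify("long"))
print(classify({"type": "array", "items": "string"}))
print(classify(["null", "string"]))
```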
In this book, you’ll mostly work with the complex type of record, as this corresponds to an object. You may also encounter a few examples using primitive values, but for the most part, we’ll stick with records. You’ll also use the union type, especially when working with the RecordNameStrategy and TopicRecordNameStrategy strategies. A union is written as a JSON array and lets you specify that a field may be any one of several types. You’ll see union types in the following examples.
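As a rough illustration of a union, the following Python sketch parses a small schema with the standard json module and reports which fields are unions. The Purchase record and its field names are hypothetical and chosen only for this example; the only thing taken from Avro itself is that a union appears as a JSON array in the type position.

```python
import json

# Hypothetical schema for illustration; the names are assumptions.
schema_json = """
{
  "type": "record",
  "name": "Purchase",
  "fields": [
    {"name": "item", "type": "string"},
    {"name": "coupon_code", "type": ["null", "string"], "default": null}
  ]
}
"""

schema = json.loads(schema_json)


def union_branches(field):
    """Return the list of allowed types if the field is a union, else None."""
    field_type = field["type"]
    return field_type if isinstance(field_type, list) else None


for field in schema["fields"]:
    branches = union_branches(field)
    if branches is None:
        print(f"{field['name']}: single type {field['type']}")
    else:
        print(f"{field['name']}: union of {branches}")
```

Here coupon_code may hold either null or a string, which is a common way to model an optional field.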
The record type contains the elements name, doc, aliases, and fields. The fields element can have the properties name, type, doc, default, order, and aliases. Figure C.1 shows an Avro schema annotated with a description of each of these properties.
Figure C.1 Comparing a schema with nested records vs. using schema references
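The elements and properties listed above can also be sketched as a small schema in text form. The Customer record below is a hypothetical example (its names, docs, and default values are assumptions, not taken from the figure), parsed with Python's standard json module to show which record elements and field properties are present.

```python
import json

# Hypothetical record schema; every name and value here is an assumption.
schema_json = """
{
  "type": "record",
  "name": "Customer",
  "doc": "A customer of the store",
  "aliases": ["Client"],
  "fields": [
    {
      "name": "customer_id",
      "type": "long",
      "doc": "Unique identifier for the customer"
    },
    {
      "name": "email",
      "type": "string",
      "doc": "Contact email address",
      "default": "",
      "order": "ignore",
      "aliases": ["email_address"]
    }
  ]
}
"""

schema = json.loads(schema_json)

RECORD_ELEMENTS = ("name", "doc", "aliases", "fields")
FIELD_PROPERTIES = ("name", "type", "doc", "default", "order", "aliases")

# Which of the record-level elements does this schema declare?
record_present = [key for key in RECORD_ELEMENTS if key in schema]
print("record elements:", record_present)

# Which of the field-level properties does each field declare?
for field in schema["fields"]:
    present = [key for key in FIELD_PROPERTIES if key in field]
    print(field["name"], "->", present)
```

Only name and type appear on every field in this sketch; the first field omits default, order, and aliases, while the second sets all six properties.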
By default, when serializing records, Avro encodes fields in the order they are listed in the schema. Next, I want to discuss two of the previously mentioned properties: default and aliases.