Data Serialization - AVRO
1 - About
Avro is a data serialization system. It provides:
- Rich data structures
- Schema stored in the file
- A compact, fast, binary data format (container files support block compression)
- A container file, to store persistent data.
- Simple integration with dynamic languages.
Avro requires a schema for both serialization and deserialization. Because the schema is available at decoding time, metadata such as field names does not have to be encoded in the data, which makes the binary encoding of Avro data very compact.
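The effect of schema-driven encoding can be illustrated with a minimal sketch. It is not the Avro library: it hand-codes only two primitive types (`string` and `long`) following the binary encoding rules of the Avro specification (zigzag variable-length integers, length-prefixed UTF-8 strings, record fields written in schema order). The `User` schema and all function names are assumptions made for this illustration.

```python
import json

# Hypothetical schema, used only for this illustration.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age",  "type": "long"}
  ]
}
""")


def encode_long(n: int) -> bytes:
    """Zigzag-encode a 64-bit integer and write it as a variable-length integer."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """A string is its UTF-8 byte length (as a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def decode_long(buf: bytes, pos: int):
    """Read a zigzag varint starting at pos; return (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def decode_string(buf: bytes, pos: int):
    length, pos = decode_long(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


ENCODERS = {"string": encode_string, "long": encode_long}
DECODERS = {"string": decode_string, "long": decode_long}


def encode_record(schema: dict, record: dict) -> bytes:
    # Only the values are written, in schema field order; no field names.
    return b"".join(
        ENCODERS[f["type"]](record[f["name"]]) for f in schema["fields"]
    )


def decode_record(schema: dict, buf: bytes) -> dict:
    # The schema supplies the field names and types needed to read the bytes.
    pos = 0
    record = {}
    for f in schema["fields"]:
        record[f["name"]], pos = DECODERS[f["type"]](buf, pos)
    return record


user = {"name": "Ada", "age": 36}
encoded = encode_record(SCHEMA, user)
print(len(encoded), "bytes in binary vs", len(json.dumps(user)), "bytes as JSON")
```

Without the schema the encoded bytes cannot be interpreted, which is why a reader needs it at decoding time.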
2 - Articles Related
3 - Client
Avro supports code generation: classes that read and write Avro data can be generated automatically from the schema. Code generation is not required to read or write data files, nor to use or implement RPC protocols. It is an optional optimization, only worth implementing for statically typed languages.
The schema can be extracted from an Avro data file, because the container file stores it in its header.
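As a rough sketch of how extraction works, the header of an Avro object container file can be parsed with the standard library alone. This assumes the layout described in the Avro specification: the magic bytes `Obj` followed by the byte 1, then a metadata map whose `avro.schema` entry holds the schema as JSON. The function names are made up for this illustration; in practice an Avro library or tool would normally do this.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream) -> int:
    """Read one zigzag variable-length integer from a binary stream."""
    shift = 0
    z = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of header")
        b = byte[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)


def read_bytes(stream) -> bytes:
    """Read a length-prefixed byte sequence."""
    return stream.read(read_long(stream))


def read_header_metadata(stream) -> dict:
    """Return the header metadata map (string keys, bytes values)."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:
            # A negative count is followed by the block size in bytes.
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


def extract_schema(stream):
    """Return the writer schema stored in the file header, parsed from JSON."""
    meta = read_header_metadata(stream)
    return json.loads(meta["avro.schema"].decode("utf-8"))
```

It could be used as `extract_schema(open("users.avro", "rb"))`, where `users.avro` is a hypothetical file name.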
4 - Library
- spotify/dbeam: SQL to Avro