Data Serialization - AVRO

1 - About

Avro is a data serialization system.

Avro provides:

  • Rich data structures
  • A schema stored in the file, alongside the data
  • A compact, fast, binary data format (container files can also be compressed)
  • A container file to store persistent data
  • Simple integration with dynamic languages

Avro requires a schema both when data is serialized and when it is deserialized. Because the schema is available at decoding time, metadata such as field names does not have to be encoded in the data itself. This makes the binary encoding of Avro data very compact.
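To illustrate why this keeps the encoding small, here is a minimal sketch in plain Python of how a record could be encoded following the Avro binary rules for int/long (zig-zag varint) and string (length prefix plus UTF-8 bytes). It is not the Avro library: the helper names (encode_long, encode_record, ...) and the User schema are assumptions made up for this example, and only three primitive types are handled.

```python
import json

def encode_long(n: int) -> bytes:
    """Encode an int/long as a zig-zag value in base-128 varint form."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small negatives become small positives
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_string(s: str) -> bytes:
    """Encode a string as its byte length (long) followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data

def decode_long(buf: bytes, pos: int):
    """Decode a zig-zag varint at pos; return (value, next position)."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos

def decode_string(buf: bytes, pos: int):
    length, pos = decode_long(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length

# Only the primitive types needed for this sketch.
ENCODERS = {"int": encode_long, "long": encode_long, "string": encode_string}
DECODERS = {"int": decode_long, "long": decode_long, "string": decode_string}

def encode_record(schema: dict, record: dict) -> bytes:
    """Concatenate field values in schema order; no field names are written."""
    return b"".join(
        ENCODERS[field["type"]](record[field["name"]])
        for field in schema["fields"]
    )

def decode_record(schema: dict, buf: bytes) -> dict:
    """The schema supplies the field names and types while decoding."""
    record = {}
    pos = 0
    for field in schema["fields"]:
        record[field["name"]], pos = DECODERS[field["type"]](buf, pos)
    return record

# Hypothetical schema, used only for this illustration.
USER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
    ],
}

user = {"name": "Ada", "age": 36}
encoded = encode_record(USER_SCHEMA, user)
as_json = json.dumps(user).encode("utf-8")

print(len(encoded))  # 5
print(len(encoded) < len(as_json))
print(decode_record(USER_SCHEMA, encoded) == user)
```

The encoded bytes contain only the values; without the schema, a reader could not tell where one field ends or what it is called, which is why the schema is needed at decoding time.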

3 - Client

Avro supports code generation: classes that read and write Avro data can be generated automatically from the schema. Code generation is not required to read or write data files, nor to use or implement RPC protocols. It is an optional optimization, only worth implementing for statically typed languages.

The schema can be extracted from an Avro container file, because it is stored in the file header.
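As a rough illustration of schema extraction, the sketch below reads the header of an Avro object container file with plain Python: the 4-byte magic, then the metadata map whose avro.schema entry holds the schema as JSON. The function names (build_header, extract_schema) are assumptions made up for this example, and build_header only exists to produce an in-memory header to read back; a real client library would normally do all of this for you.

```python
import io
import json
import os

MAGIC = b"Obj\x01"  # first four bytes of an Avro container file

def write_long(n: int) -> bytes:
    """Zig-zag varint encoding used for Avro int/long values."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def write_bytes(data: bytes) -> bytes:
    """Length-prefixed byte sequence (also the layout of an Avro string)."""
    return write_long(len(data)) + data

def read_long(stream) -> int:
    shift = 0
    z = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("unexpected end of stream")
        z |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return (z >> 1) ^ -(z & 1)
        shift += 7

def read_bytes(stream) -> bytes:
    return stream.read(read_long(stream))

def build_header(schema: dict, codec: bytes = b"null") -> bytes:
    """Build a container-file header in memory (for this demo only)."""
    meta = {
        "avro.schema": json.dumps(schema).encode("utf-8"),
        "avro.codec": codec,
    }
    body = write_long(len(meta))  # one map block holding all entries
    for key, value in meta.items():
        body += write_bytes(key.encode("utf-8")) + write_bytes(value)
    body += write_long(0)  # a zero count ends the map
    return MAGIC + body + os.urandom(16)  # 16-byte sync marker

def extract_schema(stream) -> dict:
    """Read the header metadata and return the schema as a dict."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:
            # A negative count is followed by the block size in bytes.
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return json.loads(meta["avro.schema"])

# Hypothetical schema, used only for this illustration.
schema = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
}
header = build_header(schema)
print(extract_schema(io.BytesIO(header)) == schema)
```

With a file on disk, the same function could be called on a stream opened in binary mode, for example open(path, "rb"), where path is a placeholder for your own file.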

4 - Library

5 - Documentation / Reference

data/persistence/avro.txt · Last modified: 2018/12/19 20:31 by gerardnico