[DFDL-WG] Apache Avro™ 1.7.6 Documentation

Mike Beckerle mbeckerle.dfdl at gmail.com
Sat May 10 12:11:00 EDT 2014


The distinction is prescriptive vs. descriptive. All the things you
mentioned are prescriptive: you use the format and gain the benefits
thereof. DFDL is descriptive. You have data in some form a priori. You are
not choosing to represent it in some way; you have to describe the way the
data *is*.

DFDL is not something new. It is a standard designed from experience with
many commercial software systems that each have their own distinct and
proprietary way of describing data.

Another important distinction is many-to-one vs. point-to-point. To
communicate between two or a few systems you can select a preferred
technology. When you need to enable data interchange among hundreds of
different systems whose design you have no control or influence over,
that is when the descriptive approach is most important.
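
To make the descriptive idea concrete, here is a minimal sketch in Python.
It is not DFDL and uses no DFDL tooling. The record layout, the field names
and the parse_record helper are all hypothetical, invented for
illustration. The point is only that the layout is dictated by the data
that already exists, and the description is written afterwards to match it.

```python
import struct

# Hypothetical legacy record layout (an assumption for illustration).
# The producing system fixed this format; we only describe it:
#   bytes 0-1   : record id, unsigned 16-bit, big-endian
#   bytes 2-9   : name, ASCII, space-padded to 8 bytes
#   bytes 10-13 : balance in cents, signed 32-bit, big-endian
LAYOUT = [
    ("record_id", "H"),
    ("name", "8s"),
    ("balance_cents", "i"),
]


def parse_record(data: bytes) -> dict:
    """Parse one record by following the description in LAYOUT."""
    # ">" selects big-endian with no alignment padding.
    fmt = ">" + "".join(code for _, code in LAYOUT)
    values = struct.unpack(fmt, data)
    record = dict(zip((name for name, _ in LAYOUT), values))
    # Strip the space padding that the legacy format requires.
    record["name"] = record["name"].decode("ascii").rstrip()
    return record


# Build a sample record in the legacy layout, then parse it back.
sample = struct.pack(">H8si", 7, b"ALICE   ", -1250)
print(parse_record(sample))
```

A prescriptive system would instead hand you its own encoding and ask the
producer to adopt it, which is not an option when the producer is outside
your control.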
 On May 10, 2014 3:08 AM, "Sill, Alan" <alan.sill at ttu.edu> wrote:

> Dear DFDL folks,
>
> Thought you would be interested in the following link. I'd be interested
> in comparisons of this (Apache Avro), the other two systems mentioned
> (Apache Thrift and Google Protocol Buffers) with DFDL in terms of goals,
> schema capabilities and general application potential.
>
> Alan
>
> Topic:
> Introduction
> Apache Avro™ is a data serialization system.
>
> Avro provides:
>
> Rich data structures.
> A compact, fast, binary data format.
> A container file, to store persistent data.
> Remote procedure call (RPC).
> Simple integration with dynamic languages. Code generation is not required
> to read or write data files nor to use or implement RPC protocols. Code
> generation as an optional optimization, only worth implementing for
> statically typed languages.
> Schemas
> Avro relies on schemas. When Avro data is read, the schema used when
> writing it is always present. This permits each datum to be written with no
> per-value overheads, making serialization both fast and small. This also
> facilitates use with dynamic, scripting languages, since data, together
> with its schema, is fully self-describing.
>
> When Avro data is stored in a file, its schema is stored with it, so that
> files may be processed later by any program. If the program reading the
> data expects a different schema this can be easily resolved, since both
> schemas are present.
>
> When Avro is used in RPC, the client and server exchange schemas in the
> connection handshake. (This can be optimized so that, for most calls, no
> schemas are actually transmitted.) Since client and server both have
> the other's full schema, correspondence between same named fields, missing
> fields, extra fields, etc. can all be easily resolved.
>
> Avro schemas are defined with JSON. This facilitates implementation in
> languages that already have JSON libraries.
>
> Comparison with other systems
> Avro provides functionality similar to systems such as Thrift, Protocol
> Buffers, etc. Avro differs from these systems in the following fundamental
> aspects.
>
> Dynamic typing: Avro does not require that code be generated. Data is
> always accompanied by a schema that permits full processing of that data
> without code generation, static datatypes, etc. This facilitates
> construction of generic data-processing systems and languages.
> Untagged data: Since the schema is present when data is read, considerably
> less type information need be encoded with data, resulting in smaller
> serialization size.
> No manually-assigned field IDs: When a schema changes, both the old and
> new schema are always present when processing data, so differences may be
> resolved symbolically, using field names.
> Apache Avro, Avro, Apache, and the Avro and Apache logos are trademarks of
> The Apache Software Foundation.
>
> Link:
> http://avro.apache.org/docs/current/
>
>
>
> --
>   dfdl-wg mailing list
>   dfdl-wg at ogf.org
>   https://www.ogf.org/mailman/listinfo/dfdl-wg
>