Iota Binary

Sources of inspiration:

  • WIT canonical representation: concepts, binary encoding, opaque resources, structural typing
  • Cap'n Proto: messages, schema language with expressive types, RPC (?)
  • Apache Arrow columnar format: run-end encoding, binary dictionaries and lists, use of indirection.
  • FIDL and its binary encoding: tables, look for good ideas.

Local in-memory representation

When data are used in a local context, they can rely on that context for brevity: schemas, relative offsets, meaning implied by a schema, and so on. Data need not be identified by hashes; value hashes can stay implicit and be reconstructed when needed.

The local representation should allow zero-copy access and, ideally, (append-oriented?) patching.
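
As a rough illustration of zero-copy access through relative offsets, here is a minimal Rust sketch. The layout, `StrList`, `read_u32` and `encode` are hypothetical names invented for this example; nothing here is a decided Iota format. The offsets table is loosely modelled on Arrow-style variable-length lists.

```rust
/// Hypothetical layout (an assumption, not part of these notes), little-endian:
///   [count: u32][offsets: (count + 1) x u32][payload bytes]
/// Offsets are relative to the start of the payload, so the buffer can be
/// moved or memory-mapped anywhere without pointer fix-ups.
pub struct StrList<'a> {
    buf: &'a [u8],
}

/// Reads a little-endian u32 at `at`, or None if it does not fit.
fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let end = at.checked_add(4)?;
    let b = buf.get(at..end)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

impl<'a> StrList<'a> {
    /// Wraps a buffer after checking that the header and offsets table fit.
    pub fn new(buf: &'a [u8]) -> Option<Self> {
        let count = read_u32(buf, 0)? as usize;
        if buf.len() < 4 + 4 * (count + 1) {
            return None;
        }
        Some(StrList { buf })
    }

    pub fn len(&self) -> usize {
        read_u32(self.buf, 0).unwrap_or(0) as usize
    }

    /// Returns a borrowed view into the buffer; no bytes are copied.
    pub fn get(&self, i: usize) -> Option<&'a str> {
        let buf: &'a [u8] = self.buf;
        let count = self.len();
        if i >= count {
            return None;
        }
        let payload = 4 + 4 * (count + 1);
        let start = payload.checked_add(read_u32(buf, 4 + 4 * i)? as usize)?;
        let end = payload.checked_add(read_u32(buf, 4 + 4 * (i + 1))? as usize)?;
        std::str::from_utf8(buf.get(start..end)?).ok()
    }
}

/// Builds a buffer in the layout above (the only place that allocates).
pub fn encode(items: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(items.len() as u32).to_le_bytes());
    let mut offset = 0u32;
    out.extend_from_slice(&offset.to_le_bytes());
    for s in items {
        offset += s.len() as u32;
        out.extend_from_slice(&offset.to_le_bytes());
    }
    for s in items {
        out.extend_from_slice(s.as_bytes());
    }
    out
}
```

With `let buf = encode(&["iota", "binary"]);`, a reader would call `StrList::new(&buf)` and then `get(i)`; the returned `&str` points directly into `buf`. Append-oriented patching is not covered by this sketch.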

Considerations:

  • data locality, array-oriented storage (minimize separately allocated objects)
  • naturally mapped into the data model, if possible
  • minimize intrusive tags and metadata (tagging every integer/pointer is not an option; try to move as much as possible into context, and use zero-sized types to carry context information in the compiler).
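
The last point, carrying context in zero-sized types instead of in-memory tags, could look roughly like the Rust sketch below. `Ref<T>` and `Decode` are hypothetical names made up for illustration. The stored value is a bare 32-bit offset, and the type of its target exists only at compile time.

```rust
use std::marker::PhantomData;

/// A relative offset into some arena. The target type `T` is carried by a
/// zero-sized `PhantomData`, so no tag is stored next to the offset.
/// (Hypothetical type, assumed for this sketch.)
pub struct Ref<T> {
    offset: u32,
    _target: PhantomData<T>,
}

impl<T> Ref<T> {
    pub fn new(offset: u32) -> Self {
        Ref { offset, _target: PhantomData }
    }
}

/// How a value of a given type is read from raw little-endian bytes.
pub trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Option<Self>;
}

impl Decode for u16 {
    fn decode(bytes: &[u8]) -> Option<Self> {
        let b = bytes.get(..2)?;
        Some(u16::from_le_bytes([b[0], b[1]]))
    }
}

impl Decode for u32 {
    fn decode(bytes: &[u8]) -> Option<Self> {
        let b = bytes.get(..4)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    }
}

impl<T: Decode> Ref<T> {
    /// The compiler picks the decoder from `T`; the arena stores no type tag.
    pub fn load(&self, arena: &[u8]) -> Option<T> {
        let start = self.offset as usize;
        T::decode(arena.get(start..)?)
    }
}
```

Here `Ref<u16>` and `Ref<u32>` occupy the same four bytes as a plain `u32`, yet they cannot be mixed up in code, because the distinction lives in the type checker and not in the data.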

distributed context

Units of data must be fully self-contained: serializable, streamable, bringing their schema with them if needed.

Distributed contexts must be mergeable: two independent contexts on different machines must be easy to merge into one. This is why the distributed format must be content-addressable and encoded as Merkle trees.
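
To make the merge argument concrete, here is a small Rust sketch of a content-addressed store. `Store`, `Node` and `hash_node` are hypothetical names. The 64-bit FNV-1a hash is only a dependency-free stand-in: a real design would need a cryptographic hash. Because a node's key is derived from its content, and a branch's content is its children's hashes, merging two stores is a plain union with nothing to reconcile.

```rust
use std::collections::HashMap;

pub type Hash = u64; // stand-in; a real format would use a cryptographic digest

#[derive(Clone, Debug, PartialEq)]
pub enum Node {
    Leaf(Vec<u8>),
    Branch(Vec<Hash>), // children are referenced by hash (Merkle tree/DAG)
}

const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// FNV-1a over `bytes`, continuing from `state`. NOT collision-resistant.
fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    let mut h = state;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}

/// Hash of a node; a leading byte separates leaves from branches.
pub fn hash_node(node: &Node) -> Hash {
    match node {
        Node::Leaf(data) => fnv1a(fnv1a(FNV_OFFSET, &[0]), data),
        Node::Branch(children) => {
            let mut h = fnv1a(FNV_OFFSET, &[1]);
            for c in children {
                h = fnv1a(h, &c.to_le_bytes());
            }
            h
        }
    }
}

#[derive(Default)]
pub struct Store {
    nodes: HashMap<Hash, Node>,
}

impl Store {
    /// Inserts a node under its content hash and returns that hash.
    pub fn put(&mut self, node: Node) -> Hash {
        let h = hash_node(&node);
        self.nodes.entry(h).or_insert(node);
        h
    }

    pub fn get(&self, h: Hash) -> Option<&Node> {
        self.nodes.get(&h)
    }

    pub fn len(&self) -> usize {
        self.nodes.len()
    }

    /// Union of two stores: equal content already shares a key.
    pub fn merge(&mut self, other: &Store) {
        for (h, node) in &other.nodes {
            self.nodes.entry(*h).or_insert_with(|| node.clone());
        }
    }
}
```

If two machines each store a leaf with the same bytes, both get the same key, so after `a.merge(&b)` that leaf exists once and both roots stay valid.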

Inspirations: