Standard library

logic

  • logic.bool: the basic building block for data, the two-valued inductive type bool. Will become a machine bit.

  • logic.prop: the basic building block for proofs, types and reasoning about code.

  • logic.pair: the basic constructor for complex heterogeneous data. It's a building block for records.

  • logic.list: Type-wise: the basic constructor for homogeneous data. It's a building block for arrays, vectors, etc.

  • logic.lambda: Type-wise: structural expressions for untyped lambda, which are used later by a prover to manipulate terms. The basic building block for describing semantics of (unrestricted, not linear) code.

    Q: should this be dependently-typed right away?

  • logic.linear: Augments logic.lambda with linear typing. It's a basic building block for describing resources, e.g. heap allocations and memory segments. Describes the algebra of resources: allocation, single use, deallocation.

  • logic.ordered: Augments logic.lambda and logic.linear with an ordered type system (see Advanced Topics in Types and Programming Languages by Pierce). A building block for describing stacks and stacked contexts. Describes the algebrax of stacked resources, e.g. lifetimes.

  • logic.nat, logic.num, logic.fin, logic.field: Basic Peano numbers to be used in proofs. Their properties. Finite enumerations and sets. Especially of size $2^n$. (Finite?) fields and their properties. Especially on $2^n$-sized sets (Zig-like u2, u3, ..., u8, u9, ...).

  • logic.array: vec n T in dependent types. A fixed-size sequence of items of some type, indexed by fin. A special case: $2^n$-sized arrays (useful, because can be totally indexed by a bit sequence).

  • data.byte, data.word: Describe fixed-size sequences of bools: 8 bools for byte, 32/64 bools for word. Defines the usual CPU operations on them (shl, shr, not, and, ..., ). Defines $\tt{u8 } = \tt{finfield}(+, 2^8)$, and a mapping $byte \iff \rm{u8}$ The same for u64 as $\tt{word} \iff \tt{finfield}(+, 2^{64})$. Describes two-complement mapping $[-128 .. 128) \iff \tt{i8}$ and $[-2^{63} .. 2^{63}) \iff \tt{i64}$. Operations on signed numbers.

  • data.utf8: Describes a sequence of bytes which is a correct UTF8-encoded Unicode-ish string. A basic building block for symbols?

  • data.buffer: Describes a linearly typed array of bytes which can be read and written into (for byte, word, maybe utf8). Algebra of linearly typed arrays (ownership, splitting, swaps, etc). Special case: $2^n$-sized buffers.

  • data.bag (Q: what are symbols? Are they just utf8 byte sequences? Are they (opaque) words, e.g. pointers/hashes? Any additional properties on them?)

    Describes a JSON-structured data type, bag.

    <bag> ::= unit                /* json: null */
            | bool                /* json: bool */
            | <nat>               /* json: num */
            | <utf8>              /* json: str */
            | ([<sym> =] <bag>, ...)
            /* json: object/array */
    

    A bag a known-size associative array with a known set of symbols/nats as keys. Order of key-value pairs is preserved. Keys can be implicit, filled by incremented nats, like in Lua.

    Bags are values, their bit representation is not defined. Other things can be coerced to bags (records, any inductive data type structurally?).

  • data.parse: A small parser-combinator library.

lang

  • lang.syntax: Describes the language:

    • tokens: keywords, literals, symbols, (, ), ...
    • AST structure as an inductive data type (on top of token bags?). Each node has a bag of attributes.
  • lang.parser: Uses data.parse to build an AST from a utf8 byte sequence.

  • lang.binary: An interface to serialize/deserialize bags+types.

  • lang.record: A tuple <a bag of data, data type>. Generic (serde-like) serialization/deserialization interface.

  • lang.struct: Mapping (binary serialization/deserialization implementation) of a record onto a linear buffer.

  • lang.func, lang.coro? A func is a lambda that actually consumes/produces resources (stack frames, thread time, etc). Describes how lambda values become machine resources and run: stack frames, returns, calls, .... Calling conventions?

    A coro is an executable state machine.

    Q: closures?

core

core.mem

Provides: core.mem capability. Reading most (?) properties is safe (? but: ASLR). Destructive (i.e. write) changes are equivalent to having any process-wide capability. But: not every capability depends on core.mem (e.g. os.fs , os.user, net.ip.block are limits external to a process).

Objects:

  • address spaces: usually one, with its traits (32-bit, 64-bit, etc), but e.g. code space, data space for Harvard architecture.
  • memory segments: a set of $(offset, size)$ mappings with their own traits and allowed operations. Segments can grow (e.g. heap, stack). Can be executable, writable. Can be created/deleted (e.g. mmap/munmap, MMU operations). Q: are these just special, root memory regions?
  • memory regions (places?) (~ a resource that can hold a representation of a value, possibly complex one). Type encodings and transmutation, memory layout, alignment. Properties: offset, size, stride (may differ from size), read/write, volatile, executable, etc. Access patterns (random access, sequential access, etc).
  • references: a way to refer to a memory region for some purpose (e.g. reading, writing, managing, executing instructions). Operations: peek, poke. Explicitly shy away from pointers, especially as integers.

A remark about Rust: Rust does a very good job when relying on its set of assumptions (e.g. one address space, nonvolatile memory regions, internal ABI, pointers are integers), but is literally unsafe when stepping outside (e.g. lifetimes on JIT-compiled functions, memory segments going away underneath "safe" references, some other ABIs, NUMA, etc). Let's bring more clarity and explicitness here.

core.alloc

Requires: a core.mem capability. Provides: a core.alloc capability, negation of core.noalloc

Objects:

  • allocators: objects that can create/resize/consume memory places as resources.

Allocators can be created from a memory region and recycled themselves by their memory segment or parent region (e.g. enabling munmap). Allocators should be explicit (at least on a module level) and nudge to allocate memory closer to callstack roots, unlike the global allocator in Rust.

Allocation is fallible. Every allocation either should be handled, or means possibility of panic.

Allocated places are bound by the lifetime of their allocator

Allocators can be garbage-collecting, with no-op "freeing".