Module KV.Tree

Managing store's trees.

val path_t : path Irmin.Type.t
val step_t : step Irmin.Type.t
val metadata_t : metadata Irmin.Type.t
val contents_t : contents Irmin.Type.t
val contents_key_t : contents_key Irmin.Type.t
val node_t : node Irmin.Type.t
val hash_t : hash Irmin.Type.t

Tree provides immutable, in-memory partial mirror of the store, with lazy reads and delayed writes.

Trees are like staging area in Git: they are immutable temporary non-persistent areas (they disappear if the host crash), held in memory for efficiency, where reads are done lazily and writes are done only when needed on commit: if you modify a key twice, only the last change will be written to the store when you commit.

Constructors

val empty : unit -> tree

empty () is the empty tree. The empty tree does not have associated backend configuration values, as they can perform in-memory operation, independently of any given backend.

val singleton : path -> ?metadata:metadata -> contents -> tree

singleton k c is the tree with a single binding mapping the key k to the contents c.

val of_contents : ?metadata:metadata -> contents -> tree

of_contents c is the subtree built from the contents c.

val of_node : node -> tree

of_node n is the subtree built from the node n.

type elt = [
  1. | `Node of node
  2. | `Contents of contents * metadata
]

The type for tree elements.

val v : elt -> tree

General-purpose constructor for trees.

val kinded_hash_t : [ `Contents of hash * metadata | `Node of hash ] Irmin.Type.t
val pruned : [ `Contents of hash * metadata | `Node of hash ] -> tree

pruned h is a purely in-memory tree with the hash h. Such trees can be used as children of other in-memory tree nodes, for instance in order to compute the hash of the parent, but they cannot be dereferenced.

Any operation that would require loading the contents of a pruned node (e.g. calling find on one of its children) will instead raise a Pruned_hash exception. Attempting to export a tree containing pruned sub-trees to a repository will fail similarly.

val kind : tree -> path -> [ `Contents | `Node ] option Lwt.t

kind t k is the type of s in t. It could either be a tree node or some file contents. It is None if k is not present in t.

val is_empty : tree -> bool

is_empty t is true iff t is empty (i.e. a tree node with no children). Trees with kind = `Contents are never considered empty.

Diffs

val diff : tree -> tree -> (path * (contents * metadata) Irmin.Diff.t) list Lwt.t

diff x y is the difference of contents between x and y.

Manipulating Contents

exception Dangling_hash of {
  1. context : string;
  2. hash : hash;
}

The exception raised by functions that can force lazy tree nodes but do not return an explicit or_error.

exception Pruned_hash of {
  1. context : string;
  2. hash : hash;
}

The exception raised by functions that attempts to load pruned tree nodes.

exception Portable_value of {
  1. context : string;
}

The exception raised by functions that attemps to perform IO on a portable tree.

type error = [
  1. | `Dangling_hash of hash
  2. | `Pruned_hash of hash
  3. | `Portable_value
]
type 'a or_error = ('a, error) Stdlib.result
module Contents : sig ... end

Operations on lazy tree contents.

val mem : tree -> path -> bool Lwt.t

mem t k is true iff k is associated to some contents in t.

val find_all : tree -> path -> (contents * metadata) option Lwt.t

find_all t k is Some (b, m) if k is associated to the contents b and metadata m in t and None if k is not present in t.

val length : tree -> ?cache:bool -> path -> int Lwt.t

length t key is the number of files and sub-nodes stored under key in t.

It is equivalent to List.length (list t k) but backends might optimise this call: for instance it's a constant time operation in irmin-pack.

cache defaults to true, see caching for an explanation of the parameter.

val find : tree -> path -> contents option Lwt.t

find is similar to find_all but it discards metadata.

val get_all : tree -> path -> (contents * metadata) Lwt.t

Same as find_all but raise Invalid_arg if k is not present in t.

val list : tree -> ?offset:int -> ?length:int -> ?cache:bool -> path -> (step * tree) list Lwt.t

list t key is the list of files and sub-nodes stored under k in t. The result order is not specified but is stable.

offset and length are used for pagination.

cache defaults to true, see caching for an explanation of the parameter.

val seq : tree -> ?offset:int -> ?length:int -> ?cache:bool -> path -> (unit -> (step * tree) Stdlib__Seq.node) Lwt.t

seq t key follows the same behavior as list but returns a sequence.

val get : tree -> path -> contents Lwt.t

Same as get_all but ignore the metadata.

val add : tree -> path -> ?metadata:metadata -> contents -> tree Lwt.t

add t k c is the tree where the key k is bound to the contents c but is similar to t for other bindings.

val update : tree -> path -> ?metadata:metadata -> (contents option -> contents option) -> tree Lwt.t

update t k f is the tree t' that is the same as t for all keys except k, and whose binding for k is determined by f (find t k).

If k refers to an internal node of t, f is called with None to determine the value with which to replace it.

val remove : tree -> path -> tree Lwt.t

remove t k is the tree where k bindings has been removed but is similar to t for other bindings.

Manipulating Subtrees

val mem_tree : tree -> path -> bool Lwt.t

mem_tree t k is false iff find_tree k = None.

val find_tree : tree -> path -> tree option Lwt.t

find_tree t k is Some v if k is associated to v in t. It is None if k is not present in t.

val get_tree : tree -> path -> tree Lwt.t

get_tree t k is v if k is associated to v in t. Raise Invalid_arg if k is not present in t.

val add_tree : tree -> path -> tree -> tree Lwt.t

add_tree t k v is the tree where the key k is bound to the non-empty tree v but is similar to t for other bindings.

If v is empty, this is equivalent to remove t k.

val update_tree : tree -> path -> (tree option -> tree option) -> tree Lwt.t

update_tree t k f is the tree t' that is the same as t for all subtrees except under k, and whose subtree at k is determined by f (find_tree t k).

f returning either None or Some empty causes the subtree at k to be unbound (i.e. it is equivalent to remove t k).

val merge : tree Irmin.Merge.t

merge is the 3-way merge function for trees.

Folds

val destruct : tree -> [ `Node of node | `Contents of Contents.t * metadata ]

General-purpose destructor for trees.

type marks

The type for fold marks.

val empty_marks : unit -> marks

empty_marks () is an empty collection of marks.

type 'a force = [
  1. | `True
  2. | `False of path -> 'a -> 'a Lwt.t
]

The type for fold's force parameter. `True forces the fold to read the objects of the lazy nodes and contents. `False f is applying f on every lazy node and content value instead.

type uniq = [
  1. | `False
  2. | `True
  3. | `Marks of marks
]

The type for fold's uniq parameters. `False folds over all the nodes. `True does not recurse on nodes already seen. `Marks m uses the collection of marks m to store the cache of keys: the fold will modify m. This can be used for incremental folds.

type ('a, 'b) folder = path -> 'b -> 'a -> 'a Lwt.t

The type for fold's folders: pre, post, contents, node, and tree, where 'a is the accumulator and 'b is the item folded.

type depth = [
  1. | `Eq of int
  2. | `Le of int
  3. | `Lt of int
  4. | `Ge of int
  5. | `Gt of int
]

The type for fold depths.

  • Eq d folds over nodes and contents of depth exactly d.
  • Lt d folds over nodes and contents of depth strictly less than d.
  • Gt d folds over nodes and contents of depth strictly more than d.

Le d is Eq d and Lt d. Ge d is Eq d and Gt d.

val depth_t : depth Irmin.Type.t
val fold : ?order:[ `Sorted | `Undefined | `Random of Stdlib.Random.State.t ] -> ?force:'a force -> ?cache:bool -> ?uniq:uniq -> ?pre:('a, step list) folder -> ?post:('a, step list) folder -> ?depth:depth -> ?contents:('a, contents) folder -> ?node:('a, node) folder -> ?tree:('a, tree) folder -> tree -> 'a -> 'a Lwt.t

fold t acc folds over t's nodes with node-specific folders: contents, node, and tree, based on a node's kind.

The default for all folders is identity.

For every node n of t, including itself:

  • If n is a `Contents kind, call contents path c where c is the contents of n.
  • If n is a `Node kind, (1) call pre path steps; (2) call node path n; (3) recursively fold on each child; (4) call post path steps.
  • If n is any kind, call tree path t' where t' is the tree of n.

See examples/fold.ml for a demo of the different folders.

See force for details about the force parameters. By default it is `True.

See uniq for details about the uniq parameters. By default it is `False.

The fold depth is controlled by the depth parameter.

cache defaults to false, see caching for an explanation of the parameter.

If order is `Sorted (the default), the elements are traversed in lexicographic order of their keys. If `Random state, they are traversed in a random order. For large nodes, these two modes are memory-consuming, use `Undefined for a more memory efficient fold.

Stats

type stats = {
  1. nodes : int;
    (*

    Number of node.

    *)
  2. leafs : int;
    (*

    Number of leafs.

    *)
  3. skips : int;
    (*

    Number of lazy nodes.

    *)
  4. depth : int;
    (*

    Maximal depth.

    *)
  5. width : int;
    (*

    Maximal width.

    *)
}

The type for tree stats.

val stats_t : stats Irmin.Type.t
val stats : ?force:bool -> tree -> stats Lwt.t

stats ~force t are t's statistics. If force is true, this will force the reading of lazy nodes. By default it is false.

Concrete Trees

type concrete = [
  1. | `Tree of (step * concrete) list
  2. | `Contents of contents * metadata
]

The type for concrete trees.

val concrete_t : concrete Irmin.Type.t
val of_concrete : concrete -> tree

of_concrete c is the subtree equivalent of the concrete tree c.

  • raises Invalid_argument

    if c contains duplicate bindings for a given path.

val to_concrete : tree -> concrete Lwt.t

to_concrete t is the concrete tree equivalent of the subtree t.

Proofs

module Proof : sig ... end

Caches

val clear : ?depth:int -> tree -> unit

clear ?depth t clears all caches in the tree t for subtrees with a depth higher than depth. If depth is not set, all of the subtrees are cleared.

A call to clear doesn't discard the subtrees of t, only their cache are discarded. Even the lazily loaded and unmodified subtrees remain.

Performance counters

type counters = {
  1. mutable contents_hash : int;
  2. mutable contents_find : int;
  3. mutable contents_add : int;
  4. mutable contents_mem : int;
  5. mutable node_hash : int;
  6. mutable node_mem : int;
  7. mutable node_index : int;
  8. mutable node_add : int;
  9. mutable node_find : int;
  10. mutable node_val_v : int;
  11. mutable node_val_find : int;
  12. mutable node_val_list : int;
}
val counters : unit -> counters
val dump_counters : unit Fmt.t
val reset_counters : unit -> unit
val inspect : tree -> [ `Contents | `Node of [ `Map | `Key | `Value | `Portable_dirty | `Pruned ] ]

inspect t is similar to kind, with additional state information for nodes. It is primarily useful for debugging and testing.

If t holds a node, additional information about its state is included:

  • `Map, if t is from of_concrete.
  • `Value, if t's node has modifications that have not been persisted to a store.
  • `Portable_dirty, if t's node has modifications and is Node.Portable. Currently only used with Proof.
  • `Pruned, if t is from pruned.
  • Otherwise `Key, the default state for a node loaded from a store.
module Private : sig ... end

pp is a pretty-printer for a tree.

Import/Export

type kinded_key = [
  1. | `Contents of contents_key * metadata
  2. | `Node of node_key
]

Keys in the Irmin store are tagged with the type of the value they reference (either contents or node). In the contents case, the key is paired with corresponding metadata.

val kinded_key_t : kinded_key Irmin.Type.t
val key : tree -> kinded_key option

key t is the key of tree t in the underlying repository, if it exists. Tree objects that exist entirely in memory (such as those built with of_concrete) have no backend key until they are exported to a repository, and so will return None.

val find_key : Repo.t -> tree -> kinded_key option Lwt.t

find_key r t is the key of a tree object with the same hash as t in r, if such a key exists and is indexed.

val of_key : Repo.t -> kinded_key -> tree option Lwt.t

of_key r h is the tree object in r having h as key, or None if no such tree object exists.

val shallow : Repo.t -> kinded_key -> tree

shallow r h is the shallow tree object with the key h. No check is performed to verify if h actually exists in r.

val hash : ?cache:bool -> tree -> hash

hash t is the hash of tree t.

type kinded_hash = [
  1. | `Contents of hash * metadata
  2. | `Node of hash
]

Like kinded_key, but with hashes as value references rather than keys.

val kinded_hash : ?cache:bool -> tree -> kinded_hash

kinded_hash t is c's kinded hash.

val of_hash : Repo.t -> kinded_hash -> tree option Lwt.t

of_hash r h is the tree object in r with hash h, or None if no such tree object is indexed in r.

Note: in stores for which node_key = contents_key = hash, this function has identical behaviour to of_key.

Proofs

type 'result producer := repo -> kinded_key -> (tree -> (tree * 'result) Lwt.t) -> (Proof.t * 'result) Lwt.t

produce r h f runs f on top of a real store r, producing a proof and a result using the initial root hash h.

The trees produced during f's computation will carry the full history of reads. This history will be reset when f is complete so subtrees escaping the scope of f will not cause memory leaks.

Calling produce_proof recursively has an undefined behaviour.

type verifier_error = [
  1. | `Proof_mismatch of string
]

The type for errors associated with functions that verify proofs.

val verifier_error_t : verifier_error Irmin.Type.t
type 'result verifier := Proof.t -> (tree -> (tree * 'result) Lwt.t) -> (tree * 'result, verifier_error) Stdlib.result Lwt.t

verify p f runs f in checking mode. f is a function that takes a tree as input and returns a new version of the tree and a result. p is a proof, that is a minimal representation of the tree that contains what f should be expecting.

Therefore, contrary to trees found in a storage, the contents of the trees passed to f may not be available. For this reason, looking up a value at some path can now produce three distinct outcomes:

  • A value v is present in the proof p and returned : find tree path is a promise returning Some v;
  • path is known to have no value in tree : find tree path is a promise returning None; and
  • path is known to have a value in tree but p does not provide it because f should not need it: verify returns an error classifying path as an invalid path (see below).

The same semantics apply to all operations on the tree t passed to f and on all operations on the trees built from f.

The generated tree is the tree after f has completed. That tree is disconnected from the backend. It is possible to run operations on it as long as they don't require loading shallowed subtrees, otherwise it would raise Dangling_hash.

The result is Error _ if the proof is rejected:

  • when p.before is different from the hash of p.state;
  • when p.after is different from the hash of f p.state;
  • when f p.state tries to access paths invalid paths in p.state;
val produce_proof : 'a producer

produce_proof is the producer of tree proofs.

val verify_proof : 'a verifier

verify_proof is the verifier of tree proofs.

val hash_of_proof_state : Proof.tree -> kinded_hash