Module Encore

Encore: combinators that produce both an encoder and a decoder.

The goal of encore is to provide combinators from which you can derive either an Angstrom parser or a Lavoisier encoder.

These combinators are more limited than what Angstrom provides, but this restriction is what makes it possible to derive a Lavoisier encoder from the same description, so that a value is deserialized and serialized in the same way.

As a result, we can ensure the following isomorphism:

val p : 'v t

val angstrom : 'v Angstrom.t (* = to_angstrom p *)
val lavoisier : 'v Lavoisier.t (* = to_lavoisier p *)

assert (emit_string (parse_string str angstrom) lavoisier = str) ;
assert (parse_string (emit_string v lavoisier) angstrom = v) ;

To build both the serializer and the deserializer, the user must provide bijective elements. For example, to parse and encode an int64, you must provide a way to read the value from a string and a way to write it back to a string:

let int64 : (string, int64) Bij.t = Bij.v
    ~fwd:Int64.of_string
    ~bwd:Int64.to_string

You can then combine this bijection with combinators such as:

let p =
  let open Syntax in
  int64 <$> while1 is_digit
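
The int64 bijection only round-trips on canonical text, which is worth checking before relying on the isomorphism. The sketch below uses only the standard library; the `bij` record and `round_trips` helper are hypothetical stand-ins for illustration, not Encore's `Bij.t`.

```ocaml
(* Hypothetical stand-in for a bijection: a pair of inverse functions.
   This is not Encore's Bij.t, only an illustration. *)
type ('a, 'b) bij = { fwd : 'a -> 'b; bwd : 'b -> 'a }

let int64 : (string, int64) bij =
  { fwd = Int64.of_string; bwd = Int64.to_string }

(* [round_trips b s] checks that decoding then re-encoding [s] yields [s]. *)
let round_trips (b : (string, int64) bij) (s : string) : bool =
  match b.fwd s with
  | v -> String.equal (b.bwd v) s
  | exception Failure _ -> false

let () =
  assert (round_trips int64 "42");
  (* "042" decodes to 42L, which re-encodes as "42": not a round trip. *)
  assert (not (round_trips int64 "042"));
  (* Non-numeric input makes Int64.of_string raise Failure. *)
  assert (not (round_trips int64 "abc"))
```

In other words, a pair such as Int64.of_string and Int64.to_string is bijective only if the accepted input is restricted to the form that the encoder itself produces.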

For some values, such as Git objects, the isomorphism must hold exactly, so that what we extract from a store is the same representation as what we injected into it.

Example.

Consider the Git tree object. Its formal format is:

entry := permission ' ' name '\x00' hash
tree  := entry *

We must describe bijective elements such as:

let permission = Bij.v ~fwd:perm_of_string ~bwd:perm_to_string

let hash =
  Bij.v ~fwd:Digestif.SHA1.of_raw_string ~bwd:Digestif.SHA1.to_raw_string

type entry = { perm : permission; hash : Digestif.SHA1.t; name : string }

let entry =
  Bij.v
    ~fwd:(fun ((perm, name), hash) -> { perm; hash; name })
    ~bwd:(fun { perm; hash; name } -> ((perm, name), hash))

Note that these functions should raise Bij.Bijection when they fail to parse the given string.
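
The source does not show perm_of_string or perm_to_string. A minimal sketch of what such a pair might look like follows, assuming the usual Git tree modes; the `perm` type and the locally declared `Bijection` exception are hypothetical stand-ins so that the block needs no dependency. With Encore itself, the failure case would raise Bij.Bijection instead.

```ocaml
(* Stand-in for Encore's Bij.Bijection so the sketch needs no dependency. *)
exception Bijection

(* Hypothetical permission type; the source does not define it. *)
type perm = Normal | Exec | Link | Dir | Commit

(* Forward direction: reject anything that is not a known mode. *)
let perm_of_string = function
  | "100644" -> Normal
  | "100755" -> Exec
  | "120000" -> Link
  | "40000" -> Dir
  | "160000" -> Commit
  | _ -> raise Bijection

(* Backward direction: total, and the exact inverse of [perm_of_string]. *)
let perm_to_string = function
  | Normal -> "100644"
  | Exec -> "100755"
  | Link -> "120000"
  | Dir -> "40000"
  | Commit -> "160000"
```

Because every accepted string maps to a distinct constructor and back to the same string, the two functions are inverses of each other on the accepted inputs.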

The format of an entry can then be described as:

let entry =
  let open Encore.Syntax in
  let permission = permission <$> while1 is_not_space in
  let hash = hash <$> fixed 20 in
  let name = while1 is_not_null in
  entry
  <$> (permission
      <* (Bij.char ' ' <$> any)
      <*> (name <* (Bij.char '\x00' <$> any))
      <*> hash
      <* commit)

And the Git tree object can be described as:

let tree = rep0 entry

Finally, given tree and the design of encore, we can ensure that the following round trip holds:

let check random_tree_value =
  let p = to_angstrom tree in
  let d = to_lavoisier tree in
  (* Angstrom.parse_string returns a result, so compare against Ok. *)
  assert (Angstrom.parse_string ~consume:All p
            (Lavoisier.emit_string random_tree_value d) = Ok random_tree_value)

The goal of this design is to describe a format, such as our Git tree object, only once, and to ensure that no corruption occurs when we serialize and deserialize values. For Git, this ensures that we keep the same SHA1, which depends on the contents.
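
For contrast, here is a sketch of what the same tree grammar looks like when the encoder and the decoder are written by hand with the standard library only. All names (`raw_entry`, `encode_entry`, `decode_entry`, `encode_tree`, `decode_tree`) are hypothetical, and every field is kept as a raw string for simplicity. The layout is stated twice, once per direction, which is the duplication that a single encore description is meant to avoid.

```ocaml
(* One tree entry, with every field kept as a raw string for simplicity. *)
type raw_entry = { perm : string; name : string; hash : string }

(* entry := permission ' ' name '\x00' hash *)
let encode_entry { perm; name; hash } =
  String.concat "" [ perm; " "; name; "\x00"; hash ]

(* Decode one entry starting at offset [off]; return it with the next
   offset. Raises Not_found or Invalid_argument on malformed input. *)
let decode_entry (s : string) (off : int) : raw_entry * int =
  let sp = String.index_from s off ' ' in
  let nul = String.index_from s (sp + 1) '\x00' in
  let perm = String.sub s off (sp - off) in
  let name = String.sub s (sp + 1) (nul - sp - 1) in
  (* The hash is a fixed 20 raw bytes and may contain any byte value. *)
  let hash = String.sub s (nul + 1) 20 in
  ({ perm; name; hash }, nul + 21)

(* tree := entry * *)
let encode_tree entries = String.concat "" (List.map encode_entry entries)

let decode_tree (s : string) : raw_entry list =
  let rec go off acc =
    if off >= String.length s then List.rev acc
    else
      let e, next = decode_entry s off in
      go next (e :: acc)
  in
  go 0 []

let () =
  let t =
    [ { perm = "100644"; name = "README.md"; hash = String.make 20 '\xab' };
      { perm = "40000"; name = "src dir"; hash = String.make 20 '\x00' } ]
  in
  (* The round trip holds only as long as both functions agree on the layout. *)
  assert (decode_tree (encode_tree t) = t)
```

If one of the two functions is changed without the other (for example, a different separator in `encode_entry`), the round trip silently breaks; deriving both directions from one description removes that risk.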

module Bij : sig ... end
module Lavoisier : sig ... end
module Either : sig ... end
type 'a t

An encore combinator for values of type 'a.

val to_angstrom : 'a t -> 'a Angstrom.t

to_angstrom t is the parser of t.

val to_lavoisier : 'a t -> 'a Lavoisier.t

to_lavoisier t is the encoder/serializer of t.

module Syntax : sig ... end