Module De.Higher

val compress : w:Lz77.window -> q:Queue.t -> refill:( bigstring -> int ) -> flush:( bigstring -> int -> unit ) -> bigstring -> bigstring -> unit

compress ~w ~q ~refill ~flush i o is the equivalent of Zlib.compress (with ~header:false) provided by the camlzip package.

  • w is the window used by the LZ77 compression algorithm.
  • q is the queue shared between the compression algorithm and the DEFLATE encoder.
  • i is the input buffer.
  • o is the output buffer.

When compress wants more input, it calls refill with i. The client returns how many bytes it wrote into i. Returning 0 signals the end of input.

When compress has written to the output buffer, it calls flush with o and the number of bytes it wrote. The client must copy these bytes out of o, because they will be lost at the next call to flush.
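The contract between the two callbacks can be sketched without decompress itself. The block below is only an illustration built on the standard Bigarray module: copy_through is a hypothetical stand-in driver (an identity "codec", not De.Higher.compress), and refill_of_string and flush_to_buffer are hypothetical helper names. It shows refill returning 0 at the end of input and flush copying the bytes out of o before the buffer is reused.

```ocaml
(* Stand-in for De.bigstring, built with the standard Bigarray module. *)
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

let bigstring_create len : bigstring =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout len

(* Hypothetical driver that follows the same protocol as compress:
   it asks for input with [refill] and hands output over with [flush].
   It performs no compression, it only copies [i] into [o]. *)
let copy_through ~refill ~flush (i : bigstring) (o : bigstring) =
  let rec go () =
    match refill i with
    | 0 -> () (* 0 bytes written into [i]: end of input *)
    | len ->
      Bigarray.Array1.blit
        (Bigarray.Array1.sub i 0 len) (Bigarray.Array1.sub o 0 len) ;
      flush o len ;
      go () in
  go ()

(* [refill] callback: feed [str] chunk by chunk, then return 0 forever. *)
let refill_of_string str =
  let pos = ref 0 in
  fun (buf : bigstring) ->
    let len = min (String.length str - !pos) (Bigarray.Array1.dim buf) in
    for k = 0 to len - 1 do buf.{k} <- str.[!pos + k] done ;
    pos := !pos + len ;
    len

(* [flush] callback: copy the bytes out of [buf] right away, because the
   driver overwrites [buf] before the next call. *)
let flush_to_buffer r (buf : bigstring) len =
  for k = 0 to len - 1 do Buffer.add_char r buf.{k} done

let roundtrip str =
  let i = bigstring_create 4 and o = bigstring_create 4 in
  let r = Buffer.create 16 in
  copy_through ~refill:(refill_of_string str) ~flush:(flush_to_buffer r) i o ;
  Buffer.contents r
```

The buffers are deliberately tiny (4 bytes) so that several refill and flush calls happen even on short inputs.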

A simple example of how to use this interface:

let deflate_string str =
  let i = De.bigstring_create De.io_buffer_size in
  let o = De.bigstring_create De.io_buffer_size in
  let w = De.Lz77.make_window ~bits:15 in
  let q = De.Queue.create 0x1000 in
  let r = Buffer.create 0x1000 in
  let p = ref 0 in
  let refill buf =
    (* assert (buf == i); *)
    let len = min (String.length str - !p) De.io_buffer_size in
    Bigstringaf.blit_from_string str ~src_off:!p buf ~dst_off:0 ~len ;
    p := !p + len ; len in
  let flush buf len =
    (* assert (buf == o); *)
    let str = Bigstringaf.substring buf ~off:0 ~len in
    Buffer.add_string r str in
  De.Higher.compress ~w ~q ~refill ~flush i o ; Buffer.contents r

As you can see, we allocate several things, such as the input and output buffers. These choices belong to the end user, which is why we do not provide such a function. The speed and the compression ratio depend on the size of:

  • q, which is shared between the compression algorithm and the encoder
  • i, which is the input buffer (its size allows a large lookup or not)
  • o, which is the output buffer (it can be a bottleneck for the throughput)
  • w, which is the lookup window
  • r, which is the data structure that saves the output (it can be a buffer, a queue, an out_channel, etc.)

As we said, several choices depend on what you want and on your context. We deliberately take no responsibility for these choices. That is why this function exists only as an example and is not part of the distribution.
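To make the choice of r more concrete, here is a minimal sketch of two alternative flush callbacks, one writing to an out_channel and one storing chunks in a queue. It relies only on the standard library: bigstring is redefined locally as a stand-in for De.bigstring, Queue is the standard Queue module (not De.Queue), and flush_to_channel and flush_to_queue are hypothetical names. After partial application, each has the shape bigstring -> int -> unit expected for ~flush.

```ocaml
(* Stand-in for De.bigstring, built with the standard Bigarray module. *)
type bigstring =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Sink 1: write the first [len] bytes of [buf] straight to a channel. *)
let flush_to_channel oc (buf : bigstring) len =
  for k = 0 to len - 1 do output_char oc buf.{k} done

(* Sink 2: keep each flushed chunk as its own string in a standard queue.
   [String.init] copies the bytes, so the chunk survives the reuse of [buf]. *)
let flush_to_queue chunks (buf : bigstring) len =
  Queue.add (String.init len (fun k -> buf.{k})) chunks
```

A channel avoids holding the whole output in memory, while a queue of chunks keeps the boundaries between flush calls. Which one fits depends on the context, as noted above.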

val uncompress : w:window -> refill:( bigstring -> int ) -> flush:( bigstring -> int -> unit ) -> bigstring -> bigstring -> ( unit, [> `Msg of string ] ) Stdlib.result

uncompress ~w ~refill ~flush i o is the equivalent of Zlib.uncompress (with ~header:false) provided by the camlzip package.

  • w is the window used by the LZ77 decompression algorithm.
  • i is the input buffer.
  • o is the output buffer.

When uncompress wants more input, it calls refill with i. The client returns how many bytes it wrote into i. Returning 0 signals the end of input.

When uncompress has written to the output buffer, it calls flush with o and the number of bytes it wrote. The client must copy these bytes out of o, because they will be lost at the next call to flush.

A simple example of how to use this interface:

let inflate_string str =
  let i = De.bigstring_create De.io_buffer_size in
  let o = De.bigstring_create De.io_buffer_size in
  let w = De.make_window ~bits:15 in
  let r = Buffer.create 0x1000 in
  let p = ref 0 in
  let refill buf =
    let len = min (String.length str - !p) De.io_buffer_size in
    Bigstringaf.blit_from_string str ~src_off:!p buf ~dst_off:0 ~len ;
    p := !p + len ; len in
  let flush buf len =
    let str = Bigstringaf.substring buf ~off:0 ~len in
    Buffer.add_string r str in
  match De.Higher.uncompress ~w ~refill ~flush i o with
  | Ok () -> Ok (Buffer.contents r)
  | Error _ as err -> err

As you can see, we allocate several things, such as the input and output buffers. As with compress, these details should be decided by the end user. The speed of the decompression depends on the size of:

  • i, which is the input buffer (it can be a bottleneck for the throughput)
  • o, which is the output buffer (it can be a bottleneck for the throughput)

The window depends on how the input was deflated/compressed. Usually, we allow a window of 15 bits.
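Because uncompress reports failures as an Error carrying a `Msg, callers usually want to turn that result into something readable. The helper below is a small sketch of one way to do so. The name describe is hypothetical, and it expects the (string, [> `Msg of string ]) result shape that an inflate_string-like function returns.

```ocaml
(* Hypothetical helper: summarise the outcome of an inflate_string-like
   function, which returns the inflated string or a [`Msg] error. *)
let describe = function
  | Ok s -> Printf.sprintf "inflated %d bytes" (String.length s)
  | Error (`Msg m) -> "inflate failed: " ^ m
```

A failure can come, for example, from input that is not a valid DEFLATE stream. The text of the message is chosen by the library.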

val of_string : o:bigstring -> w:window -> string -> flush:( bigstring -> int -> unit ) -> ( unit, [> `Msg of string ] ) Stdlib.result
val to_string : ?buffer:int -> w:window -> q:Queue.t -> refill:( bigstring -> int ) -> bigstring -> string