Module De.Lz77

type src = [
| `Channel of Stdlib.in_channel
| `String of string
| `Manual
]

The type for input sources. With a `Manual source the client must provide input with src. With a `String or `Channel source the client can safely discard the `Await case (with assert false).

type decode = [
| `Flush
| `Await
| `End
]
type state

The type for states.

val literals : state -> literals

literals s is the frequencies of the literals and lengths emitted by s since it was created.

val distances : state -> distances

distances s is the frequencies of the distances emitted by s since it was created.

val checksum : state -> optint

checksum s is the ADLER-32 checksum of the input consumed so far.
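For reference, the Adler-32 computation (as specified in RFC 1950) can be sketched in a few lines. The function name adler32 is hypothetical and not part of this module, and the sketch assumes a 64-bit OCaml int, whereas checksum returns an optint.

```ocaml
(* Hypothetical helper, not part of De.Lz77: Adler-32 over a whole string.
   Assumes a 64-bit [int]; the library uses [optint] instead. *)
let adler32 (s : string) : int =
  (* [a] is the running byte sum, [b] the running sum of [a]; both mod 65521. *)
  let a = ref 1 and b = ref 0 in
  String.iter
    (fun c ->
      a := (!a + Char.code c) mod 65521;
      b := (!b + !a) mod 65521)
    s;
  (!b lsl 16) lor !a
```

Such a helper could be used to cross-check the value reported by checksum once all input has been consumed.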

val src : state -> bigstring -> int -> int -> unit

src s i j l provides s with l bytes to read, starting at j in i. This byte range is read by calls to compress with s until `Await is returned. To signal the end of input, call src with l = 0.

  • raises Invalid_argument

    when j and l do not correspond to a valid range.
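The range condition can be illustrated with a small check. This is only a sketch of the kind of validation described above, not the library's code; check_range and its ~len parameter (the length of the buffer i) are hypothetical names.

```ocaml
(* Hypothetical helper: reject an offset [j] and length [l] that do not
   describe a range inside a buffer of [len] bytes. Writing the test as
   [j > len - l] avoids overflow in [j + l]. *)
let check_range ~len j l =
  if j < 0 || l < 0 || j > len - l then invalid_arg "src: invalid range"
```

Under this sketch, l = 0 with any j between 0 and len passes, which is compatible with using l = 0 to signal the end of input.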

val src_rem : state -> int

src_rem s is the number of bytes remaining in the current input buffer.

val compress : state -> decode

compress s is:

  • `Await if s has a `Manual input source and awaits more input. The client must use src to provide it.
  • `Flush if s has completely filled the shared queue q (given to state). Queue.junk_exn or Queue.pop_exn can be used to free some cells for compress.
  • `End if s has compressed all input. The shared queue q may still be non-empty.
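The three cases suggest a simple driver loop. The sketch below is written as a functor over a hypothetical signature LZ77 that copies only the parts of this interface the loop needs, so it compiles without the library. The callbacks refill and drain are assumptions: with the real module, refill would call src and drain would pop cells from q.

```ocaml
(* Hypothetical signature: the subset of the documented interface used here. *)
module type LZ77 = sig
  type state
  type decode = [ `Flush | `Await | `End ]
  val compress : state -> decode
end

module Drive (L : LZ77) = struct
  (* Call [compress] until it returns `End.
     [refill] is expected to provide more input (for a `Manual source);
     [drain] is expected to empty the shared queue. Both are assumptions. *)
  let run ~refill ~drain s =
    let rec go () =
      match L.compress s with
      | `Await -> refill s; go ()
      | `Flush -> drain s; go ()
      | `End -> drain s (* the queue may still hold pending output *)
    in
    go ()
end
```

With a `String or `Channel source, refill could simply be fun _ -> assert false, as noted for the src type.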
type window
val make_window : bits:int -> window
val state : ?level:int -> q:Queue.t -> w:window -> src -> state

state src ~w ~q is a state that reads input from src and writes output to q.

Window.

The client can constrain the lookup operation with a window. A small window forces compress to emit small distances. A large window lets compress look further back to recognize a pattern, which can be expensive.
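The effect of the window on distances can be seen with a naive match search. This is an illustration only, not the lookup used by this module: longest_match is a hypothetical function, ~window is given in bytes here (make_window takes a number of bits), and the minimum match length of 3 follows DEFLATE.

```ocaml
(* Hypothetical, naive search: find the longest match for the bytes at [pos],
   looking back at most [window] bytes. Returns [Some (distance, length)],
   or [None] when no match of at least 3 bytes exists within the window. *)
let longest_match ~window (s : string) (pos : int) : (int * int) option =
  let n = String.length s in
  let best = ref None in
  for cand = max 0 (pos - window) to pos - 1 do
    (* Count matching bytes; the match may overlap [pos], as in LZ77. *)
    let len = ref 0 in
    while pos + !len < n && s.[cand + !len] = s.[pos + !len] do
      incr len
    done;
    let better =
      match !best with
      | None -> !len >= 3
      | Some (_, l) -> !len >= l (* on ties, prefer the smaller distance *)
    in
    if better then best := Some (pos - cand, !len)
  done;
  !best
```

For the input "abcdefgh--abcdefgh" at position 10, a 32-byte window finds the repetition at distance 10 with length 8, while a 4-byte window finds nothing because the earlier occurrence lies outside it. The cost of the loop also grows with the window, which is the trade-off described above.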

Level.

Lz77 has two main kinds of level:

  • 0, where inputs are only copied to outputs and no lookup is done
  • n (up to 9), which selects a certain configuration of the lookup. The higher the level, the longer it may take to find a pattern.

Level 0 is useful to merely pack an input into a format such as DEFLATE, for example when the input is an already compressed document such as a video or an image. Otherwise, level 4 is a common choice.

val no_compression : state -> bool