Module Irmin_traces.Trace_stat_summary

Conversion of a Stat_trace to a summary that is both pretty-printable and exportable to JSON.

The main type t here isn't versioned like a Stat_trace.t is.

Computing a summary may take a long time if the input Stat_trace is long; expect on the order of 1000 commits per second.

This file is NOT meant to be used from Tezos, unlike some other "trace_*" files.

module Seq = Trace_common.Seq
type curve = Utils.curve
val curve_t : Utils.curve Repr.t
module Span : sig ... end

A stat trace can be chunked into blocks. A block is made of two phases: first the buildup, then the commit.

module Watched_node : sig ... end
type bag_stat = {
  value_before_commit : Vs.t;
  value_after_commit : Vs.t;
  diff_per_block : Vs.t;
  diff_per_buildup : Vs.t;
  diff_per_commit : Vs.t;
}

Summary of an entry contained in Def.bag_of_stat.

Properties of such variables:

  • Is sampled before each commit operation.
  • Is sampled after each commit operation.
  • Is sampled in header.
  • Most of these entries are expected to grow linearly, which implies that no smoothing is necessary for the downsampled curve in these cases, and that the histogram is best viewed on a linear scale rather than a log scale. The other entries are summarised using ~is_linearly_increasing:false.

The value_after_commit is initially fed with the value in the header (i.e. the value recorded just before the start of the play).

val bag_stat_t : bag_stat Repr.t
type finds = {
  total : bag_stat;
  from_staging : bag_stat;
  from_lru : bag_stat;
  from_pack_direct : bag_stat;
  from_pack_indexed : bag_stat;
  missing : bag_stat;
  cache_miss : bag_stat;
}
val finds_t : finds Repr.t
type pack = {
  finds : finds;
  appended_hashes : bag_stat;
  appended_offsets : bag_stat;
  inode_add : bag_stat;
  inode_remove : bag_stat;
  inode_of_seq : bag_stat;
  inode_of_raw : bag_stat;
  inode_rec_add : bag_stat;
  inode_rec_remove : bag_stat;
  inode_to_binv : bag_stat;
  inode_decode_bin : bag_stat;
  inode_encode_bin : bag_stat;
}
val pack_t : pack Repr.t
type tree = {
  contents_hash : bag_stat;
  contents_find : bag_stat;
  contents_add : bag_stat;
  node_hash : bag_stat;
  node_mem : bag_stat;
  node_add : bag_stat;
  node_find : bag_stat;
  node_val_v : bag_stat;
  node_val_find : bag_stat;
  node_val_list : bag_stat;
}
val tree_t : tree Repr.t
type index = {
  bytes_read : bag_stat;
  nb_reads : bag_stat;
  bytes_written : bag_stat;
  nb_writes : bag_stat;
  bytes_both : bag_stat;
  nb_both : bag_stat;
  nb_merge : bag_stat;
  cumu_data_bytes : bag_stat;
  merge_durations : float list;
}
val index_t : index Repr.t
type gc = {
  minor_words : bag_stat;
  promoted_words : bag_stat;
  major_words : bag_stat;
  minor_collections : bag_stat;
  major_collections : bag_stat;
  compactions : bag_stat;
  major_heap_bytes : bag_stat;
  major_heap_top_bytes : curve;
}
val gc_t : gc Repr.t
type disk = {
  index_data : bag_stat;
  index_log : bag_stat;
  index_log_async : bag_stat;
  store_dict : bag_stat;
  store_pack : bag_stat;
}
val disk_t : disk Repr.t
type store = {
  watched_nodes : Watched_node.map;
}
val store_t : store Repr.t
type t = {
  summary_timeofday : float;
  summary_hostname : string;
  curves_sample_count : int;
  moving_average_half_life_ratio : float;
  config : Def.config;
  hostname : string;
  word_size : int;
  timeofday : float;
  timestamp_wall0 : float;
  timestamp_cpu0 : float;
  elapsed_wall : float;
  elapsed_wall_over_blocks : Utils.curve;
  elapsed_cpu : float;
  elapsed_cpu_over_blocks : Utils.curve;
  op_count : int;
  span : Span.map;
  block_count : int;
  cpu_usage : Vs.t;
  index : index;
  pack : pack;
  tree : tree;
  gc : gc;
  disk : disk;
  store : store;
}
val t : t Repr.t
val create_vs : int -> evolution_smoothing:[ `Ema of float * float | `None ] -> scale:[ `Linear | `Log ] -> Vs.acc
val create_vs_exact : int -> Vs.acc
val create_vs_smooth : int -> Vs.acc
val create_vs_smooth_log : int -> Vs.acc
module Span_folder : sig ... end

Accumulator for the span field of t.

module Bag_stat_folder : sig ... end

Summary computation for statistics recorded in Def.bag_of_stat.

module Store_watched_nodes_folder : sig ... end

Accumulator for the store field of t.

val major_heap_top_bytes_folder : 'a Def.header_base -> int -> ([> `Commit of 'b Def.commit_base ], Utils.Resample.acc, float list) Utils.Parallel_folders.folder

Build a resampled curve of gc.top_heap_words.

val elapsed_wall_over_blocks_folder : 'a Def.header_base -> int -> ([> `Commit of 'b Def.commit_base ], Utils.Resample.acc, float list) Utils.Parallel_folders.folder

Build a resampled curve of timestamps.

val elapsed_cpu_over_blocks_folder : 'a Def.header_base -> int -> ([> `Commit of 'b Def.commit_base ], Utils.Resample.acc, float list) Utils.Parallel_folders.folder

Build a resampled curve of timestamps.

val merge_durations_folder : (Def.pack Def.row_base, float list, float list) Utils.Parallel_folders.folder

Build a list of all the merge durations.

val cpu_usage_folder : 'a Def.header_base -> int -> ([> `Commit of 'b Def.commit_base ], float * float * Vs.acc, Vs.t) Utils.Parallel_folders.folder
val misc_stats_folder : 'a Def.header_base -> ([> `Commit of 'b Def.commit_base ], float * float * int, float * float * int) Utils.Parallel_folders.folder

Subtract the first timestamp from the last and count the number of spans.

val summarise' : Def.pack Def.header_base -> int -> Def.row Seq.t -> t

Fold over row_seq and produce the summary.

Parallel Folders

Almost all entries in t require an independent fold over the rows of the stat trace, but we want:

  • not to fully load the trace in memory,
  • not to reread the trace from disk once for each entry,
  • this current file to be verbose and simple,
  • to have fun with GADTs and avoid mutability.

All the boilerplate is hidden behind Utils.Parallel_folders, a data structure that holds all the folder functions, takes care of feeding the rows to those folders, and preserves the types.

In the code below, pf0 is the initial parallel folder, before the first accumulation. Each |+ ... statement declares an acc, accumulate, finalise triplet, i.e. a folder.

val acc : acc is the initial empty accumulation of a folder.

val accumulate : acc -> row -> acc needs to be folded over all rows of the stat trace. Calling Parallel_folders.accumulate pf row will feed row to every folder.

val finalise : acc -> v has to be applied to the final acc of a folder in order to produce the final value of that folder, the value that is meant to be stored in Trace_stat_summary.t. Calling Parallel_folders.finalise pf will finalise all folders and pass their results to construct.
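The pattern described above can be sketched in a self-contained way. The toy below is not the real Utils.Parallel_folders; the names folder, open_, |+, accumulate and finalise simply mirror the description, and the GADT keeps each folder's accumulator type private while preserving the type of the construction function.

```ocaml
(* Toy re-implementation of the Parallel_folders idea, for illustration. *)
type ('row, 'acc, 'v) folder = {
  acc : 'acc;                         (* current accumulation *)
  accumulate : 'acc -> 'row -> 'acc;  (* folded over all rows *)
  finalise : 'acc -> 'v;              (* produces the folder's final value *)
}

let folder acc accumulate finalise = { acc; accumulate; finalise }

(* ['c] is the type of the construction function still waiting for the
   finalised values; each [Cell] hides its folder's ['acc] existentially. *)
type ('row, 'res, 'c) pf =
  | Open : 'c -> ('row, 'res, 'c) pf
  | Cell :
      ('row, 'res, 'v -> 'c) pf * ('row, 'acc, 'v) folder
      -> ('row, 'res, 'c) pf

let open_ construct = Open construct
let ( |+ ) pf f = Cell (pf, f)

(* Feed one row to every folder, without mutation. *)
let rec accumulate :
    type row res c. (row, res, c) pf -> row -> (row, res, c) pf =
 fun pf row ->
  match pf with
  | Open _ -> pf
  | Cell (rest, f) ->
      Cell (accumulate rest row, { f with acc = f.accumulate f.acc row })

(* Finalise every folder and feed the results to [construct], left to
   right. *)
let rec fin : type row res c. (row, res, c) pf -> (c -> res) -> res =
 fun pf k ->
  match pf with
  | Open construct -> k construct
  | Cell (rest, f) -> fin rest (fun g -> k (g (f.finalise f.acc)))

let finalise pf = fin pf (fun res -> res)

(* One pass over the rows computes both entries independently: *)
let count = folder 0 (fun n _ -> n + 1) (fun n -> n)
let sum = folder 0. (fun s row -> s +. row) (fun s -> s)
let pf0 = open_ (fun n s -> (n, s)) |+ count |+ sum
let n_rows, total = finalise (List.fold_left accumulate pf0 [ 1.; 2.; 3. ])
```

Note how pf0 folds over the row list once, yet produces both the count (3) and the sum (6.) without either folder knowing about the other's accumulator type.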

val summarise : ?block_count:int -> string -> t

Turn a stat trace into a summary.

The number of blocks to consider may be provided in order to truncate the summary.

val save_to_json : t -> string -> unit
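A hypothetical usage sketch of the two functions above; the file paths are made up, and only summarise and save_to_json come from this module.

```ocaml
(* Hypothetical: "stat_trace.repr" and "summary.json" are placeholder
   paths, not real file names from the project. *)
let () =
  let summary =
    Trace_stat_summary.summarise ~block_count:1_000 "stat_trace.repr"
  in
  Trace_stat_summary.save_to_json summary "summary.json"
```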