Mirage
Release v4.10.3
MirageOS is a library operating system that can build standalone unikernels on various platforms. More precisely, the architecture can be divided into:
It is possible to write high-level MirageOS applications, such as HTTPS, email or CalDAV servers, which can be deployed on very heterogeneous and embedded platforms by changing only a few compilation parameters. The supported platforms range from minimal virtual machines running on cloud providers to processes running inside Docker containers configured with a tight security profile. In general, these platforms do not have a full POSIX environment; MirageOS does not try to emulate POSIX and focuses on providing a small, well-defined, typed interface with the system components. The nearest equivalent to the MirageOS approach is the WASI (wasi.dev) set of interfaces for WebAssembly.
While most of the code is written in OCaml, a typed, high-level language with many good safety properties, some pieces of MirageOS are still written in C. These bits can be separated into three categories:
The MirageOS compiler is essentially a cross-compiler in which the host and target toolchains are identical but use different flags for the C bindings: for instance, -ffreestanding must be passed to all C bindings so that they do not use POSIX headers. The MirageOS compiler also uses a custom link step: it needs not only a custom OCaml runtime (libasmrun.a) but also a different linker to generate specialised executable images.
Historically, the OCaml ecosystem has always had partial support for cross-compilation: for instance, the ocaml-cross approach is to duplicate all existing opam packages by adding a -windows suffix to their names and dependencies; this allows normal packages and windows packages to be co-installed in the same opam switch.
MirageOS 3.x solves this by duplicating only the packages defining C bindings. It relies on every MirageOS backend registering a set of CFLAGS with pkg-config. Each binding then uses pkg-config to configure its CFLAGS and ocamlfind to register link-time predicates, e.g. additional link-time options such as the names of the C archives. The final link step is done by querying ocamlfind (using the custom registered predicates) to link the dependencies' object files with the result of the OCaml compiler's --output-obj option.
MirageOS 4 solves this by relying on dune's built-in support for cross-compilation. This is done by gathering all the sources of the dependencies locally with opam-monorepo, and by creating a `dune-workspace` file describing the C flags to use in each cross-compilation "context". Once this is set up, a single dune build can cross-compile the unikernel target with all its local sources.
The rest of the document describes Functoria, the embedded domain-specific language used in config.ml files to describe how the typed libraries have to be assembled.
type 'a typ = 'a Functoria.Type.t
The type for values representing module types.
val typ : 'a -> 'a typ
typ t is a value representing the module type t.
Construct a functor type from a type and an existing functor type. This corresponds to prepending a parameter to the list of functor parameters. For example:
kv_ro @-> ip @-> kv_ro
This describes a functor type that accepts two arguments -- a kv_ro and an ip device -- and returns a kv_ro.
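To illustrate how @-> prepends a parameter, here is a toy model of functor types. It is only a sketch and not Functoria's actual implementation: the GADT, the empty marker types and the helpers arity and to_string are names invented for this example. OCaml operators starting with @ are right-associative, so kv_ro @-> ip @-> kv_ro reads as kv_ro @-> (ip @-> kv_ro).

```ocaml
(* Toy model (an assumption for illustration, not Functoria's definition):
   a module type is either a named base type or a functor type. *)
type _ typ =
  | Type : string -> 'a typ
  | Function : 'a typ * 'b typ -> ('a -> 'b) typ

(* Prepend one parameter to a (possibly functor) type. *)
let ( @-> ) a b = Function (a, b)

(* Empty marker types standing in for device signatures. *)
type kv_ro
type ip

let kv_ro : kv_ro typ = Type "kv_ro"
let ip : ip typ = Type "ip"

(* Number of functor parameters. *)
let rec arity : type a. a typ -> int = function
  | Type _ -> 0
  | Function (_, rest) -> 1 + arity rest

(* Render the type back with the same operator. *)
let rec to_string : type a. a typ -> string = function
  | Type name -> name
  | Function (a, b) -> to_string a ^ " @-> " ^ to_string b

(* Has type (kv_ro -> ip -> kv_ro) typ. *)
let t = kv_ro @-> ip @-> kv_ro

let () = Printf.printf "%s takes %d argument(s)\n" (to_string t) (arity t)
```

The phantom type parameter is what lets the type checker reject a functor applied to the wrong kind of device.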
type 'a impl = 'a Functoria.Impl.t
The type for values representing module implementations.
type abstract_impl = Functoria.Impl.abstract
Same as impl but with hidden type.
val dep : 'a impl -> abstract_impl
dep t is the (build-time) dependency towards t.
type 'a key = 'a Functoria.Key.key
The type for configure-time command-line arguments.
type 'a runtime_arg = 'a Functoria.Runtime_arg.arg
The type for runtime command-line arguments.
val runtime_arg :
  pos:(string * int * int * int) ->
  ?packages:Functoria.Package.t list ->
  string ->
  Functoria.Runtime_arg.t
runtime_arg ~pos ?packages v is the runtime argument pointing to the value v. pos is expected to be __POS__. packages specifies in which opam package the value v is defined.
type abstract_key = Functoria.Key.t
The type for abstract keys.
type context = Functoria.Context.t
The type for keys' parsing context. See Key.context.
type 'a value = 'a Functoria.Key.value
The type for values parsed from the command-line. See Key.value.
val key : 'a key -> Functoria.Key.t
key k is an untyped representation of k.
if_impl v impl1 impl2 is impl1 if v is resolved to true and impl2 otherwise.
match_impl v cases ~default chooses the implementation amongst cases by matching the v's value. default is chosen if no value matches.
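As a rough sketch of how if_impl might be used in a config.ml, the fragment below picks a key/value store depending on the target. It assumes that Key.is_unix is a bool value that resolves to true on Unix targets; that name is not shown in this document, so check it against your Mirage version. "data" is a hypothetical directory.

```ocaml
open Mirage

(* Assumption: Key.is_unix is a [bool value] resolved at configure time.
   "data" is a hypothetical directory next to config.ml. *)
let data : kv_ro impl =
  if_impl Key.is_unix
    (direct_kv_ro "data") (* chosen when the value resolves to true *)
    (crunch "data") (* chosen otherwise *)
```

Both branches must have the same impl type, here kv_ro impl, so the rest of the configuration does not depend on which one is selected.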
For specifying opam package dependencies, the type package is used. It consists of the opam package name, the ocamlfind names, and optional lower and upper bounds. The version constraints are merged with other modules.
type package = Functoria.Package.t
The type for opam packages.
type scope = Functoria.Package.scope
Installation scope of a package.
val package :
  ?scope:scope ->
  ?build:bool ->
  ?sublibs:string list ->
  ?libs:string list ->
  ?min:string ->
  ?max:string ->
  ?pin:string ->
  ?pin_version:string ->
  string ->
  package
package ~scope ~build ~sublibs ~libs ~min ~max ~pin opam is a package. build indicates a build-time dependency only and defaults to false. The library name is by default the same as opam; you can specify ~sublibs to add additional sublibraries (e.g. ~sublibs:["mirage"] "foo" will result in the library names ["foo"; "foo.mirage"]). If the library name is disjoint (or empty), use ~libs. Specifying both ~libs and ~sublibs leads to an invalid argument. Version constraints are given as min (inclusive) and max (exclusive). If pin is provided, a pin-depends is generated; pin_version is "dev" by default. ~scope specifies the installation location of the package.
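A minimal config.ml sketch of the package constructor, using only the arguments listed in the signature above. The opam package names "foo" and "bar" and the version bounds are hypothetical and chosen purely for illustration.

```ocaml
open Mirage

(* "foo" is a hypothetical opam package: accept versions >= 1.0.0 and
   < 2.0.0, and link the libraries "foo" and "foo.mirage". *)
let foo = package ~min:"1.0.0" ~max:"2.0.0" ~sublibs:[ "mirage" ] "foo"

(* "bar" is a hypothetical build-time-only dependency. *)
let bar = package ~build:true "bar"

(* Attach both packages to the unikernel's main functor. *)
let main = main ~packages:[ foo; bar ] "Unikernel.Make" (kv_ro @-> job)
```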
Values of type impl are tied to concrete module implementations with the device and main constructs. Module implementations of type job can then be registered into an application builder. The builder is in charge of parsing the command-line arguments and of generating code for the final application. See Functoria.Lib for details.
type info = Functoria.Info.t
The type for build information.
val main :
  ?pos:(string * int * int * int) ->
  ?packages:package list ->
  ?packages_v:package list value ->
  ?local_libs:string list ->
  ?runtime_args:Functoria.Runtime_arg.t list ->
  ?deps:abstract_impl list ->
  string ->
  'a typ ->
  'a impl
main name typ is the functor name, having the module type typ. The connect code will call <name>.start.
If packages or packages_v is set, then the given packages are installed before compiling the current application.
type 'a code = 'a Functoria.Device.code
val code :
  pos:(string * int * int * int) ->
  ('a, Stdlib.Format.formatter, unit, 'b code) Stdlib.format4 ->
  'a
type 'a device = ('a, abstract_impl) Functoria.Device.t
val impl :
  ?packages:package list ->
  ?packages_v:package list Functoria.Key.value ->
  ?local_libs:string list ->
  ?install:(Functoria.Info.t -> Functoria.Install.t) ->
  ?install_v:(Functoria.Info.t -> Functoria.Install.t Functoria.Key.value) ->
  ?keys:Functoria.Key.t list ->
  ?runtime_args:Functoria.Runtime_arg.t list ->
  ?extra_deps:abstract_impl list ->
  ?connect:(info -> string -> string list -> 'a code) ->
  ?dune:(info -> Functoria.Dune.stanza list) ->
  ?configure:(info -> unit Functoria.Action.t) ->
  ?files:(info -> Fpath.t list) ->
  string ->
  'a typ ->
  'a impl
impl ~packages ~packages_v ~install ~install_v ~keys ~runtime_args ~extra_deps ~connect ~dune ~configure ~files module_name module_type is an implementation of the device constructed by the arguments:
- packages and packages_v are the dependencies (where packages_v is inside Key.value);
- install and install_v are the install instructions (used in the generated opam file);
- keys are the configuration-time keys;
- runtime_args are the arguments at runtime;
- extra_deps is a list of extra dependencies (other implementations);
- connect is the code emitted for initializing the device;
- dune are dune stanzas added to the build rule;
- configure are commands executed at the configuration phase;
- files are files to be added to the list of generated files;
- module_name is the name of the device module;
- module_type is the type of the module.
module Key : module type of struct include Devices.Key end
Configuration keys.
module Runtime_arg : module type of struct include Devices.Runtime_arg end
Configuration keys.
For the Qubes target, the Qubes database from which to look up dynamic runtime configuration information.
A default qubes database, guessed from the usual valid configurations.
A ptime mock implementation where you can manually set the clock via Mirage_ptime_set.
A mtime mock implementation where you can manually set the clock via Mirage_mtime_set.
default_reporter ?level () is the log reporter that prints log messages to the console, with a timestamp as prefix. level is the default log threshold. It is Some Logs.Info if not specified.
Default PRNG device to be used in unikernels. It uses getrandom/getentropy on Unix, and a Fortuna PRNG on other targets.
Use the given XenStore ID (ex: /dev/xvdi1 or 51760) as a raw block device.
Crunch a directory. The contents of the directory are transformed into OCaml code, which is then compiled as part of the unikernel.
Direct access to the underlying filesystem as a key/value store for Unix. For other backends, this is equivalent to crunch.
Generic key/value that will choose dynamically between direct_kv_ro and crunch. To use a filesystem implementation, try kv_ro_of_fs.
If no key is provided, a new Key.kv_ro is created with the group argument.
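A small config.ml sketch of wiring a generic key/value store into a unikernel. "htdocs" is a hypothetical directory next to config.ml, and "Unikernel.Make" is assumed to be a functor taking one kv_ro argument.

```ocaml
open Mirage

(* "htdocs" is a hypothetical directory; no ~key is given, so a new
   Key.kv_ro is created to choose between direct_kv_ro and crunch. *)
let data = generic_kv_ro "htdocs"

(* The unikernel is a functor over a read-only key/value store. *)
let main = main "Unikernel.Make" (kv_ro @-> job)

let () = register "static" [ main $ data ]
```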
val docteur :
  ?mode:[ `Fast | `Light ] ->
  ?name:string key ->
  ?output:string key ->
  ?analyze:bool runtime_arg ->
  ?branch:string ->
  ?extra_deps:string list ->
  string ->
  kv_ro impl
docteur ?mode ?name ?output ?analyze remote is a read-only, key-value store device. Data is stored on that device using the Git PACK file format, version 2. This format has very good compression factors for many similar files of relatively small size. For instance, 14Gb of HTML files can be compressed into a disk image of 240Mb.
Unlike crunch, docteur produces an external image which means that less memory is used to keep and get files. The image can be produced from many sources:
- a Git repository (file://path/to/the/git/repository/)
- a simple directory (file://path/to/a/simple/directory/)
- a remote repository (as git clone expects)
If you use a Git repository, you can choose a specific branch with the ?branch argument (like refs/heads/main). Otherwise, this argument is ignored.
If you use a simple directory, it can be relative to your unikernel project (relativize://directory) or an absolute path (file://home/user/directory).
If a required file is produced by a dune rule, you must declare it via the extra_deps argument.
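The two kinds of local sources can be sketched in a config.ml as follows. The paths are the placeholder paths used in the text above, and the value names from_git and from_dir are hypothetical.

```ocaml
open Mirage

(* Image built from one branch of a Git repository. *)
let from_git : kv_ro impl =
  docteur ~mode:`Fast ~branch:"refs/heads/main"
    "file://path/to/the/git/repository/"

(* Image built from a directory relative to the unikernel project;
   a ?branch argument would be ignored for this kind of source. *)
let from_dir : kv_ro impl = docteur ~mode:`Light "relativize://directory"
```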
For a Solo5 target, users must attach the image as a block device:
$ solo5-hvt --block:<name>=<path-to-the-image> -- unikernel.{hvt,...}
The user can specify the name of the block device (defaults to "docteur"). The user can also specify the output of docteur.make, the tool which generates the image (defaults to "disk.img").
For the Unix target, the program opens the image at the beginning of the process. An integrity check of the image can be done via the analyze value (defaults to true).
The file-system can be used in two modes:
- `Light: any access requires that we reconstruct the path to the requested file. That means that we will need to extract a few additional objects before the extraction of the requested one. `Light does not cache anything in memory but it can be slower if the requested file is deep in the directory structure.
- `Fast: reconstructs and caches the layout of the directory structure when the unikernel starts, which might increase boot time and memory requirements. However, `Fast allows the device to decode only the requested object, so it is faster than the `Light mode.
Direct access to the underlying filesystem as a key/value store. Only available on Unix backends.
val chamelon : program_block_size:int runtime_arg -> block impl -> kv_rw impl
chamelon ~program_block_size returns a kv_rw filesystem which is an implementation of littlefs in OCaml. The chamelon device expects a block device.
unikernel.ml:
open Cmdliner
let program_block_size =
  Arg.(value & opt int 16 & info [ "program-block-size" ])
config.ml:
let db =
  let program_block_size =
    Runtime_arg.create ~pos:__POS__ "Unikernel.program_block_size"
  in
  let block = block_of_file "db" in
  chamelon ~program_block_size block
For Solo5 targets, you can finally launch the unikernel with:
$ solo5-hvt --block:db=db.img unikernel.hvt
The block device must be well-formed and formatted by the chamelon tool:
$ dd if=/dev/zero of=db.img bs=1M count=1
$ chamelon format db.img 512
tar_kv_rw block is a read/write tar archive. Note that the filesystem is append-only. That is, files can generally not be removed, set_partial only works on what is allocated, and there are restrictions on rename.
val ccm_block :
  ?nonce_len:int ->
  string option runtime_arg ->
  block impl ->
  block impl
ccm_block key block returns a new block which is an AES-CCM encrypted disk.
Note also that the available size of an encrypted block is always half of its real size: a 512M block will only be able to contain 256M of data if it is encrypted.
You can use a fresh block device as encrypted storage. This does not need any preparation: just use ccm_block with the desired key. If you have an existing disk image that you want to encrypt, you can use the ccmblock tool provided by the mirage-block-ccm opam package.
$ ccmblock enc -i db.img -k 0x10786d3a9c920d0b3ec80dfaaac557a7 -o edb.img
Accept the key as a runtime argument, in unikernel.ml:
open Cmdliner
let aes_ccm_key =
  let doc = "The key of the block device (hex formatted)" in
  Arg.(required & opt (some string) None & info ~doc [ "aes-ccm-key" ])
Then, in your config.ml, you just need to compose your block device with ccm_block:
let encrypted_block =
  let aes_ccm_key =
    Runtime_arg.create ~pos:__POS__ "Unikernel.aes_ccm_key"
  in
  let block = block_of_file "edb" in
  ccm_block aes_ccm_key block
Finally, with Solo5, you can launch your unikernel with:
$ solo5-hvt --block:edb=edb.img \
  --arg="--aes-ccm-key=0x10786d3a9c920d0b3ec80dfaaac557a7" \
  unikernel.hvt
You can finally compose a file-system such as chamelon with this block device (and you have an encrypted file-system!):
let fs = chamelon ~program_block_size encrypted_block
default_network is a dynamic network implementation which attempts to do something reasonable based on the target.
A custom network interface. Exposes a Runtime_arg.interface key.
Implementations of the Tcpip.Ip.S signature.
Configure the interface via DHCP.
Use an IPv4 address. Exposes the keys Runtime_arg.V4.network and Runtime_arg.V4.gateway.
Use a given initialized QubesDB to look up and configure the appropriate IPv4 interface.
Use an IPv6 address. Exposes the keys Runtime_arg.V6.network, Runtime_arg.V6.gateway.
val direct_stackv4v6 :
  ?group:string ->
  ?tcp:tcpv4v6 impl ->
  network impl ->
  ethernet impl ->
  arpv4 impl ->
  ipv4 impl ->
  ipv6 impl ->
  stackv4v6 impl
Direct network stack with given ip.
val generic_stackv4v6 :
  ?group:string ->
  ?dhcp_key:bool value ->
  ?net_key:[ `OCaml | `Host ] option value ->
  ?ipv4_network:Ipaddr.V4.Prefix.t ->
  ?ipv4_gateway:Ipaddr.V4.t ->
  ?ipv6_network:Ipaddr.V6.Prefix.t ->
  ?ipv6_gateway:Ipaddr.V6.t ->
  ?tcp:tcpv4v6 impl ->
  network impl ->
  stackv4v6 impl
Generic stack using the net key Key.net.
- if net = host, then the Unix sockets API is used;
- for qubes, a special IPv4 stack using the QubesDB is used;
- if dhcp is true, a DHCP client is used for the IPv4 address.
If a key is not provided, it uses Key.net (with the group argument) to create it.
tcpv4v6_of_stackv4v6 stackv4v6 is a helper that extracts the TCP/IP stack, without the UDP/IP stack, as expected by some devices such as protocols.
Happy-eyeballs is an implementation of RFC 8305 which specifies how to connect to a remote host using either IP protocol version 4 or IP protocol version 6 from a stackv4v6 network implementation.
The given device is able to resolve a remote host via a dns_client device and both must share the same stackv4v6 implementation.
val happy_eyeballs : happy_eyeballs typ
val generic_happy_eyeballs :
  ?group:string ->
  ?aaaa_timeout:int64 ->
  ?connect_delay:int64 ->
  ?connect_timeout:int64 ->
  ?resolve_timeout:int64 ->
  ?resolve_retries:int ->
  ?timer_interval:int64 ->
  stackv4v6 impl ->
  happy_eyeballs impl
generic_happy_eyeballs stackv4v6 creates a new happy-eyeballs value which is able to connect to a remote host and finally allocate a connected flow from the given network implementation stackv4v6. However, if you want to resolve (DNS resolution) and connect to a remote host, you must complete your unikernel with a generic_dns_client, which upgrades the happy-eyeballs stack with a DNS resolution stack.
This device has several optional arguments for timeouts, specified in nanoseconds.
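Since the timeout arguments are int64 values in nanoseconds, it can help to compute them from more readable units. The helpers ns_of_ms and ns_of_sec below are hypothetical (they are not part of Mirage), and the durations are illustrative values, not the library defaults.

```ocaml
(* Hypothetical helpers, not part of Mirage: convert milliseconds or
   seconds to the int64 nanosecond values the optional arguments expect. *)
let ns_of_ms (ms : int) : int64 = Int64.mul (Int64.of_int ms) 1_000_000L
let ns_of_sec (s : int) : int64 = Int64.mul (Int64.of_int s) 1_000_000_000L

(* Illustrative durations only; these are not the library defaults. *)
let connect_timeout = ns_of_sec 10
let resolve_timeout = ns_of_sec 1
let connect_delay = ns_of_ms 50

let () =
  Printf.printf "connect_timeout = %Ld ns\n" connect_timeout;
  Printf.printf "resolve_timeout = %Ld ns\n" resolve_timeout;
  Printf.printf "connect_delay   = %Ld ns\n" connect_delay

(* In a config.ml these values would then be passed as, for example:
   generic_happy_eyeballs ~connect_timeout ~resolve_timeout ~connect_delay
     stackv4v6 *)
```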
A DNS client is a module which implements:
- getaddrinfo to request a query_type-dependent response from a nameserver regarding a domain-name, such as the MX record;
- gethostbyname to request the A record regarding a domain-name;
- gethostbyname6 to request the AAAA record regarding a domain-name.
val dns_client : dns_client typ
val generic_dns_client :
  ?group:string ->
  ?timeout:int64 ->
  ?nameservers:string list ->
  ?cache_size:int ->
  stackv4v6 impl ->
  happy_eyeballs impl ->
  dns_client impl
generic_dns_client stackv4v6 happy_eyeballs creates a new DNS value which is able to resolve domain-names from nameservers. It requires a network and happy-eyeballs stack to communicate with these nameservers.
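A possible config.ml sketch of a DNS client with explicit nameservers. The address 192.0.2.53 comes from a range reserved for documentation (RFC 5737) and stands in for a real resolver; the strings use the udp: and tcp: forms of the nameserver format.

```ocaml
open Mirage

let stackv4v6 = generic_stackv4v6 default_network
let he = generic_happy_eyeballs stackv4v6

(* 192.0.2.53 is a documentation-only placeholder address: replace it
   with the address of a resolver you actually want to use. *)
let dns =
  generic_dns_client
    ~nameservers:[ "tcp:192.0.2.53"; "udp:192.0.2.53:53" ]
    stackv4v6 he
```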
The nameservers argument is a list of strings. The format of them is:
- udp:ipaddr(:port)? if you want to communicate with a DNS resolver via UDP
- tcp:ipaddr(:port)? if you want to communicate with a DNS resolver via TCP/IP
- tls:ipaddr(:port)?(!<authenticator>) if you want to communicate with a DNS resolver via TLS. You are able to introduce an <authenticator> (please follow the documentation about X509.Authenticator.of_string to get an explanation of its format). Otherwise, by default, we use trust anchors from NSS' certdata.txt.
Syslog exfiltrates log messages (generated by libraries using the logs library) via a network connection. The log level of the log sources is controlled via the Mirage_runtime.logs key. The functionality is provided by the logs-syslog package.
Emit log messages via TLS, using the credentials (private key, certificate, trust anchor) provided in the KV_RO.
Monitoring
Report metrics to a remote Influx host, and allow adjustments to log sources and levels. The provided stack should not be publicly reachable.
Some implementations need to communicate with external resources (such as a webserver or a Git server). For these, we must hide the underlying implementations that depend on the target (such as the network stack) and that these implementations require.
The aim of mimic is to offer first of all the ability to initiate a TCP/IP connection independently of the chosen target (see mimic_happy_eyeballs).
The resulting device can then be composed with other protocols like TLS, Git or HTTP and it is through this resulting device that other devices can initiate an internet connection to a peer (like a webserver or a Git server).
val mimic_happy_eyeballs :
  stackv4v6 impl ->
  happy_eyeballs impl ->
  dns_client impl ->
  mimic impl
mimic_happy_eyeballs stackv4v6 happy_eyeballs dns_client creates a device which initiates a global happy-eyeballs loop. This way, an underlying instance works to initiate a TCP/IP connection from an IP address or a domain-name.
For the domain-name resolution, we ask the happy-eyeballs instance to resolve the given domain-name via its DNS client.
The resulting device can be used and reused by any client that needs to initiate a connection (like alpn_client or git_tcp).
val http_client : http_client typ
cohttp_server starts a Cohttp server.
val http_server : http_server typ
val paf_server : port:int runtime_arg -> tcpv4v6 impl -> http_server impl
paf_server ~port tcpv4v6 creates an instance which will start to listen on the given port. With this instance and the produced module HTTP_server, the user can initiate:
- (http/1.1 & h2) server with TLS
This is a simple example of how to launch an HTTP server:
unikernel.ml
open Cmdliner
let port =
  let doc = "Port of the HTTP service." in
  Arg.(value & opt int 8080 & info ~doc [ "p"; "port" ])
module Make (HTTP_server : Paf_mirage.S with type ipaddr = Ipaddr.t) =
struct
  (* Errors are ignored in this example. *)
  let error_handler (_ipaddr, _port) ?request:_ _error _send = ()
  (* Answer every request with a fixed plain-text body. *)
  let request_handler :
      HTTP_server.TCP.flow -> Ipaddr.t * int -> Httpaf.Reqd.t -> unit =
   fun _socket (_ipaddr, _port) reqd ->
    let contents = "Hello World!\n" in
    let headers =
      Httpaf.Headers.of_list
        [
          ("content-length", string_of_int (String.length contents));
          ("content-type", "text/plain");
          ("connection", "close");
        ]
    in
    let response = Httpaf.Response.create ~headers `OK in
    Httpaf.Reqd.respond_with_string reqd response contents
  let start http_server port =
    let service =
      HTTP_server.http_service ~error_handler request_handler
    in
    let (`Initialized thread) = HTTP_server.serve service http_server in
    thread
end
config.ml
open Mirage
let port = Runtime_arg.create ~pos:__POS__ "Unikernel.port"
let main = main "Unikernel.Make" (http_server @-> job)
let stackv4v6 = generic_stackv4v6 default_network
let http_server = paf_server ~port (tcpv4v6_of_stackv4v6 stackv4v6)
let () =
  register "main"
    ~runtime_args:[ Runtime_arg.v port ]
    [ main $ http_server ]
val alpn_client : alpn_client typ
paf_client tcpv4v6 dns creates an ALPN device which can do HTTP (http/1.1 & h2) requests as an HTTP client. The allocated device represents the values required to initiate a connection to HTTP webservers. The user can then use Http_mirage_client.request to communicate with HTTP webservers. This is an example of how to use the ALPN devices:
unikernel.ml
open Lwt.Infix
module Make (HTTP_client : Http_mirage_client.S) = struct
  let start http =
    (* Accumulate the response body into a buffer. *)
    Http_mirage_client.request http "https://google.com"
      (fun _response buf str -> Buffer.add_string buf str ; Lwt.return buf)
      (Buffer.create 0x100) >>= function
    | Ok (response, buf) ->
      let body = Buffer.contents buf in
      ...
    | Error _ -> ...
end
config.ml
open Mirage
let main = main "Unikernel.Make" (alpn_client @-> job)
let stackv4v6 = generic_stackv4v6 default_network
let he = generic_happy_eyeballs stackv4v6
let dns = generic_dns_client stackv4v6 he
let alpn_client =
  let mimic = mimic_happy_eyeballs stackv4v6 he dns in
  paf_client (tcpv4v6_of_stackv4v6 stackv4v6) mimic
let () = register "main" [ main $ alpn_client ]
type argv = Functoria.argv
default_argv is a dynamic argv implementation which attempts to do something reasonable based on the target.
Users can connect to a remote Git repository in many ways:
The devices defined below define these in composable ways. The git_client impl returned from them can be passed to Git or Irmin in order to be able to fetch and push from/into a Git repository.
The user is able to restrict or enlarge protocol possibilities needed for its application. For instance, the user is able to restrict only the SSH connection to communicate with a Git repository or the user can handle TCP/IP and SSH as possible protocols to communicate with a peer.
For instance, a device which is able to communicate via TCP/IP and SSH can be implemented like:
let he = generic_happy_eyeballs stackv4v6
let dns = generic_dns_client stackv4v6 he
let git_client =
  let mimic = mimic_happy_eyeballs stackv4v6 he dns in
  let ssh =
    git_ssh ~key ~password (tcpv4v6_of_stackv4v6 stackv4v6) mimic
  in
  let tcp = git_tcp (tcpv4v6_of_stackv4v6 stackv4v6) mimic in
  merge_git_clients ssh tcp
val git_client : git_client typ
val merge_git_clients : git_client impl -> git_client impl -> git_client impl
merge_git_clients a b is a device that can connect to remote Git repositories using either the device a or the device b.
git_tcp tcpv4v6 dns is a device able to connect to a remote Git repository using TCP/IP.
val git_ssh :
  ?group:string ->
  ?authenticator:string ->
  ?key:string ->
  ?password:string ->
  tcpv4v6 impl ->
  mimic impl ->
  git_client impl
git_ssh ?group ?authenticator ?key ?password tcpv4v6 dns is a device able to connect to a remote Git repository using an SSH connection with the given private key or password. The identity of the remote Git repository can be verified using authenticator.
The format of the private key is: <type>:<seed or b64 encoded>. <type> can be rsa or ed25519. If the type is RSA, we expect the seed of the private key. Otherwise (if the type is Ed25519), we expect the b64-encoded private key.
The format of the authenticator is SHA256:<b64-encoded-public-key>, the output of:
$ ssh-keygen -lf <(ssh-keyscan -t rsa|ed25519 remote 2>/dev/null)
val git_http :
  ?group:string ->
  ?authenticator:string ->
  ?headers:(string * string) list ->
  tcpv4v6 impl ->
  mimic impl ->
  git_client impl
git_http ?group ?authenticator ?headers tcpv4v6 dns is a device able to connect to a remote Git repository via an HTTP(S) connection, using the provided HTTP headers. The identity of the remote Git repository can be verified using authenticator.
The format of the authenticator is:
- none: no authentication
val register :
  ?argv:argv impl ->
  ?reporter:reporter impl ->
  ?sleep:sleep impl ->
  ?ptime:ptime impl ->
  ?mtime:mtime impl ->
  ?random:random impl ->
  ?src:[ `Auto | `None | `Some of string ] ->
  string ->
  job impl list ->
  unit
register ~argv ~reporter ~src name jobs registers the application named by name which will execute the given jobs.
module Type = Functoria.Type
module Impl = Functoria.Impl
module Info = Functoria.Info
module Dune = Functoria.Dune
module Action = Functoria.Action
module Context = Functoria.Context
module Project : sig ... end
module Tool : sig ... end