OCamlverse

Documenting everything about OCaml

Concurrency, Parallelism, and Distributed Systems

Concurrency means structuring a program as multiple computations whose execution can interleave, so that they appear to make progress more or less simultaneously. Parallelism means actually executing computations at the same time on multiple cores, typically through OS-level threads or processes. The former is relatively safe and easy to reason about, whereas the latter is considerably harder and a common source of subtle bugs. OCaml currently supports concurrency elegantly, but parallelism support is not built into the runtime.

Concurrency

  • lwt: a monadic concurrency library. Concurrent code uses monads to express the higher-level abstractions of control flow.
  • Async: another monadic concurrency library developed by Jane Street. This library is covered in Real World OCaml. While the concept is very similar to lwt, small discrepancies make compatibility between the libraries difficult.
  • RWO-lwt: Real World OCaml code examples translated from Async to lwt.
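
To make the idea of expressing control flow through a monad more concrete, here is a minimal toy sketch that uses only the standard library. It is not how lwt or Async are implemented, and every name in it (`task`, `bind`, `yield`, `run_all`, `worker`) is hypothetical. It only illustrates how `bind` can chain steps of a computation while a scheduler interleaves several such computations on a single thread.

```ocaml
(* Toy cooperative-concurrency monad. Illustrative only: this is NOT the
   lwt or Async API, and all names here are made up for the example. *)

(* A task is either finished, or paused with the rest of its work. *)
type 'a task =
  | Done of 'a
  | Yield of (unit -> 'a task)

let return x = Done x

(* [bind t f] runs [t], then feeds its result to [f], keeping pause points. *)
let rec bind t f =
  match t with
  | Done x -> f x
  | Yield k -> Yield (fun () -> bind (k ()) f)

let ( >>= ) = bind

(* Pause the current task so that other tasks get a turn. *)
let yield () = Yield (fun () -> Done ())

(* Round-robin scheduler: run each task up to its next pause point. *)
let run_all (tasks : unit task list) =
  let queue = Queue.create () in
  List.iter (fun t -> Queue.add t queue) tasks;
  while not (Queue.is_empty queue) do
    match Queue.take queue with
    | Done () -> ()
    | Yield k -> Queue.add (k ()) queue
  done

(* Shared log, most recent entry first. *)
let log : string list ref = ref []

(* A worker records [steps] entries, yielding after each one.
   It starts paused, so nothing runs until the scheduler picks it up. *)
let worker name steps =
  let rec loop i =
    if i > steps then return ()
    else begin
      log := Printf.sprintf "%s%d" name i :: !log;
      yield () >>= fun () -> loop (i + 1)
    end
  in
  Yield (fun () -> loop 1)

let () =
  run_all [ worker "a" 2; worker "b" 2 ];
  (* The two workers interleave: a1 b1 a2 b2 *)
  print_endline (String.concat " " (List.rev !log))
```

Real libraries add I/O integration, error handling, and cancellation on top of this basic shape, so the sketch should be read as an intuition aid rather than a model of either library.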

Articles

Process Management

  • The standard OCaml distribution ships the Unix module, which allows for low-level process management. Code built on it tends to be brittle across platforms, because the module is mostly (but not entirely) tailored towards Unix.
  • lwt has the Lwt_process module, which has cross-platform process manipulation functions.
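
As a rough illustration of the low-level style of the Unix module, the sketch below runs a shell command and reads the first line of its output. The helper name `first_line_of` is hypothetical, and the example assumes that the program is linked against the `unix` library and that a POSIX-like shell is available.

```ocaml
(* Requires linking against the unix library shipped with OCaml. *)

(* Run [cmd] through the shell and return the first line it prints,
   or None if it prints nothing or exits with a non-zero status.
   [first_line_of] is a made-up helper name for this example. *)
let first_line_of cmd =
  let ic = Unix.open_process_in cmd in
  let line = try Some (input_line ic) with End_of_file -> None in
  (* close_process_in waits for the child and reports how it ended. *)
  match Unix.close_process_in ic with
  | Unix.WEXITED 0 -> line
  | _ -> None

let () =
  match first_line_of "echo hello" with
  | Some s -> print_endline s
  | None -> prerr_endline "command failed"
```

The sketch is only suitable for commands with short output: a command that keeps writing after the channel is closed may be terminated by a signal, which this helper treats as a failure.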

Parallelism

As mentioned above, OCaml currently doesn’t natively support running OCaml code on multiple OS-level threads at the same time: a global runtime lock ensures that only one OCaml thread executes at any given moment.

  • The most promising and powerful way to use multicore is with the new multicore branch. This branch uses a parallel garbage collector, which means that OCaml will eventually be able to run on multiple cores in the same process. Note that this branch is not yet ready for real work, but it’s rapidly advancing. For more information, consult the Multicore Wiki.
  • By interfacing with external C code through the FFI, OCaml can hand off long-running computations to C threads that run at the same time as OCaml code. This is easier nowadays thanks to ctypes (see ffi).
  • Parmap: provides easy-to-use parallel map and fold functions. The library makes use of forking to create short-lived child processes, and memory mapping to feed the data back to the parent process.
  • parallel: Distributed computing with lwt support.
  • ForkWork: a simple library for forking child processes to perform work on multiple cores.
  • Functory: a distributed computing library which facilitates distributed execution of parallelizable computations in a seamless fashion.
  • Ocamlnet: an enhanced system platform library. It contains the netmulticore library to compute tasks on as many cores of the machine as needed. This is probably the most sophisticated implementation currently available, as it is capable of creating a shared memory region, and running a custom-made garbage collector on said region, thus solving the problem of sharing memory with tracing garbage collectors.
  • Nproc: a process pool implementation for OCaml.
  • Parany: another parallel computation library.
  • Sklml: a functional parallel skeleton compiler and programming system for OCaml programs.
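
Several of the libraries above achieve parallelism by forking child processes. The sketch below shows that general idea, using only the Unix and Marshal modules. It is not the implementation of Parmap or any other library listed here: it returns results through pipes with Marshal rather than memory mapping, forks one process per list element instead of chunking the work, and the names `spawn`, `collect` and `parallel_map` are hypothetical. It assumes a Unix-like system (`Unix.fork` is not available on native Windows) and OCaml 4.12 or later for `Unix._exit`.

```ocaml
(* Requires linking against the unix library shipped with OCaml.
   Illustrative sketch only; all function names are made up. *)

(* Fork a child that computes [f x] and sends the result back on a pipe.
   Returns the child's pid and the channel to read the result from. *)
let spawn f x =
  let rd, wr = Unix.pipe () in
  match Unix.fork () with
  | 0 ->
    (* Child process: compute, marshal the result, and leave immediately
       without running the parent's at_exit handlers. *)
    Unix.close rd;
    let oc = Unix.out_channel_of_descr wr in
    Marshal.to_channel oc (f x) [];
    close_out oc;
    Unix._exit 0
  | pid ->
    (* Parent process: keep only the read end of the pipe. *)
    Unix.close wr;
    (pid, Unix.in_channel_of_descr rd)

(* Read one child's result, then reap the child. *)
let collect (pid, ic) =
  let result = Marshal.from_channel ic in
  close_in ic;
  ignore (Unix.waitpid [] pid);
  result

(* Start every child first, then gather results in list order.
   The annotation ties the unmarshalled type to [f]'s result type. *)
let parallel_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  let children = List.map (spawn f) xs in
  List.map collect children

let () =
  parallel_map (fun x -> x * x) [ 1; 2; 3; 4 ]
  |> List.iter (fun y -> Printf.printf "%d\n" y)
```

Because results travel through Marshal, they must be plain data (no closures), and the type annotation on `parallel_map` is what keeps the unmarshalled values well typed. Error handling for failing children is omitted for brevity.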

Articles

Distributed Computing

  • Rpc.Parallel: a library for spawning processes on a cluster of machines, and passing typed messages between them.
  • MPI: Message Passing Interface bindings for OCaml.
  • ocaml-rpc: a light library for dealing with RPCs in OCaml.
  • distributed: a library for distributed computation in OCaml, similar to Erlang’s model and inspired by Cloud Haskell.