OCamlverse

Documenting everything about OCaml

Compiler

The OCaml compiler is a complicated piece of software. Below is an attempt to document information that could not fit easily into the codebase, including relevant papers. Feel free to break this up into pages as the need arises.

Articles

Interesting Branches of the Compiler

Commands

  • ocamlc -config: show all configuration parameters for the compiler. Very useful.
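The output of ocamlc -config is a list of "name: value" lines, which makes it easy to consume from scripts. The following is a minimal sketch of a parser for that format. The function names and the sample text are assumptions made for illustration; real values depend on your installation.

```ocaml
(* Parse one "key: value" line as printed by `ocamlc -config`.
   Only the first ':' separates key from value, so values that
   themselves contain ':' are kept whole. *)
let parse_config_line line =
  match String.index_opt line ':' with
  | None -> None
  | Some i ->
    let key = String.sub line 0 i in
    let value = String.sub line (i + 1) (String.length line - i - 1) in
    Some (String.trim key, String.trim value)

(* Parse the whole output into an association list, skipping lines
   that do not contain a ':' (such as the trailing empty line). *)
let parse_config text =
  String.split_on_char '\n' text |> List.filter_map parse_config_line

(* Illustrative sample only, not the output of a real installation. *)
let sample = "version: 4.14.1\nflambda: false\nos_type: Unix\n"

let () =
  match List.assoc_opt "flambda" (parse_config sample) with
  | Some v -> Printf.printf "flambda: %s\n" v
  | None -> print_endline "flambda key not found"
```

To use this on real data, feed it the captured output of the command instead of the sample string.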

Runtime

See also Runtime

Articles

Compiler Internals

  • hacking.adoc: a basic guide to the compiler’s internals.

Driver

The compiler driver, residing in the /driver directory, runs the entire compilation process from start to finish. Its two entry points are optmain.ml for the native compiler and main.ml for the bytecode compiler, each setting up a separate execution path through the code.

Both paths go through the pparse.ml file, which handles PPX rewriters. This file dumps the current parsed AST to a file, calls the given PPX executable on it, and reloads the resulting AST.
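The dump, rewrite, reload cycle can be sketched with a toy AST and the standard Marshal module. This is only an illustration under stated assumptions: the expr type, the "double" extension, and all function names are hypothetical, and the rewriter is an ordinary function standing in for the external executable that the real compiler launches.

```ocaml
(* Toy AST standing in for Parsetree; the real type is far larger. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Ext of string * expr  (* stands in for an extension node *)

(* Dump an AST to a file in binary (marshalled) form. *)
let dump_ast file (ast : expr) =
  let oc = open_out_bin file in
  Fun.protect ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc ast [])

(* Reload an AST previously written by [dump_ast]. *)
let load_ast file : expr =
  let ic = open_in_bin file in
  Fun.protect ~finally:(fun () -> close_in ic)
    (fun () -> (Marshal.from_channel ic : expr))

(* Hypothetical rewrite: expand the "double" extension into e + e. *)
let rec expand = function
  | Int n -> Int n
  | Add (a, b) -> Add (expand a, expand b)
  | Ext ("double", e) -> let e = expand e in Add (e, e)
  | Ext (name, e) -> Ext (name, expand e)

(* Stand-in for the external rewriter: read one file, write another. *)
let run_rewriter ~input ~output =
  dump_ast output (expand (load_ast input))

(* The driver side: dump, call the rewriter, reload, clean up. *)
let apply_rewriter ast =
  let input = Filename.temp_file "ast" ".in" in
  let output = Filename.temp_file "ast" ".out" in
  Fun.protect
    ~finally:(fun () -> Sys.remove input; Sys.remove output)
    (fun () ->
      dump_ast input ast;
      run_rewriter ~input ~output;
      load_ast output)
```

Passing the AST through files is what lets the rewriter live in a separate process, at the cost of serializing the tree twice per rewriter.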

The two compilation files are compile.ml for bytecode and optcompile.ml for native code. Both files pipe the different kinds of data through all the compilation stages. While native compilation has options for either clambda (naive) or flambda (optimized) compilation, bytecode compilation currently has only one mode, which is comparable to clambda compilation in that no flambda optimizations are applied.

Parser

The parser converts OCaml syntax to an abstract syntax tree (AST) representation (parsing/parsetree.mli).

ppx

PPX rewriters are separate executables that read a binary (serialized) AST, modify certain parts as needed, and write out a binary AST for the compiler to reload.

Typechecker

The typechecker transforms the plain AST to typechecked AST (typing/typedtree.mli).
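As a rough illustration of what "plain AST to typechecked AST" means, here is a toy typechecker for a tiny expression language. Everything in it (the types, constructor names, and error messages) is invented for this sketch; the real Parsetree and Typedtree are much richer, and the real typechecker performs inference rather than simple checking.

```ocaml
(* Untyped toy AST, standing in for Parsetree. *)
type expr =
  | Int of int
  | Bool of bool
  | Add of expr * expr
  | If of expr * expr * expr

type ty = TInt | TBool

(* Typed toy AST, standing in for Typedtree: every node carries its type. *)
type texpr = { desc : tdesc; ty : ty }
and tdesc =
  | Tconst_int of int
  | Tconst_bool of bool
  | Tadd of texpr * texpr
  | Tif of texpr * texpr * texpr

exception Type_error of string

(* Translate an untyped tree into a typed one, or reject the program. *)
let rec type_expr (e : expr) : texpr =
  match e with
  | Int n -> { desc = Tconst_int n; ty = TInt }
  | Bool b -> { desc = Tconst_bool b; ty = TBool }
  | Add (a, b) ->
    let a = type_expr a and b = type_expr b in
    if a.ty = TInt && b.ty = TInt then { desc = Tadd (a, b); ty = TInt }
    else raise (Type_error "operands of + must be int")
  | If (cond, then_, else_) ->
    let cond = type_expr cond in
    let then_ = type_expr then_ in
    let else_ = type_expr else_ in
    if cond.ty <> TBool then raise (Type_error "condition must be bool")
    else if then_.ty <> else_.ty then
      raise (Type_error "branches must have the same type")
    else { desc = Tif (cond, then_, else_); ty = then_.ty }
```

The output tree has the same shape as the input, but each node is annotated, which is the information later stages rely on.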

Lambda

After typechecking, if the program isn't rejected, types are mostly erased from the AST, except for information relevant to optimizations. The resulting AST (lambda/lambda.mli) is leaner and easier to manipulate than the typed AST.
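One visible consequence of type erasure is that, at runtime, values of different types can share the same untyped representation: constant constructors become plain integers and constructors with arguments become tagged blocks. The sketch below peeks at this with the Obj module. Obj is unsafe and is used here purely for illustration; the color and shape types and the describe function are made up for the example.

```ocaml
type color = Red | Green | Blue
type shape = Point | Circle of int | Rect of int * int

(* Describe the untyped runtime representation of a value. *)
let describe (v : Obj.t) =
  if Obj.is_int v then Printf.sprintf "immediate %d" (Obj.obj v : int)
  else Printf.sprintf "block tag=%d size=%d" (Obj.tag v) (Obj.size v)

let () =
  (* Constant constructors are numbered from 0 in declaration order. *)
  print_endline (describe (Obj.repr Green));
  (* Booleans and unit are immediates too. *)
  print_endline (describe (Obj.repr true));
  print_endline (describe (Obj.repr ()));
  (* Non-constant constructors are blocks, tagged from 0 separately. *)
  print_endline (describe (Obj.repr (Circle 2)));
  print_endline (describe (Obj.repr (Rect (3, 4))))
```

Here Green and true have the same representation, which is only safe because the typechecker has already ruled out any confusion between them.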

Pattern Matching

Pattern matching uses a fairly complex algorithm (lambda/matching.mli) to convert potentially complex patterns into a simpler, more efficient AST.
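To give a feel for the kind of simplification involved, here is a small match next to a hand-written equivalent that tests each scrutinee at most once. The second function is written by hand for illustration and is not the compiler's actual output; both function names are invented.

```ocaml
(* Source-level match: each clause is a row of patterns over (x, y). *)
let classify x y =
  match x, y with
  | true, _ -> 1
  | _, true -> 2
  | false, false -> 3

(* Hand-compiled decision tree: test x once, then y once if needed.
   No tuple is built and no clause is re-examined. *)
let classify_compiled x y =
  if x then 1
  else if y then 2
  else 3

let () =
  List.iter
    (fun (x, y) ->
      Printf.printf "%b %b -> %d %d\n" x y
        (classify x y) (classify_compiled x y))
    [ (true, true); (true, false); (false, true); (false, false) ]
```

With larger constructor types and nested patterns, choosing which test to perform first becomes the hard part, which is where the complexity of the real algorithm comes from.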

Flambda

Flambda is an optional, additional layer of optimization, residing in /middle_end.

Clambda

Clambda is an expansion of the Lambda AST. It also includes some more low-level concerns, such as explicit closures.
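"Explicit closures" means that a function capturing variables is split into closed code plus an environment holding the captured values. The following toy closure conversion shows the idea in plain OCaml. The closure record and all names are assumptions for this sketch and do not mirror Clambda's actual data types.

```ocaml
(* Source: the inner function captures [n] from its environment. *)
let make_adder n = fun x -> x + n

(* Toy closure-converted form: the code receives its environment
   explicitly, so it has no free variables of its own. *)
type closure = {
  code : int array -> int -> int;  (* closed function *)
  env : int array;                 (* captured values *)
}

(* The body of the inner function, reading [n] from slot 0. *)
let adder_code env x = x + env.(0)

(* Building the closure: pair the code with the captured [n]. *)
let make_adder_cc n = { code = adder_code; env = [| n |] }

(* Every call passes the closure's own environment to its code. *)
let apply clos arg = clos.code clos.env arg

let () =
  Printf.printf "%d %d\n" (make_adder 5 10) (apply (make_adder_cc 5) 10)
```

Once closures are explicit like this, the compiler can reason about where captured values live and when a call can bypass the closure entirely.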

cmm

cmm is an extremely low-level intermediate language, close to machine language (assembly), and is the level at which low-level optimization takes place. At this level, the original high-level OCaml code is hard to recognize.
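One reason the code becomes hard to recognize is that OCaml integers are stored as tagged machine words: the integer n is represented as 2n + 1, and arithmetic is expressed directly on that tagged form. The sketch below simulates this in OCaml itself. The function names are invented for illustration; on a real program the corresponding arithmetic can be inspected with ocamlopt -dcmm.

```ocaml
(* Tagged representation: shift left by one and set the low bit. *)
let tag n = (n lsl 1) lor 1

(* Recover the integer with an arithmetic shift right. *)
let untag v = v asr 1

(* (2a + 1) + (2b + 1) - 1 = 2(a + b) + 1, so addition needs a
   single correction instead of untagging and retagging. *)
let tagged_add a b = a + b - 1

(* (2a + 1) - (2b + 1) + 1 = 2(a - b) + 1 *)
let tagged_sub a b = a - b + 1

let () =
  let a = tag 3 and b = tag 4 in
  Printf.printf "tagged: %d %d, sum untagged: %d\n"
    a b (untag (tagged_add a b))
```

This is why a source-level x + y may appear at the cmm level as an addition followed by a subtraction of 1.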

Register Coloring

assembly

The actual machine code ultimately produced by the native compiler.