Notes On The Design Of The Mercury Implementation

This file describes the overall design of the Mercury implementation. In particular, it lists the major subsystems and explains how we manage the dependencies between them.

See also compiler_design.html for information on the design of the compiler subsystem.

Subsystems and subdirectories

The Mercury implementation is divided into several major subsystems. Each major subsystem is put in a different subdirectory.

In general, each subsystem is written in a single language, because we prefer not to mix languages within a single directory. If what is conceptually a single subsystem is implemented in two languages (as is the case for the Mercury debugger), we generally split it into a separate subdirectory for each language.

The subdirectories containing major subsystems are as follows:

boehm_gc: the Boehm (et al.) conservative garbage collector (written in C)
runtime: the Mercury runtime system (written in C)
library: the Mercury standard library (written in Mercury)
compiler: the Mercury compiler (written in Mercury)
trace: the part of the Mercury debugger that is written in C
browser: the part of the Mercury debugger that is written in Mercury
mdbcomp: the library that defines the Mercury data structures generated by the compiler for the debugger
ssdb: support code for the source-to-source debugger (written in Mercury)
slice: tools for manipulating slices and dices (written in Mercury)
profiler: the Mercury profiler (written in Mercury)
deep_profiler: the Mercury deep profiler (written in Mercury)
extras: additional Mercury libraries.

In addition, there are some extra subdirectories for scripts and utility programs:

scripts: mostly shell scripts, written in either Bourne shell or bash; a few are written in other scripting languages such as Perl.
util: utility programs (written in C)

These extra subdirectories provide the infrastructure and "glue code" that connects the major subsystems. There is also some additional infrastructure (the autoconf configuration files and the primary Makefile and Mmakefile) in the top-level directory.

As well as the subdirectories containing the major subsystems and the glue code, there are also some subdirectories that just provide documentation:

doc: documentation for users (mostly written in Texinfo)
samples: example programs

Finally, there are some directories whose contents are intended for the developers of the Mercury implementation, rather than being part of the implementation itself:

tools: scripts that are useful in the development of the Mercury implementation, but which are not actually part of the end product.
compiler/notes: documentation for developers of the Mercury implementation.
tests: a big suite of test cases.

Programs, shell scripts, and file names

Often executable programs in the Mercury implementation will need to access files provided by the Mercury implementation. However, we want to avoid hard-coding path names in the executables, and Unix does not provide any reasonable way for a program to determine what directory the executable file is in.

To solve this problem, executable programs which need to know path names are never invoked directly. Instead, we always write a small shell script that acts as a front-end for the executable program (e.g. scripts/mmc is the front-end for compiler/mercury_compile). The hard-coded path names get put in the shell script, which passes them on to the program as parameters or environment variables. The shell script is itself automatically generated from a template (e.g. scripts/mmc.in) that contains symbolic names of the form @foo@; the top-level `configure' script fills in the values for these based on the user-specified parameters to the configure script. The configure script is itself generated by `autoconf' from `configure.ac'.
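The program's side of this arrangement can be sketched in C as follows. This is only an illustration, not code from the Mercury sources: the option name `--example-lib-dir', the environment variable EXAMPLE_LIB_DIR and the function find_lib_dir are all hypothetical. The point is that the program contains no path names of its own, and simply accepts whatever the front-end script hands it, either as a parameter or through the environment.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, chosen for illustration only. */
#define EXAMPLE_DIR_OPTION "--example-lib-dir"
#define EXAMPLE_DIR_ENV    "EXAMPLE_LIB_DIR"

/*
** Return the directory that the front-end script passed to the program,
** or NULL if it passed none. A command-line parameter takes precedence
** over the environment variable. No path name is compiled into the
** program itself.
*/
const char *
find_lib_dir(int argc, char **argv)
{
    int         i;
    const char  *from_env;

    /* Look for "--example-lib-dir <path>" among the parameters. */
    for (i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], EXAMPLE_DIR_OPTION) == 0) {
            return argv[i + 1];
        }
    }

    /* Otherwise fall back to the environment variable, if it is set. */
    from_env = getenv(EXAMPLE_DIR_ENV);
    if (from_env != NULL && from_env[0] != '\0') {
        return from_env;
    }

    return NULL;
}
```

A main function would call find_lib_dir(argc, argv) once at startup and report an error if it returns NULL, since that would suggest the program was invoked directly rather than through its front-end script.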

Libraries and dependencies

Most major subsystems (not counting `extras') are each compiled to a single library or executable program, though a few, such as deep_profiler, are compiled to several executables. None is compiled to more than one library.

On most systems, mutual recursion between libraries is not very well supported. On Unix, static linking requires such libraries to be listed more than once on the command line. On Windows, allowing mutual recursion between different DLLs requires some fairly major contortions.

To avoid such difficulties, and for the sake of portability to future systems which may impose similar requirements, it is a design principle of the Mercury implementation that there should be no mutual recursion between libraries.

The Mercury linker links the different components that make up a program in the following order:

the object of the auto-generated init file (generated by util/mkinit.c)
the main program object files (e.g. compiler/*.o or profiler/*.o)
trace library (trace/libmer_trace.a)
ssdb library (ssdb/libmer_ssdb.a)
browser library (browser/libmer_browser.a)
mdbcomp library (mdbcomp/libmer_mdbcomp.a)
standard library (library/libmer_std.a)
runtime library (runtime/libmer_rt.a)
Boehm collector (boehm_gc/libgc.a)

To avoid circularities, libraries cannot contain direct calls to any routines that are defined in libraries (or object files) that occur earlier in the above list. Any such calls must be made into indirect calls via function pointers. These function pointers can be initialized by the auto-generated init file, which, since it is at the start of the list, can refer to functions in any of the other components.
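The indirection described above can be sketched in C as follows. For brevity the three components are shown in one file, whereas in practice each part would live in a different library or object file. All of the names here (example_rt_event_hook, example_rt_report_event, example_trace_handle_event, example_init) are hypothetical and are not taken from the Mercury sources.

```c
#include <stddef.h>

/* ---- A library late in the link order (e.g. a runtime) ---- */

/*
** A hook for a routine that is defined in a component occurring
** earlier in the link order. The later library never names that
** routine directly, so it has no link-time dependency on it.
*/
int (*example_rt_event_hook)(int event) = NULL;

/* The later library reaches the earlier routine only via the hook. */
int
example_rt_report_event(int event)
{
    if (example_rt_event_hook == NULL) {
        return -1;      /* no handler has been installed */
    }
    return (*example_rt_event_hook)(event);
}

/* ---- A library earlier in the link order (e.g. a debugger) ---- */

int
example_trace_handle_event(int event)
{
    return event * 2;   /* stand-in for the real work */
}

/* ---- The init file, first in the link order ---- */

/*
** Since the init file comes first, it may refer to symbols in any
** other component, so it is the one place that can connect the two.
*/
void
example_init(void)
{
    example_rt_event_hook = example_trace_handle_event;
}
```

With this arrangement the only direct references run from earlier components to later ones: the init file refers to both libraries, and neither library refers to anything that precedes it in the link order.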