Creating optimization files (The Mercury User’s Guide)

5.2 Creating optimization files

Intermodule optimization

By default, when mmc compiles a module, say module_a, the only code it has access to is the code of module_a itself. The only source of information that mmc has about the modules imported by module_a are their .int files. Beyond the definitions of types, insts and modes, these contain the declarations of predicates and functions, but not their definitions.

However, the compiler’s usual optimizations could do a better job if they did have access to the definitions of those predicates and functions. This is why the Mercury compiler has a mechanism for providing that access. This mechanism, intermodule optimization, has two faces: recording extra information about the nominally-private parts of each module in a file, and making use of that information while compiling other modules.

Commands that do the first part look like this:

mmc --make-optimization-interface module_1.m module_2.m …

This command will build module_1.opt, module_2.opt, and generally the .opt file of each named module. These files contain information that is normally private to the module that the .opt file is for, but which may be useful for the optimization of other modules. Mostly, this includes the definitions (i.e. the code) of both public and private predicates and functions of the module, if those definitions match one or more from a list of criteria, which include (but are not limited to) the following.

Predicates and function definitions that are so simple that inlining calls to them (meaning replacing the call with an appropriately-renamed copy of the callee’s definition) is likely to result in a speedup.
Predicates and function definitions that contain switches on the values of arguments, meaning that after inlining calls to them at call sites that know the values of those arguments, the switch can be eliminated.
Predicates and function definitions that have higher-order arguments, meaning that after inlining calls to them at call sites that know the values of those higher order arguments, the higher-order calls in the inlined version can be replaced by first order calls.

Beside such code, .opt files also contain definitions and declarations needed to make sense of that code, such as the definitions of the types, insts and modes they involve, and the declarations of both the predicates and functions they define. and the predicates and functions they call.

After .opt files have been created, any invocation of mmc to compile say module_1 with the --intermodule-optimization option (or --intermod-opt for short), will read in, and use, the .opt files of the modules that module_1 imports.

Transitive intermodule optimization

In some cases, when compiling e.g. module_1, an optimization would like access to information that is derived not just from a module that module_1 imports, call it module_2, but also from modules that module_2 imports, and they import, and so on. The information that these optimizations need is not so much the code of e.g. predicates defined in modules that module_2 imports, but their effect on the properties of the predicates and functions of module_2 itself.

Consider a conjunction such as

... p(...), q(...), r(...), ...

where the definition of r traverses a data structure created by p. The Mercury compiler contains an optimization that can fuse two traversals into one. The optimization is called deforestation, because it can eliminate the intermediate data structure created by p and consumed by r, and in logic programming languages, data structures are terms, which can be viewed as trees.

Deforestation can fuse two traversals only if they are next to each other, and in this case, the two calls to be fused are not next to each other. The first step is therefore to replace

... p(...), q(...), r(...), ...

with

... p(...), r(...), q(...), ...

However, this is safe only in certain circumstances.

One situation in which it is unsafe occurs when q is semidet, meaning it can fail. and r can throw an exception. This is because in this case, the reordering above can replace code that simply fails with code that throws an exception. This is an no observable effect on the execution of the program, which optimizations are not allowed to make.

To perform the above reordering, the compiler needs to know that r can never throw an exception. (It would also need to be able to rule out other situations that could cause the reordering to have an observable effect, but in this example, we are focusing on just this one.) For this, it needs to know not just that the code of r (which must be available if we are considering fusing it with the code of p) contains no code to throw an exception, but also that the same is true for the predicates and functions it calls, and the predicates and functions they call, directly or indirectly. In effect, we need to know that no predicate or function in the call tree of r can throw an exception.

To make this possible in at least some cases, Mercury has a mechanism to make such information available: .trans_opt files. These files contain analysis results, with compiler options specifying the set of analyses whose results they contain. Note that mmc --make does not support .trans_opt files; only mmake does.

One of these analyses is exception analysis, which computes safe approximations to the set of exceptions that each predicate or function can possibly throw. These approximations do not refer to the identities of specific exceptions, but they are nevertheless useful, because if this approximation is the empty set, then we know for sure that the predicate or function cannot throw any exception. (An approximation can overestimate the set of actions that the predicate or function may perform, but it is safe only if it will never underestimate that set.)

Consider a call chain between functions where f calls g, g calls h, and h calls i. with f, g, h, i being defined in module_f, module_g, module_h and module_i respectively. Suppose none of these functions contain any calls other than the ones listed here, and none of these modules contain anything else. In that case,

to know whether i can throw exceptions, we need only the code of i;
to know whether h can throw exceptions, we need the code of h and the results of the analysis for i;
to know whether g can throw exceptions, we need the code of g and the results of the analysis for h;
to know whether f can throw exceptions, we need the code of f and the results of the analysis for g.

These dependencies transfer to the files involved:

We record the result of exception analysis for i in module_i.trans_opt. Creating module_i.trans_opt does not require reading any other .trans_opt files.
We record the result of exception analysis for h in module_h.trans_opt. Creating module_h.trans_opt requires reading module_i.trans_opt.
We record the result of exception analysis for g in module_g.trans_opt. Creating module_g.trans_opt requires reading module_h.trans_opt, which makes module_g.trans_opt depends on module_h.trans_opt directly, and on module_i.trans_opt indirectly.
We record the result of exception analysis for f in module_f.trans_opt. Creating module_f.trans_opt requires reading module_g.trans_opt, which makes module_f.trans_opt depends on module_g.trans_opt directly, and on module_h.trans_opt and module_i.trans_opt indirectly.

While we can build each .opt file independently of any other .opt file, this is not true for .trans_opt files. Not only is it the case that .trans_opt files can depend on other .trans_opt files, that in fact is the reason for their existence. In this case, if e.g. module_f.trans_opt records that f cannot throw any exception, it can do so only because module_i.trans_opt, module_h.trans_opt, and module_g.trans_opt all record that g, h and i respectively also have this property. And if code that can possibly throw exceptions is ever added to any one of module_i.m, module_h.m, and module_g.m, then both the .trans_opt file of the updated module, and all the .trans_opt files that depend on it, must all be rebuilt.

If the programmer adds a function named f2 to module_f.m and a function named h2 to module_h.m, with h2 calling f2, then the compiler can fully analyze h2 and put the results into module_h.trans_opt only if, when it is building module_h.trans_opt, it has access to the contents of module_f.trans_opt. However, we cannot make module_h.trans_opt depend on module_f.trans_opt. module_f.trans_opt already depends on module_h.trans_opt, because if we did that, we would make the dependency chain circular. This would mean that in order to bring e.g. module_h.trans_opt up to date, we would first have to bring module_f.trans_opt up to date, which means that we would first have to bring module_h.trans_opt up to date, which means that we would first have to bring module_f.trans_opt up to date, and so on forever. This is why software build systems, including mmake and mmc --make, require the absence of circular dependencies between files.

Mercury’s first step towards avoiding circular dependencies between .trans_opt files is to impose an order on the modules of the program when the mmc --generate-dependencies command is executed for it.

Normally mmc --generate-dependencies puts the parts of the order relevant to each module in that module’s .d file (and specifically into the trans_opt_deps rule in that file). However, you can ask mmc to put the order for the whole program (e.g. prog) into e.g. prog.module_order with a command such as

mmc --generate-dependencies --also-output-module-order prog.m.

After this command, prog.module_order will contain the strongly connected components (SCCs) of the module dependency graph of the program. The SCCs will be listed in a top-down order, with different SCCs separated by blank lines, and with the different modules in each SCC being listed one per line with no blank lines between them. The order of the modules of the program is exactly the order in which appear in this file.

To actually construct the .trans_opt files for e.g. module_1.m and module_2.m, use a command such as

mmc --make-transitive-optimization-interface --analyse-exceptions module_1.m module_2.m

Beside --analyse-exceptions, the following options also record their results in .trans_opt files.

--enable-termination: This option turns on Mercury’s first termination analyser, which uses linear inequalities.
--enable-termination2: This option turns on Mercury’s second termination analyser, which uses convex constraints.
--analyse-trail-usage: This option turns on an analysis that identifies which predicates and functions definitely do not touch the trail, enabling the compiler to reduce the overhead of trailing. (This analysis is applicable only in trailing grades.)
--analyse-mm-tabling: This option turns on an analysis that identifies which predicates and functions have call trees that do not involve minimal model tabled procedures at all, enabling the compiler to reduce the overhead of minimal model tabling. (This analysis is applicable only in minimal model grades.)

The results of both termination analysis systems can be used by the user to spot potential performance problems. (That performance problem may be non-termination, but the analyser’s inability to prove termination may also indicate less drastic performance problems.) The compiler can also perform more optimizations on calls for which it knows that the called predicate and function always terminates.

After .trans_opt files containing analysis results have been created, any invocation of mmc to compile say module_3 with the --transitive-intermodule-optimization option (or --trans-intermod-opt for short), will read in, and use, a subset of the .trans_opt files of the modules that module_3 imports; specifically, the subset that come after module_3 in the module order.