Next: , Previous: , Up: Compilation details   [Contents][Index]


5.3 Creating optimization files

By default, when mmc compiles a module, say module_a, the only code it has access to is the code of module_a itself. The only source of information that mmc has about the modules imported by module_a are their .int files. Beyond the definitions of types, insts and modes, these contain the declarations of predicates and functions, but not their definitions.

However, the compiler’s usual optimizations could do a better job if they did have access to the definitions of those predicates and functions. This is why the Mercury compiler has a mechanism for providing that access. This mechanism, intermodule optimization, has two faces: recording extra information about the nominally-private parts of each module in a file, and making use of that information while compiling other modules.

Commands that do the first part look like this:

mmc --make-optimization-interface module_1.m module_2.m …

This command will build module_1.opt, module_2.opt, and generally the .opt file of each named module. These files contain information that is normally private to the module that the .opt file is for, but which may be useful for the optimization of other modules. Mostly, this includes the definitions (i.e. the code) of both public and private predicates and functions of the module, if those definitions match one or more from a list of criteria, which include (but are not limited to) the following.

Beside such code, .opt files also contain definitions and declarations needed to make sense of that code, such as the definitions of the types, insts and modes they involve, and the declarations of both the predicates and functions they define. and the predicates and functions they call.

After .opt files have been created, any invocation of mmc to compile say module_1 with the --intermodule-optimization option (or --intermod-opt for short), will read in, and use, the .opt files of the modules that module_1 imports.

In some cases, when compiling e.g. module_1, an optimization would like access to information that is derived not just from a module that module_1 imports, call it module_2, but also from modules that module_2 imports, and they import, and so on. The information that these optimizations need is not so much the code of e.g. predicates defined in modules that module_2 imports, but their effect on the properties of the predicates and functions of module_2 itself.

Consider a conjunction such as

... p(...), q(...), r(...), ...

where the definition of r traverses a data structure created by p. The Mercury compiler contains an optimization that can fuse two traversals into one. The optimization is called deforestation, because it can eliminate the intermediate data structure created by p and consumed by r, and in logic programming languages, data structures are terms, which can be viewed as trees.

Deforestation can fuse two traversals only if they are next to each other, and in this case, the two calls to be fused are not next to each other. The first step is therefore to replace

... p(...), q(...), r(...), ...

with

... p(...), r(...), q(...), ...

However, this is safe only in certain circumstances.

One situation in which it is unsafe occurs when q is semidet, meaning it can fail. and r can throw an exception. This is because in this case, the reordering above can replace code that simply fails with code that throws an exception. This is an no observable effect on the execution of the program, which optimizations are not allowed to make.

To perform the above reordering, the compiler needs to know that r can never throw an exception. (It would also need to be able to rule out other situations that could cause the reordering to have an observable effect, but in this example, we are focusing on just this one.) For this, it needs to know not just that the code of r (which must be available if we are considering fusing it with the code of p) contains no code to throw an exception, but also that the same is true for the predicates and functions it calls, and the predicates and functions they call, directly or indirectly. In effect, we need to know that no predicate or function in the call tree of r can throw an exception.

ZZZ

To make this possible in at least some cases, Mercury has a mechanism to make such information available: .trans_opt files. These files contain analysis results, with compiler options specifying the set of analyses whose results they contain.

One of these analyses is exception analysis, which computes safe approximations to the set of exceptions that each predicate or function can possibly throw. These approximations do not refer to the identities of specific exceptions, but they are nevertheless useful, because if this approximation is the empty set, then we know for sure that the predicate or function cannot throw any exception. (An approximation can overestimate the set of actions that the predicate or function may perform, but it is safe only if it will never underestimate that set.)

Consider a call chain between functions where f calls g, g calls h, and h calls i. with f, g, h, i being defined in module_f, module_g, module_h and module_i respectively. Suppose none of these functions contain any calls other than the ones listed here, and none of these modules contain anything else. In that case,

These dependencies transfer to the files involved:

While we can build each .opt file independently of any other .opt file, this is not true for .trans_opt files. Not only is it the case that .trans_opt files can depend on other .trans_opt files, that in fact is the reason for their existence. In this case, if e.g. module_f.trans_opt records that f cannot throw any exception, it can do so only because module_i.trans_opt, module_h.trans_opt, and module_g.trans_opt all record that g, h and i respectively also have this property. And if code that can possibly throw exceptions is ever added to any one of module_i.m, module_h.m, and module_g.m, then both the .trans_opt file of the updated module, and all the .trans_opt files that depend on it, must all be rebuilt.

If the programmer adds a function named f2 to module_f.m and a function named h2 to module_h.m, with h2 calling f2, then the compiler can fully analyze h2 and put the results into module_h.trans_opt only if, when it is building module_h.trans_opt, it has access to the contents of module_f.trans_opt. However, we cannot make module_h.trans_opt depend on module_f.trans_opt. module_f.trans_opt already depends on module_h.trans_opt, because if we did that, we would make the dependency chain circular. This would mean that in order bring e.g. module_h.trans_opt up to date, we would first have to bring module_f.trans_opt up to date, which means that we would first have to bring module_h.trans_opt up to date, which means that we would first have to bring module_f.trans_opt up to date, and so on forever. This is why software build systems, including mmake and mmc --make, require the absence of circular dependencies between files.

Mercury’s first step towards avoiding circular dependencies between .trans_opt files is to impose an order on the modules of the program when the mmc --generate-dependencies command is executed for it. This order is recorded in the .dv The second step is to ensure that a .trans_opt file can only ever depend on .trans_opt files that follow it in the order, never ones that precede it.

ZZZ TODO

mmc --make-transitive-optimization-interface module_1.m module_2.m …

Next: , Previous: , Up: Compilation details   [Contents][Index]