Next: User-defined equality and comparison, Previous: Unique modes, Up: Top [Contents]
For each mode of a predicate or function, we categorise that mode according to how many times it can succeed, and whether or not it can fail before producing its first solution.
If all possible calls to a particular mode of a predicate or function which return to the caller (calls which terminate, do not throw an exception and do not cause a fatal runtime error)
det
);
semidet
);
multi
);
nondet
);
failure
.
If no possible calls to a particular mode of a predicate or function
can return to the caller,
then that mode has a determinism of erroneous
.
The determinism annotation erroneous
is used on the library predicates
‘require.error/1’ and ‘exception.throw/1’,
but apart from that,
determinism annotations erroneous
and failure
are generally not needed.
To summarize:
Maximum number of solutions Can fail? 0 1 > 1 no erroneous det multi yes failure semidet nondet
(Note: the “Can fail?” column here indicates only whether the procedure
can fail before producing at least one solution;
attempts to find a second solution to a particular call,
e.g. for a procedure with determinism multi
,
are always allowed to fail.)
The determinism of each mode of a predicate or function is indicated by an annotation on the mode declaration. For example:
:- pred append(list(T), list(T), list(T)). :- mode append(in, in, out) is det. :- mode append(out, out, in) is multi. :- mode append(in, in, in) is semidet. :- func length(list(T)) = int. :- mode length(in) = out is det. :- mode length(in(list_skel)) = out is det. :- mode length(in) = in is semidet.
An annotation of det
or multi
is an assertion
that for every value each of the inputs,
there exists at least one value of the outputs
for which the predicate is true,
or (in the case of functions)
for which the function term is equal to the result term.
Conversely, an annotation of det
or semidet
is an assertion
that for every value each of the inputs,
there exists at most one value of the outputs for which the predicate is true,
or (in the case of functions)
for which the function term is equal to the result term.
These assertions are called the mode-determinism assertions;
aside from assisting in reasoning about the code,
they may allow an implementation to perform
optimizations that would not otherwise be allowed,
such as optimizing away a goal with no outputs
even though it might throw an exception (see Exception handling),
contain a trace goal (see Trace goals),
or infinitely loop.
In some cases these optimizations may not be desirable;
the strict sequential semantics guarantees that they will not be performed
(see Formal semantics).
If the mode of the predicate is given in the :- pred
declaration
rather than in a separate :- mode
declaration,
then the determinism annotation goes on the :- pred
declaration
(and similarly for functions).
In particular, this is necessary
if a predicate does not have any argument variables.
If the determinism declaration is given on a :- func
declaration
without the mode, the function is assumed to have the default mode
(see Modes for more information on default modes of functions).
For example:
:- pred loop(int::in) is erroneous. loop(X) :- loop(X). :- pred p is det. p. :- pred q is failure. q :- fail.
If there is no mode declaration for a function,
then the default mode for that function
is considered to have been declared as det
.
If you want to write a partial function,
i.e. one whose determinism is semidet
,
then you must explicitly declare the mode and determinism.
In Mercury, a function is supposed to be
a true mathematical function of its arguments;
that is, the value of the function’s result
should be determined only by the values of its arguments.
Hence, for any mode of a function
that specifies that all the arguments are fully input
(i.e. for which the initial inst of all the arguments is a ground inst),
the determinism of that mode can only be
det
, semidet
, erroneous
, or failure
.
The determinism categories form this lattice:
erroneous / \ failure det \ / \ semidet multi \ / nondet
The higher up this lattice a determinism category is, the more the compiler knows about the number of solutions of procedures of that determinism.
The determinism of goals is inferred from the determinism of their component parts, according to the rules below. The inferred determinism of a procedure is just the inferred determinism of the procedure’s body.
For procedures that are local to a module,
the determinism annotations may be omitted;
in that case, their determinism will be inferred.
(To be precise, the determinism of procedures without a determinism annotation
is defined as the least fixpoint of the transformation which,
given an initial assignment
of the determinism det
to all such procedures,
applies those rules to infer
a new determinism assignment for those procedures.)
It is an error to omit the determinism annotation for procedures that are exported from their containing module.
If a determinism annotation is supplied for a procedure, the declared determinism is compared against the inferred determinism. If the declared determinism is greater than or not comparable to the inferred determinism (in the partial ordering above), it is an error. If the declared determinism is less than the inferred determinism, it is not an error, but the implementation may issue a warning.
The determinism category of each goal is inferred according to the following rules. These rules work with the two components of determinism categories: whether the goal can fail without producing a solution, and the maximum number of solutions of the goal (0, 1, or more). If the inference process below reports that a goal can succeed more than once, but the goal generates no outputs that are visible from outside the goal, and the goal is not impure (see Impurity), then the final determinism of the goal will be based on the goal succeeding at most once, since the compiler will implicitly prune away any duplicate solutions.
The determinism category of a call is the determinism declared or inferred for the called mode of the called procedure.
The determinism of a unification
is either det
, semidet
, or failure
,
depending on its mode.
A unification that assigns the value of one variable to another is deterministic. A unification that constructs a structure and assigns it to a variable is also deterministic. A unification that tests whether a variable has a given top function symbol is semideterministic, unless the compiler knows the top function symbol of that variable, in which case its determinism is either det or failure depending on whether the two function symbols are the same or not. A unification that tests two variables for equality is semideterministic, unless the compiler knows that the two variables are aliases for one another, in which case the unification is deterministic, or unless the compiler knows that the two variables have different function symbols in the same position, in which case the unification has a determinism of failure.
The compiler knows the top function symbol of a variable if the previous part of the procedure definition contains a unification of the variable with a function symbol, or if the variable’s type has only one function symbol.
The determinism of the empty conjunction (the goal ‘true’)
is det
.
The conjunction ‘(A, B)’ can fail
if either A can fail, or if A can succeed at least once,
and B can fail.
The conjunction can succeed at most zero times
if either A or B can succeed at most zero times.
The conjunction can succeed more than once
if either A or B can succeed more than once
and both A and B can succeed at least once.
(If e.g. A can succeed at most zero times,
then even if B can succeed many times
the maximum number of solutions of the conjunction is still zero.)
Otherwise, i.e. if both A and B succeed at most once,
the conjunction can succeed at most once.
A disjunction is a switch if each disjunct has near its start a unification that tests the same bound variable against a different function symbol. For example, consider the common pattern
( L = [], empty(Out) ; L = [H|T], nonempty(H, T, Out) )
If L
is input to the disjunction,
then the disjunction is a switch on L
.
If two variables are unified with each other,
then whatever function symbol one variable is unified with,
the other variable is considered to be unified with the same function symbol.
In the following example,
since K
is unified with L
,
the second disjunct unifies L
as well as K
with cons,
and thus the disjunction is recognized as a switch.
( L = [], empty(Out) ; K = L, K = [H|T], nonempty(H, T, Out) )
A switch can fail if the various arms of the switch do not cover all the function symbols in the type of the switched-on variable, or if the code in some arms of the switch can fail, bearing in mind that in each arm of the switch, the unification that tests the switched-on variable against the function symbol of that arm is considered to be deterministic. A switch can succeed several times if some arms of the switch can succeed several times, possibly because there are multiple disjuncts that test the switched-on variable against the same function symbol. A switch can succeed at most zero times only if all the reachable arms of the switch can succeed at most zero times. (A switch arm is not reachable if it unifies the switched-on variable with a function symbol that is ruled out by that variable’s initial instantiation state.)
Only unifications may occur before the test of the switched-on variable in each disjunct. Tests of the switched-on variable may occur within existential quantification goals.
The following example is a switch.
( Out = 1, L = [] ; some [H, T] ( L = [H|T], nonempty(H, T, Out) ) )
The following example is not a switch because the call in the first disjunct occurs before the test of the switched-on variable.
( empty(Out), L = [] ; L = [H|T], nonempty(H, T, Out) )
The unification of the switched-on variable with a function symbol
may occur inside a nested disjunction in a given disjunct,
provided that unification is preceded only by other unifications,
both inside the nested disjunction and before the nested disjunction.
The following example is a switch on X
,
provided X
is bound beforehand.
( X = f p(Out) ; Y = X, ( Y = g, Intermediate = 42 ; Z = Y, Z = h(Arg), q(Arg, Intermediate) ), r(Intermediate, Out) )
The determinism of the empty disjunction (the goal ‘fail’)
is failure
.
A disjunction ‘(A ; B)’ that is not a switch
can fail if both A and B can fail.
It can succeed at most zero times
if both A and B can succeed at most zero times.
It can succeed at most once
if one of A and B can succeed at most once
and the other can succeed at most zero times.
Otherwise, i.e. if either A or B can succeed more than once,
or if both A and B can succeed at least once,
it can succeed more than once.
If the condition of an if-then-else cannot fail, the if-then-else is equivalent to the conjunction of the condition and the “then” part, and its determinism is computed accordingly. Otherwise, an if-then-else can fail if either the “then” part or the “else” part can fail. It can succeed at most zero times if the “else” part can succeed at most zero times and if at least one of the condition and the “then” part can succeed at most zero times. It can succeed more than once if any one of the condition, the “then” part and the “else” part can succeed more than once.
If the determinism of the negated goal is erroneous
,
then the determinism of the negation is erroneous
.
If the determinism of the negated goal is failure
,
the determinism of the negation is det
.
If the determinism of the negated goal is det
or multi
,
the determinism of the negation is failure
.
Otherwise, the determinism of the negation is semidet
.
Note that “perfect” determinism inference is an undecidable problem, because it requires solving the halting problem. (For instance, in the following example
:- pred p(T, T). :- mode p(in, out) is det. p(A, B) :- ( something_complicated(A, B) ; B = A ).
‘p/2’ can have more than one solution
only if ‘something_complicated/2’ can succeed.)
Sometimes,
the rules specified by the Mercury language for determinism inference
will infer a determinism that is not as precise as you would like.
However, it is generally easy to overcome such problems.
The way to do this is to replace the compiler’s static checking
with some manual run-time checking.
For example, if you know that a particular goal should never fail,
but the compiler infers that goal to be semidet
,
you can check at runtime that the goal does succeed,
and if it fails, call the library predicate ‘error/1’.
:- pred q(T, T). :- mode q(in, out) is det. q(A, B) :- ( if goal_that_should_never_fail(A, B0) then B = B0 else error("goal_that_should_never_fail failed!") ).
The predicate error/1
has determinism erroneous
,
which means the compiler knows that it will never succeed or fail,
so the inferred determinism for the body of q/2
is det
.
(Checking assumptions like this is good coding style anyway.
The small amount of up-front work that Mercury requires
is paid back in reduced debugging time.)
Mercury’s mode analysis knows that
computations with determinism erroneous
can never succeed,
which is why it does not require the “else” part
to generate a value for B
.
The introduction of the new variable B0
is necessary
because the condition of an if-then-else is a negated context,
and can export the values it generates
only to the “then” part of the if-then-else,
not directly to the surrounding computation.
(If the surrounding computations had direct access
to values generated in conditions,
they might access them even if the condition failed.)
Normally, attempting to call
a nondet
or multi
mode of a predicate
from a predicate declared as semidet
or det
will cause a determinism error.
So how can we call nondeterministic code from deterministic code?
There are several alternative possibilities.
If you just want to see if a nondeterministic goal is satisfiable or not,
without needing to know what variable bindings it produces,
then there is no problem -
determinism analysis considers nondet
and multi
goals
with no non-local output variables to be
semidet
and det
respectively.
If you want to use the values of output variables, then you need to ask yourself which one of possibly many solutions to a goal do you want? If you want all of them, you need to one of the predicates in the standard library module ‘solutions’, such as ‘solutions/2’ itself, which collects all of the solutions to a goal into a list — see Higher-order.
If you just want one solution from a predicate and don’t care which,
you should declare the relevant mode of the predicate
to have determinism cc_nondet
or cc_multi
(depending on whether you are guaranteed at least one solution or not).
This tells the compiler that this mode of this predicate
may have more than one solution when viewed as a statement in logic,
but the implementation should stop after generating the first solution.
In other words, the implementation should commit to the first solution.
The commit to the first solution means that
a piece of cc_nondet
or cc_multi
code
can never be asked to generate a second solution.
If e.g. a cc_nondet
call is in a conjunction,
then no later goal in that conjunction (after mode reordering) may fail,
because that would ask the committed choice goal for a second solution.
The compiler enforces this rule.
In the declarative semantics, which solution will be the first is unpredictable, but in the operational semantics, you can predict which solution will be the first, since Mercury does depth-first search with left-to-right execution of clause bodies, though that is not on the source code form of each clause body, but on its form after mode analysis reordering to put the producer of each variable before all its consumers.
The ‘committed choice nondeterminism’ of a predicate
has to be propagated up the call tree,
making its callers, its callers’ callers, and so on,
also cc_nondet
or cc_multi
,
until either you get to main/2
at the top of the call tree,
or you get to a location where you don’t have to propagate
the committed choice context upward anymore.
While main/2
is usually declared to have determinism det
,
it may also be declared cc_multi
.
In the latter case,
while the program’s declarative semantics may admit more than one solution,
the implementation will stop after the first,
so alternative solutions to main/2
(and hence also to cc_nondet
or cc_multi
predicates
called directly or indirectly from ‘main/2’)
are implicitly pruned away.
This is similar to the “don’t care” nondeterminism
of committed choice logic programming languages such as GHC.
One way to stop propagating committed choice nondeterminism
is the one mentioned above: if a goal has no non-local output variables
(i.e. none of the goal’s outputs are visible from outside the goal),
then the goal’s solutions are indistinguishable from the outside,
and the implementation will only attempt to satisfy the goal once,
whether or not the goal is committed choice.
Therefore if a cc_multi
goal has all its outputs ignored,
then the compiler considers it to be a det
goal,
while if a cc_nondet
goal has all its outputs ignored,
then the compiler considers it to be a semidet
goal.
The other way to stop propagating committed choice nondeterminism is applicable when you know that all the solutions returned will be equivalent in all the ways that your program cares about. For example, you might want to find the maximum element in a set by iterating over the elements in the set. Iterating over the elements in a set in an unspecified order is a nondeterministic operation, but no matter which order you iterate over them, the maximum value in the set should be the same.
If this condition is satisfied,
i.e. if you know that there will only ever be at most one distinct solution
under your equality theory of the output variables,
then you can use a ‘promise_equivalent_solutions’ determinism cast.
If goal ‘G’ is a cc_multi
goal
whose outputs are X
and Y
, then
promise_equivalent_solutions [X, Y] ( G )
promises the compiler that all solutions of G
are equivalent,
so that regardless of which solution of G
the implementation happens to commit to,
the rest of the program will compute
either identical or (similarly) equivalent results.
This allows the compiler to consider
promise_equivalent_solutions [X, Y] ( G )
to have determinism det
.
Likewise, the compiler will consider
promise_equivalent_solutions [X, Y] ( G )
where G
is cc_nondet
to have determinism semidet
.
Note that specifying a user-defined equivalence relation as the equality predicate for user-defined types (see User-defined equality and comparison) means that ‘promise_equivalent_solutions’ can be used to express more general forms of equivalence. For example, if you define a set type which represents sets as unsorted lists, you would want to define a user-defined equivalence relation for that type, which could sort the lists before comparing them. The ‘promise_equivalent_solutions’ determinism cast could then be used for sets even though the lists used to represent the sets might not be in the same order in every solution.
In addition to the determinism annotations described earlier,
there are “committed choice” versions of multi
and nondet
,
called cc_multi
and cc_nondet
.
These can be used instead of multi
or nondet
if all calls to that mode of the predicate (or function)
occur in a context in which only one solution is needed.
Such single-solution contexts are determined as follows.
cc_multi
or cc_nondet
is in a single-solution context.
For example, the program entry point ‘main/2’
may be declared cc_multi
,
and in that case the clauses for main
are in a single-solution context.
The compiler will check that all calls to a committed-choice mode of a predicate (or function) do indeed occur in a single-solution context.
You can declare two different modes of a predicate (or function)
which differ only in “cc-ness”
(i.e. one being multi
and the other cc_multi
,
or one being nondet
and the other cc_nondet
).
In that case,
the compiler will select the appropriate one for each call
depending on whether the call comes from a single-solution context or not.
Calls from single-solution contexts will call the committed choice version,
while calls which are not from single-solution contexts
will call the backtracking version.
There are several reasons to use committed choice determinism annotations.
One reason is for efficiency:
committed choice annotations allow the compiler
to generate much more efficient code.
Another reason is for doing I/O,
which is allowed only in det
or cc_multi
predicates,
not in multi
predicates.
Another is for dealing with types that use non-canonical representations
(see User-defined equality and comparison).
And there are a variety of other applications.
Previous: Interfacing nondeterministic code with the real world, Up: Determinism [Contents]