Next: , Previous: parsing_utils, Up: Top   [Contents]


61 pprint

%--------------------------------------------------%
% vim:ts=4 sw=4 expandtab ft=mercury
%--------------------------------------------------%
% Copyright (C) 2000-2007, 2010-2011 The University of Melbourne
% Copyright (C) 2014-2018, 2020, 2022 The Mercury team.
% This file is distributed under the terms specified in COPYING.LIB.
%--------------------------------------------------%
%
% File: pprint.m
% Main author: rafe
% Stability: medium
%
% NOTE: this module has now been superceded by pretty_printer.m, which
%
% - is more economical,
% - produces better output,
% - has better control over the amount of output produced, and
% - supports user-specifiable formatting for arbitrary types.
%
% ABOUT
% -----
%
% This started off as pretty much a direct transliteration of Philip Wadler's
% Haskell pretty printer described in "A Prettier Printer", available at
% http://cm.bell-labs.com/cm/cs/who/wadler/topics/recent.html
%
% Several changes have been made to the algorithm to preserve linear running
% time under a strict language, and to ensure scalability to extremely large
% terms without thrashing the VM system.
%
% Wadler's approach has three main advantages:
%
% 1. the layout algebra is small and quite intuitive (more so than Hughes');
% 2. the pretty printer is optimal in the sense that it will never generate
%    output that over-runs the specified width unless that is unavoidable; and
% 3. the pretty printer is bounded in that it never needs to look more than
%    k characters ahead to make a formatting decision.
%
% I have made the following changes:
%
% (a) rather than having group/1 as a non-primitive function (for allowing
% line-breaks to be converted into spaces at the pretty printer's discretion)
% over docs, I have extended the doc type to include a `GROUP' constructor,
% and made the appropriate algorithmic changes. Because `UNION' only arises
% as a consequence of processing a 'GROUP' it turns out to be simpler
% to do away with `UNION' altogether and convert clauses that process
% `UNION' terms to processing `GROUP's.
%
% (b) Flattened `line' breaks become empty strings rather than spaces.
%
% (c) The third change is the introduction of the `LABEL' constructor,
% which acts much like `NEST', except that indentation is defined
% using a string rather than a number of spaces. This is useful for,
% e.g., multi-line compiler errors and warnings that should be prefixed
% with the offending source file and line number.
%
% (d) The formatting decision procedure has been altered to preserve
% linear runtime behaviour in a strict language.
%
% (e) Naively marking up a term as a doc has the drawback that the
% resulting doc is significantly larger than the original term.
% Worse, any sharing structure in the original term leads to
% duplicated sub-docs, which can cause an exponential blow-up in the
% size of the doc w.r.t. the source term. To get around this problem
% I have introduced the 'DOC' constructor which causes on-demand
% conversion of arguments.
%
% [This is not true laziness in the sense that the 'DOC', once
% evaluated, will be overwritten with its value. This approach would
% lead to garbage retention and not solve the page thrashing behaviour
% otherwise experienced when converting extremely large terms.
% Instead, each 'DOC' is reevaluated each time it is examined.
% This trades off computation time for space.]
%
% I have added several obvious general purpose formatting functions.
%
%
% USAGE
% -----
%
% There are two stages in pretty printing an object of some type T:
% 1. convert the object to a pprint.doc using the constructor functions
%    described below or by simply calling pprint.to_doc/[1,2];
% 2. call pprint.write/[4,5] or pprint.to_string/2 passing the display width
%    and the doc.
%
%
% EXAMPLES
% --------
%
% The doc/1 type class has types string, char, int, float and doc as instances.
% Hence these types can all be converted to docs by applying doc/1.
% This happens automatically to the arguments of ++/2. Users may find it
% convenient to add other types as instances of the doc/1 type class.
%
% Below are some docs followed by the ways they might be displayed by the
% pretty printer given various line widths.
%
% 1. "Hello " ++ line ++ "world"
%
%   Hello
%   world
%
% 2. group("Hello " ++ line ++ "world")
%
%   Hello world
%
%   Hello
%   world
%
% 3. group("Hello " ++ nest(3, line ++ "world"))
%
%   Hello world
%
%   Hello
%      world
%
% 4. group("Goodbye " ++ nest(3, line ++ "cruel " ++ line ++ "world")
%
%   Goodbye cruel world
%
%   Goodbye
%      cruel
%      world
%
% 5. group("Goodbye " ++ nest(3, line ++ group("cruel " ++ line ++ "world")))
%
%   Goodbye cruel world
%
%   Goodbye
%      cruel world
%
%   Goodbye
%      cruel
%      world
%
% 6. label("Look! ", line ++
%        group("Goodbye " ++
%              nest(3, line ++ group("cruel " ++ line ++ "world"))))
%
%   Look! Goodbye cruel world
%
%   Look! Goodbye
%   Look!    cruel world
%
%   Look! Goodbye
%   Look!    cruel
%   Look!    world
%
%--------------------------------------------------%
%--------------------------------------------------%

:- module pprint.
:- interface.

:- import_module char.
:- import_module io.
:- import_module list.
:- import_module stream.
:- import_module string.
:- import_module univ.

%--------------------------------------------------%

    % Clients must translate data structures into docs for
    % the pretty printer to display.
    %
:- type doc.

    % This typeclass can be used to simplify the construction of docs.
    %
:- typeclass doc(T) where [
    % Convert a T to a doc, placing a limit on how much of the term
    % will be fully converted as follows:
    %
    % doc(_, f         ) = f
    % doc(N, f(A, B, C)) = f/3 if N =< 0
    % doc(N, f(A, B, C)) = some representation of the term whereby
    %   A is converted as doc(N - 1, A),
    %   B is converted as doc(N - 2, B), and
    %   C is converted as doc(N - 3, C)
    %   - if there are more than N arguments, the N+1th and subsequent
    %     arguments should be replaced with a single ellipsis.
    %
    func doc(int, T) = doc
].

:- instance doc(doc).
:- instance doc(string).
:- instance doc(int).
:- instance doc(int8).
:- instance doc(int16).
:- instance doc(int32).
:- instance doc(int64).
:- instance doc(uint).
:- instance doc(uint8).
:- instance doc(uint16).
:- instance doc(uint32).
:- instance doc(uint64).
:- instance doc(float).
:- instance doc(char).

%--------------------------------------------------%
%
% Creating a doc.
%

    % The empty document corresponding to the null string.
    %
:- func nil = doc.

    % The document consisting of a single string.
    %
    % NOTE: since string is now an instance of the doc/1
    % type class, it is simpler to just apply the doc/1
    % method.
    %
:- func text(string) = doc.

    % This function can be used to convert strings, chars, ints, uints and
    % floats to their text doc equivalents.
    %
    % NOTE: since these types are now instances of the doc/1 type class,
    % it is simpler to just apply the doc/1 method to these types.
    %
:- func poly(poly_type) = doc.

    % The newline document. In a group doc (see below) the pretty printer
    % may choose to instead `flatten' all line docs into nil docs in order
    % to fit a doc on a single line.
    %
:- func line = doc.

    % Handy punctuation docs, and versions with following spaces and/or
    % line breaks.
    %
:- func comma            = doc.
:- func semic            = doc.      % Semicolon.
:- func colon            = doc.
:- func space            = doc.
:- func comma_space      = doc.
:- func semic_space      = doc.
:- func colon_space      = doc.
:- func comma_line       = doc.
:- func semic_line       = doc.
:- func colon_line       = doc.
:- func space_line       = doc.
:- func comma_space_line = doc.
:- func semic_space_line = doc.
:- func colon_space_line = doc.
:- func ellipsis         = doc.      % "...".

    % Convert an instance of the doc/1 typeclass to a doc.
    %
:- func doc(T) = doc <= (doc(T)).

    % Convert an arbitrary term to a doc. This requires std_util.functor/3
    % to work on all components of the object being converted. The second
    % version places a maximum depth on terms which are otherwise truncated
    % in the manner described in the documentation for the doc/2 method
    % of the doc/1 type class.
    %
    % This may throw an exception or cause a runtime abort if the term
    % in question has user-defined equality.
    %
:- func to_doc(T)      = doc.
:- func to_doc(int, T) = doc.

%--------------------------------------------------%
%
% Manipulating docs.
%

    % The composition of two docs with no intervening space.
    %
    % NOTE: with the addition of the doc/1 type class, it is
    % simpler to construct compound docs using ++/2.
    %
:- func doc `<>` doc = doc.

    % An alternative to the <>/2 concatenation operator that works
    % on members of the doc/1 typeclass.
    %
:- func T1 ++ T2 = doc <= (doc(T1), doc(T2)).

    % Any `line' docs in the body that are not flattened out by the
    % pretty printer are followed by the given number of spaces
    % (nested `nest's add up).
    %
:- func nest(int, T) = doc <= (doc(T)).

    % Identical to a nest doc except that indentation is extended with
    % a string label rather than some number of spaces.
    %
:- func label(string, T) = doc <= (doc(T)).

    % A group doc gives the pretty printer a choice: if the doc can be printed
    % without line wrapping then it does so (all line, label, nest and group
    % directives within the group are ignored); otherwise the pretty printer
    % treats the group body literally, although nested group docs remain as
    % choice points.
    %
:- func group(T) = doc <= (doc(T)).

    % Shorthand for doc ++ line ++ doc.
    %
:- func doc `</>` doc = doc.

    % Various bracketing functions.
    %
    %   bracketed(L, R, Doc) = L ++ Doc ++ R
    %       parentheses(Doc) = bracketed("(", ")", Doc)
    %          brackets(Doc) = bracketed("[", "]", Doc)
    %            braces(Doc) = bracketed("{", "}", Doc)
    %
:- func bracketed(T1, T2, T3) = doc <= (doc(T1), doc(T2), doc(T3)).
:- func parentheses(T)        = doc <= (doc(T)).
:- func brackets(T)           = doc <= (doc(T)).
:- func braces(T)             = doc <= (doc(T)).

    % packed(Sep, [X1, X2, .., Xn]) = G1 `<>` G2 `<>` .. `<>` Gn where
    % Gi = group(line `<>` Xi `<>` Sep), except for Gn where
    % Gn = group(line `<>` Xn).
    %
    % For the singleton list case, packed(Sep, [X]) = group(line `<>` X).
    %
    % The resulting doc tries to pack as many items on a line as possible.
    %
:- func packed(T1, list(T2)) = doc <= (doc(T1), doc(T2)).

    % A variant of the above whereby only the first N elements of the list
    % are formatted and the rest are replaced by a single ellipsis.
    %
:- func packed(int, T1, list(T2)) = doc <= (doc(T1), doc(T2)).

    % packed_cs(Xs) = packed(comma_space, Xs).
    %
    % For example, to pretty print a Mercury list of docs one might use
    %
    %   brackets(nest(2, packed_cs(Xs)))
    %
:- func packed_cs(list(T)) = doc <= (doc(T)).

    % A variant of the above whereby only the first N elements of the list
    % are formatted and the rest are replaced by a single ellipsis.
    %
:- func packed_cs(int, list(T)) = doc <= (doc(T)).

    % This is like a depth-limited version of packed_cs/1 that first calls
    % to_doc/2 on each member of the argument list.
    %
:- func packed_cs_to_depth(int, list(T)) = doc.

    % This is like a version of packed_cs_to_depth/1 that first calls
    % univ_value/1 for each member of the argument list.
    %
:- func packed_cs_univ_args(int, list(univ)) = doc.

    % separated(PP, Sep, [X1,...,Xn]) =
    %   PP(X1) `<>` Sep `<>` ... Sep `<>` PP(Xn)
    %
:- func separated(func(T1) = doc, T2, list(T1)) = doc <= (doc(T2)).

    % Performs word wrapping at the end of line, taking whitespace sequences
    % as delimiters separating words.
    %
    % See `char.is_whitespace' for the definition of whitespace characters
    % used by this predicate.
    %
:- func word_wrapped(string) = doc.

%--------------------------------------------------%
%
% Prettyprint a doc.
%

    % Convert docs to pretty printed strings. The int argument specifies
    % a line width in characters.
    %
:- func to_string(int, doc) = string.

    % Write the given doc out in its pretty printed format to the
    % current output stream.
    %
    % The int argument specifies the preferred maximum length of each line.
    % (This preferred maximum can be exceeded, e.g. if a single text string
    % to be printed is longer than this width.)
    %
:- pred write(int::in, T::in, io::di, io::uo) is det <= doc(T).

    % Write the specified doc to the specified string writer stream
    % in its pretty printed format.
    %
    % The int argument specifies the preferred maximum length of each line.
    % (This preferred maximum can be exceeded, e.g. if a single text string
    % to be printed is longer than this width.)
    %
:- pred write(Stream::in, int::in, T::in, State::di, State::uo) is det
    <= ( doc(T), stream.writer(Stream, string, State) ).

%--------------------------------------------------%
%--------------------------------------------------%


Next: , Previous: parsing_utils, Up: Top   [Contents]