Next: pqueue, Previous: parsing_utils, Up: Top [Contents]
%--------------------------------------------------% % vim:ts=4 sw=4 expandtab ft=mercury %--------------------------------------------------% % Copyright (C) 2000-2007, 2010-2011 The University of Melbourne % Copyright (C) 2014-2018, 2020 The Mercury team. % This file is distributed under the terms specified in COPYING.LIB. %--------------------------------------------------% % % File: pprint.m % Main author: rafe % Stability: medium % % NOTE: this module has now been superceded by pretty_printer.m which is % more economical, produces better output, has better control over % the amount of output produced, and supports user-specifiable formatting % for arbitrary types. % % ABOUT % ----- % % This started off as pretty much a direct transliteration of Philip Wadler's % Haskell pretty printer described in "A Prettier Printer", available at % http://cm.bell-labs.com/cm/cs/who/wadler/topics/recent.html % % Several changes have been made to the algorithm to preserve linear running % time under a strict language and to ensure scalability to extremely large % terms without thrashing the VM system. % % Wadler's approach has three main advantages: % % 1. the layout algebra is small and quite intuitive (more so than Hughes'); % 2. the pretty printer is optimal in the sense that it will never generate % output that over-runs the specified width unless that is unavoidable; and % 3. the pretty printer is bounded in that it never needs to look more than % k characters ahead to make a formatting decision. % % I have made the following changes: % % (a) rather than having group/1 as a non-primitive function (for % allowing line-breaks to be converted into spaces at the pretty % printer's discretion) over docs, I have extended the doc type to % include a `GROUP' constructor and made the appropriate algorithmic % changes. Because `UNION' only arises as a consequence of processing % a 'GROUP' it turns out to be simpler to do away with `UNION' % altogether and convert clauses that process `UNION' terms to % processing `GROUP's. % % (b) Flattened `line' breaks become empty strings rather than spaces. % % (c) The third change is the introduction of the `LABEL' constructor, % which acts much like `NEST', except that indentation is defined % using a string rather than a number of spaces. This is useful for, % e.g., multi-line compiler errors and warnings that should be % prefixed with the offending source file and line number. % % (d) The formatting decision procedure has been altered to preserve % linear runtime behaviour in a strict language. % % (e) Naively marking up a term as a doc has the drawback that the % resulting doc is significantly larger than the original term. % Worse, any sharing structure in the original term leads to % duplicated sub-docs, which can cause an exponential blow-up in the % size of the doc w.r.t. the source term. To get around this problem % I have introduced the 'DOC' constructor which causes on-demand % conversion of arguments. % % [This is not true laziness in the sense that the 'DOC', once % evaluated, will be overwritten with its value. This approach would % lead to garbage retention and not solve the page thrashing behaviour % otherwise experienced when converting extremely large terms. % Instead, each 'DOC' is reevaluated each time it is examined. % This trades off computation time for space.] % % I have added several obvious general purpose formatting functions. % % % USAGE % ----- % % There are two stages in pretty printing an object of some type T: % 1. convert the object to a pprint.doc using the constructor functions % described below or by simply calling pprint.to_doc/[1,2]; % 2. call pprint.write/[4,5] or pprint.to_string/2 passing the display width % and the doc. % % % EXAMPLES % -------- % % The doc/1 type class has types string, char, int, float and doc as instances. % Hence these types can all be converted to docs by applying doc/1. % This happens automatically to the arguments of ++/2. Users may find it % convenient to add other types as instances of the doc/1 type class. % % Below are some docs followed by the ways they might be displayed by the % pretty printer given various line widths. % % 1. "Hello " ++ line ++ "world" % % Hello % world % % 2. group("Hello " ++ line ++ "world") % % Hello world % % Hello % world % % 3. group("Hello " ++ nest(3, line ++ "world")) % % Hello world % % Hello % world % % 4. group("Goodbye " ++ nest(3, line ++ "cruel " ++ line ++ "world") % % Goodbye cruel world % % Goodbye % cruel % world % % 5. group("Goodbye " ++ nest(3, line ++ group("cruel " ++ line ++ "world"))) % % Goodbye cruel world % % Goodbye % cruel world % % Goodbye % cruel % world % % 6. label("Look! ", line ++ % group("Goodbye " ++ % nest(3, line ++ group("cruel " ++ line ++ "world")))) % % Look! Goodbye cruel world % % Look! Goodbye % Look! cruel world % % Look! Goodbye % Look! cruel % Look! world % %--------------------------------------------------% %--------------------------------------------------% :- module pprint. :- interface. :- import_module char. :- import_module io. :- import_module list. :- import_module stream. :- import_module string. :- import_module univ. %--------------------------------------------------% % Clients must translate data structures into docs for % the pretty printer to display. % :- type doc. % This typeclass can be used to simplify the construction of docs. % :- typeclass doc(T) where [ % Convert a T to a doc, placing a limit on how much of the term % will be fully converted as follows: % % doc(_, f ) = f % doc(N, f(A, B, C)) = f/3 if N =< 0 % doc(N, f(A, B, C)) = some representation of the term whereby % A is converted as doc(N - 1, A), % B is converted as doc(N - 2, B), and % C is converted as doc(N - 3, C) % - if there are more than N arguments, the N+1th and subsequent % arguments should be replaced with a single ellipsis. % func doc(int, T) = doc ]. :- instance doc(doc). :- instance doc(string). :- instance doc(int). :- instance doc(int8). :- instance doc(int16). :- instance doc(int32). :- instance doc(int64). :- instance doc(uint). :- instance doc(uint8). :- instance doc(uint16). :- instance doc(uint32). :- instance doc(uint64). :- instance doc(float). :- instance doc(char). % Fully convert an instance of doc/1. % :- func doc(T) = doc <= (doc(T)). % An alternative to the <>/2 concatenation operator that works % on members of the doc/1 typeclass. % :- func T1 ++ T2 = doc <= (doc(T1), doc(T2)). % The empty document corresponding to the null string. % :- func nil = doc. % The document consisting of a single string. % % NOTE: since string is now an instance of the doc/1 % type class, it is simpler to just apply the doc/1 % method. % :- func text(string) = doc. % The composition of two docs with no intervening space. % % NOTE: with the addition of the doc/1 type class, it is % simpler to construct compound docs using ++/2. % :- func doc `<>` doc = doc. % The newline document. In a group doc (see below) the pretty printer % may choose to instead `flatten' all line docs into nil docs in order % to fit a doc on a single line. % :- func line = doc. % Any `line' docs in the body that are not flattened out by the % pretty printer are followed by the given number of spaces % (nested `nest's add up). % :- func nest(int, T) = doc <= (doc(T)). % Identical to a nest doc except that indentation is extended with % a string label rather than some number of spaces. % :- func label(string, T) = doc <= (doc(T)). % A group doc gives the pretty printer a choice: if the doc can be printed % without line wrapping then it does so (all line, label, nest and group % directives within the group are ignored); otherwise the pretty printer % treats the group body literally, although nested group docs remain as % choice points. % :- func group(T) = doc <= (doc(T)). % This function can be used to convert strings, chars, ints, uints and % floats to their text doc equivalents. % % NOTE: since these types are now instances of the doc/1 type class, % it is simpler to just apply the doc/1 method to these types. % :- func poly(poly_type) = doc. % Shorthand for doc ++ line ++ doc. % :- func doc `</>` doc = doc. % Various bracketing functions. % % bracketed(L, R, Doc) = L ++ Doc ++ R % parentheses(Doc) = bracketed("(", ")", Doc) % brackets(Doc) = bracketed("[", "]", Doc) % braces(Doc) = bracketed("{", "}", Doc) % :- func bracketed(T1, T2, T3) = doc <= (doc(T1), doc(T2), doc(T3)). :- func parentheses(T) = doc <= (doc(T)). :- func brackets(T) = doc <= (doc(T)). :- func braces(T) = doc <= (doc(T)). % packed(Sep, [X1, X2, .., Xn]) = G1 `<>` G2 `<>` .. `<>` Gn where % Gi = group(line `<>` Xi `<>` Sep), except for Gn where % Gn = group(line `<>` Xn). % % For the singleton list case, packed(Sep, [X]) = group(line `<>` X). % % The resulting doc tries to pack as many items on a line as possible. % :- func packed(T1, list(T2)) = doc <= (doc(T1), doc(T2)). % A variant of the above whereby only the first N elements of the list % are formatted and the rest are replaced by a single ellipsis. % :- func packed(int, T1, list(T2)) = doc <= (doc(T1), doc(T2)). % packed_cs(Xs) = packed(comma_space, Xs). % % For example, to pretty print a Mercury list of docs one might use % % brackets(nest(2, packed_cs(Xs))) % :- func packed_cs(list(T)) = doc <= (doc(T)). % A variant of the above whereby only the first N elements of the list % are formatted and the rest are replaced by a single ellipsis. % :- func packed_cs(int, list(T)) = doc <= (doc(T)). % This is like a depth-limited version of packed_cs/1 that first calls % to_doc/2 on each member of the argument list. % :- func packed_cs_to_depth(int, list(T)) = doc. % This is like a version of packed_cs_to_depth/1 that first calls % univ_value/1 for each member of the argument list. % :- func packed_cs_univ_args(int, list(univ)) = doc. % separated(PP, Sep, [X1,...,Xn]) = % PP(X1) `<>` Sep `<>` ... Sep `<>` PP(Xn) % :- func separated(func(T1) = doc, T2, list(T1)) = doc <= (doc(T2)). % Handy punctuation docs and versions with following % spaces and/or line breaks. % :- func comma = doc. :- func semic = doc. % Semicolon. :- func colon = doc. :- func space = doc. :- func comma_space = doc. :- func semic_space = doc. :- func colon_space = doc. :- func comma_line = doc. :- func semic_line = doc. :- func colon_line = doc. :- func space_line = doc. :- func comma_space_line = doc. :- func semic_space_line = doc. :- func colon_space_line = doc. :- func ellipsis = doc. % "...". % Performs word wrapping at the end of line, taking whitespace sequences % as delimiters separating words. % % See `char.is_whitespace' for the definition of whitespace characters % used by this predicate. % :- func word_wrapped(string) = doc. % Convert arbitrary terms to docs. This requires std_util.functor/3 to work % on all components of the object being converted. The second version % places a maximum depth on terms which are otherwise truncated in the % manner described in the documentation for the doc/2 method of the doc/1 % type class. % % This may throw an exception or cause a runtime abort if the term % in question has user-defined equality. % :- func to_doc(T) = doc. :- func to_doc(int, T) = doc. % Convert docs to pretty printed strings. The int argument specifies % a line width in characters. % :- func to_string(int, doc) = string. % Write docs out in pretty printed format. The int argument specifies % a page width in characters. % :- pred write(int::in, T::in, io::di, io::uo) is det <= doc(T). % Write docs to the specified string writer stream in pretty printed % format. The int argument specifies a page width in characters. % :- pred write(Stream::in, int::in, T::in, State::di, State::uo) is det <= ( doc(T), stream.writer(Stream, string, State) ). %--------------------------------------------------% %--------------------------------------------------%
Next: pqueue, Previous: parsing_utils, Up: Top [Contents]