pH: a parallel Haskell
This note describes pH, a parallel variant of Haskell. It is also
an invitation to join this effort.
- Arvind (MIT)
- Lennart Augustsson (Chalmers)
- Jamey Hicks (Motorola MCRC)
- Rishiyur Nikhil (DEC CRL)
- Simon Peyton Jones (Glasgow)
- Joe Stoy (Oxford)
- John Williams (IBM research)
pH is an attempt to draw together the lazy functional community (as
represented by Haskell) and the dataflow community (as represented by
Id and Sisal). By sharing a large common language base (Haskell), we
hope that we will find it easier to share ideas and implementations,
exchange programs and reduce harmful diversity. Application
programmers will find it easier to grapple with only one syntax and
type system.
We do not seek uniformity. There are clear differences between pH and
traditional Haskell. But having a common framework will allow us to
highlight and study these differences more easily than when the
languages involved differed more widely.
We hope that pH will become a collaborative effort, as the Haskell
language itself is, rather than being associated with any one
research group. If enough people are interested, we will form some
kind of committee and mailing list to steer the language. If you are
interested in joining in the development of pH, or just want to
comment on it, send email to pH@abp.lcs.mit.edu. If you would like
to subscribe to the pH mailing list, send email to
What is pH?
pH is all about parallelism.
There is more than one approach to parallel implementations of
functional languages. For example, it is possible to exploit the
implicit parallelism of a Haskell program by concurrent evaluation of
the arguments of strict operators. The absence of side effects
ensures that this concurrent evaluation cannot change the result.
Strictness analysis techniques widen the scope of this idea, by
allowing parallel evaluation of any strict function argument.
Further, it is possible to provide programmer annotations to indicate
sub-expressions which should be evaluated in parallel, even when the
compiler cannot prove that the value of all the sub-expressions will
be required. Unlike similar annotations in imperative languages, the
annotations can only affect termination, and not the results (if any).
All these approaches take *demand-driven* evaluation as the starting
point: in the absence of annotations, expressions are evaluated only if
their value is sure to be required.
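As an illustration of such an annotation (anachronistically, in modern GHC Haskell rather than anything defined in this note), `par` from GHC.Conc speculatively sparks one argument of a strict operator; the function name below is ours:

```haskell
import GHC.Conc (par, pseq)

-- Hedged sketch (assumes GHC, not pH): `par x y` suggests evaluating x
-- in parallel with y. Because there are no side effects, the answer is
-- the same whether or not the spark is ever run; only cost and
-- termination behaviour can differ.
pfib :: Int -> Int
pfib n
  | n < 2     = n
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = pfib (n - 1)
    y = pfib (n - 2)
```

Removing the `par` and `pseq` annotations leaves the result unchanged, which is exactly the guarantee the paragraph above describes.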
An alternative approach is to use a *parallel* evaluation order: that
is, all redexes are reduced in parallel, not merely those whose
results are sure to be needed. This strategy also implements
non-strict semantics, provided that no redex has its evaluation
delayed indefinitely. A parallel evaluation strategy is followed by
the Id programming language, in which any redex not in the body of
a conditional or lambda abstraction will be reduced.
pH is a variant of the Haskell programming language which follows Id's
parallel evaluation strategy:
pH = Haskell syntax and type system
+ Id evaluation strategy
In practice, implementations of pH are unlikely to guarantee that
every redex will eventually be reduced, which in turn may mean that
some programs may fail to terminate with a result when they would do
so in "traditional" Haskell. However, if pH gives a proper
(non-error, non-bottom) result for a particular program, then so does
"traditional" Haskell, and these results are the same.
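To make the termination gap concrete, here is an ordinary Haskell fragment (plain GHC code, not pH): its result never demands the second component of the pair, so a demand-driven evaluator returns 1, while a parallel strategy may also spend effort reducing the discarded redex.

```haskell
-- Ordinary Haskell (a sketch, not pH): the error redex is never demanded
-- under demand-driven evaluation, so `ex` evaluates to 1. A parallel
-- strategy might also reduce the second component, which here would
-- raise an error if that reduction's outcome were allowed to matter.
ex :: Int
ex = fst (1, error "never demanded")
```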
Both Haskell and Id provide arrays, constructed using array
comprehensions. But sometimes array comprehensions aren't quite what
you want. So Id provides I-structures as the "nuts and bolts" through
which array comprehensions are implemented, and which can also be used
by the sophisticated programmer to implement other encapsulated
functional data structures. I-structures are interesting in their own
right: via type declarations they can also be used to define arbitrary
data structures with assignable "holes" in them. However, each of
these holes can be filled at most once. I-structures are closely
related to *logic variables* and are used in a similar manner in
languages such as KL1 and Strand.
I-structures depend critically on a precise specification of the
evaluation strategy --- specifically, it must be easy to tell just
what will be evaluated and what will not. Id's parallel evaluation
strategy makes this question easy to answer, in contrast to a
demand-driven strategy where it is hard to say just what will get
evaluated. (The notion of a control block in an Id program, that is,
a group of statements that is guaranteed to be completely executed
once initiated, corresponds closely to similar notions in imperative
languages.) I-structures preserve determinacy (aka the Church-Rosser
property) but do not allow duplication (and hence free-wheeling
substitution) of unevaluated expressions, since single assignments may
thereby become multiple assignments, which are forbidden.
pH adopts I-structures.
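For readers without Id at hand, the essential I-structure behaviour (a hole that blocks readers until its single write arrives) can be approximated in ordinary concurrent Haskell with an MVar that is written exactly once. This is an analogy of ours, not pH syntax:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Approximation of a single I-structure hole in GHC Haskell (our
-- analogy, not pH): reading an empty hole blocks; the producer fills
-- it exactly once. A second putMVar on a full MVar would block,
-- loosely mirroring Id's "written twice" error.
main :: IO ()
main = do
  hole <- newEmptyMVar
  _ <- forkIO (putMVar hole (42 :: Int)) -- fill the hole exactly once
  v <- readMVar hole                     -- blocks until filled; non-destructive
  print v
```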
(Historical note. The introduction of array comprehensions has done
much to eliminate the need for explicit I-structure programming, so
that I-structures are now much less prominent in Id programs than they
used to be. Nowadays, I-structures are mainly used in the compiler's
desugaring of array and list comprehensions, in library code, or in
certain performance-critical loops which, for example, generate two
arrays at once. They are also used extensively as a way to express
complex constructs in more primitive forms.)
State and assignment
Purely-functional languages deliberately lack assignment, though this
leads to a lack of expressiveness which is sometimes trying. Strict
(almost)-functional languages, such as SML or Scheme, often
re-introduce assignment, but at the cost of specifying a precise
sequential evaluation order for all programs, even when it would not
otherwise be required. Introducing state into non-strict functional
languages is an active area of research (e.g. monads, mutable abstract
data types), but these approaches mostly come at the cost of imposing
some sequentiality, in order to preserve a well-defined functional
semantics.
Id demonstrates an alternative approach, based on M-structures, which
does not come at the cost of ubiquitous sequentiality. It doesn't
come for free either -- there is a substantial semantic cost -- but
M-structures allow one to express some parallel algorithms which
simply cannot be expressed without them. Proper programming with
M-structures requires finer-grained control over Id's default
evaluation order. This is achieved in Id through the use of local
barriers, that is, the sequencing of a group of statements.
M-structures lose determinacy.
pH adopts M-structures, but we recognize that they are still immature.
We hope that, with some experience, a more disciplined use of
M-structures will evolve and will prove more tractable semantically.
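By way of analogy only (again in ordinary GHC Haskell, not pH syntax): an MVar used with take and put behaves much like an M-structure cell, and the interleaving of concurrent takers is precisely what loses determinacy.

```haskell
import Control.Concurrent.MVar

-- M-structure-like cell sketched with an MVar (our analogy, not pH):
-- "take" atomically empties the cell, "put" refills it. With several
-- concurrent takers, the order in which takes succeed -- and hence the
-- final contents -- can vary from run to run.
main :: IO ()
main = do
  cell <- newMVar (0 :: Int)
  x <- takeMVar cell        -- take: the cell is now empty
  putMVar cell (x + 1)      -- put: refill with the updated value
  readMVar cell >>= print
```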
pH is a layered language:
pH(F) is the purely-functional subset of pH. Its main addition to
Haskell is for- and while-loops. For example, one could compute the
sum of the integers between 1 and n like this:
let sum = 0
in  for i <- [1..n] do
      next sum = sum + i
    finally sum
Loops are purely functional, and can easily be translated into a tail
recursive function. The ability to group statements for sequential
execution is also part of pH(F), and may be used for resource control.
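For instance, the loop above corresponds to a tail-recursive function like the following (ordinary Haskell, shown only to illustrate the translation; the names `sumTo` and `go` are ours):

```haskell
-- Tail-recursive rendering of the for-loop (plain Haskell; names ours).
-- The accumulator plays the role of the circulating variable `sum`.
sumTo :: Int -> Int
sumTo n = go 1 0
  where
    go i acc
      | i > n     = acc                   -- loop finished: yield final sum
      | otherwise = go (i + 1) (acc + i)  -- "next sum = sum + i"
```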
pH(I) adds I-structures to pH(F).
pH(M) adds M-structures to pH(I). Full pH is the same as pH(M).
Stricter versions of pH:
It is well known that it is often more difficult to compile non-strict
languages efficiently than the corresponding strict languages. Thus
it also makes sense to explore another dimension: making pH stricter
in different ways. Three variants immediately come to mind:
- Make function applications strict;
- Make all data structures strict;
- Both of the above.
All these variants should affect only termination and not the
answers produced. However, for pH(M), these variants would also reduce
the set of possible solutions.
Pragmas and Other Annotations:
We expect that there will be a common set of pragmas as well as
implementation-specific sets of pragmas for efficient compiling of pH
on a variety of architectures. In particular, we expect to introduce
storage layout pragmas in the spirit of HPF (High Performance
FORTRAN), to ease the compilation on distributed memory machines.
Each dialect of Haskell may have different pragmas, but their correct
use should not affect the result of the un-pragma'd program.
Initial implementations of pH will come from Arvind's group at the MIT
Lab for Computer Science, and from Nikhil at the DEC Cambridge
Research Lab. These implementations will rely on Haskell front ends
produced by other groups. Ultimately, they will replace their authors' Id
implementations and tools. Mechanical translators from Id to pH will
be easy to create --- the process is little more than transliteration.
References:
 Report on the Programming Language Haskell: A Non-strict, Purely
 Functional Language, Version 1.2, ACM SIGPLAN Notices, Vol 27, No 5,
 May 1992.
 Nikhil, Rishiyur S., Id Language Reference Manual, Version 90.1,
 Computation Structures Group Memo 284-2, Laboratory for Computer
 Science, M.I.T., July 15, 1991.
email to Haskell mailing list, September 1, 1993