Eager Haskell
What is Eager Haskell?
Eager Haskell is an implementation of the Haskell programming language
which uses eager evaluation by default. The four widely available
implementations of Haskell---ghc, nhc, hugs, and hbc---all use lazy
evaluation. The design of Eager Haskell permits arbitrary Haskell
programs to be compiled and run eagerly. This web page highlights
some of the techniques we are using to accomplish this.
An implementation of Eager Haskell for Eager Hackers will be available
soon, hopefully within weeks. I've completed basically all the work
and am writing up; I need to straighten out a few packaging issues
before I have code that's really ready for prime time, though. Note
that our parser and typechecker were advanced for 1993... We don't
yet support records with named fields, nor do we provide qualified
names or the stable namespace management they entail. Neither of
these should be difficult to add in principle; I just haven't had
the time to do so. Support for non-standard but commonplace
extensions such as multiparameter type classes and existential types
doesn't conform to anyone's syntax, since most of it predates these
efforts.
Why Eagerness?
Why use eager evaluation? There are a number of good reasons.
First, eager evaluation is more intuitive. Every month there is a
posting to the Haskell mailing list of the form "I wrote a
tail-recursive loop, and hugs is running out of stack. This should be
efficient. What's going on?" The answer is that the space behavior
of lazy programs is difficult to grasp, and it can be especially
frustrating to express iteration. Often explicit strictness
annotations are required. This detracts severely from the declarative
nature of the language. In Eager Haskell, tail-recursive computations
will run in constant space just as they would in ML or C.
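As a small illustration (a sketch, not code from any particular
posting): under lazy evaluation the loop below accumulates a chain of
suspended additions, and the usual fix is a strictness annotation; in
Eager Haskell the first version already runs in constant space.

> sumTo :: Int -> Int -> Int
> sumTo acc 0 = acc
> sumTo acc n = sumTo (acc + n) (n - 1)   -- lazily, acc + n is suspended
>
> -- The usual lazy-Haskell workaround: force the accumulator.
> sumTo' :: Int -> Int -> Int
> sumTo' acc 0 = acc
> sumTo' acc n = acc `seq` sumTo' (acc + n) (n - 1)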
Second, we believe eagerness is more efficient than laziness.
A lazy implementation may not evaluate an expression
until it can prove that the expression will actually be used. Therefore, a
great deal of time is spent creating and destroying suspended pieces
of computation (thunks). All too often, these computations are simple
enough that it would be just as easy to evaluate them instead. Faxen
and others have used static analysis to expose such opportunities for
eagerness. We instead propose using eagerness everywhere, while using
mechanisms which permit us to recover if our program is too
eager.
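A tiny made-up fragment to illustrate the cost: a lazy implementation
allocates a suspension for each component of the pair below, even
though both are cheap arithmetic that could just as well be computed
on the spot.

> norms :: Double -> Double -> (Double, Double)
> norms x y = (x*x + y*y, abs x + abs y)   -- both components become thunks under lazy evaluation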
Finally, eager evaluation promises scalability. There are provably
efficient techniques for scheduling eager programs on multiple
processors. Our mechanisms for handling eagerness provide a natural
way to identify pieces of work large enough to be worth executing
in parallel. At the same time, we can show that we will always make
progress on the critical path of our programs. Thus, we should be
able to run our programs efficiently on a shared-memory machine
without the need for programmer annotations to control parallelism.
This is in stark contrast to other approaches to parallelism in
Haskell, which require annotations to mark which computations may be
evaluated speculatively, and to what degree speculation should
proceed.
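For contrast, here is the flavor of annotation-based speculation in
the style of GpH's par and pseq combinators (not Eager Haskell code):
the programmer must decide which call to spark and how evaluation is
ordered.

> import Control.Parallel (par, pseq)
>
> nfib :: Int -> Int
> nfib n
>   | n < 2     = 1
>   | otherwise = x `par` (y `pseq` x + y + 1)   -- spark x; evaluate y first
>   where
>     x = nfib (n - 1)
>     y = nfib (n - 2)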
For more detail on the potential tradeoffs and advantages of Eager
Haskell, see my thesis proposal.
The Eager Haskell Evaluation Mechanism
There's one small problem with eager evaluation of lazy code:
non-termination! (Or, more generally, bottom.) Lazy programs
often use idioms which define an infinite data structure; this data
structure is unfolded as it is needed, and unused pieces are
discarded. A simple example is the list-numbering idiom:
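> -- zip stops at the end of xs, so [0..] is unfolded only that far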
> numberList xs = zip [0..] xs
In order to cope with this problem, Eager Haskell uses
resource-bounded execution. The system tracks the resource
usage as a computation unfolds. If resource demands get too high (for
example because we are unfolding an infinite data structure as shown
above) then we fall back to lazy execution. All outstanding
computations are replaced by thunks. This fallback process
effectively creates an image of the stack on the heap. However, we
resume execution from the bottom of the stack. Thunks
further up the stack are forced only as they are needed. More
importantly, thunks which are never needed become garbage and can
be reclaimed. Once resumed, however, a computation proceeds eagerly
just as before.
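To make the mechanism concrete, here is a toy model with invented
names (the real implementation suspends actual stack frames in
compiled code, not interpreter values). Evaluation of a small
expression type proceeds eagerly while fuel remains; when the fuel is
exhausted, the remaining work is returned inside Left, where Haskell's
own laziness keeps it as a thunk until it is demanded.

> data Expr = Lit Int | Add Expr Expr
>
> -- Right v: the value was computed within the fuel budget.
> -- Left t: fallback occurred; t computes the remaining work lazily.
> evalBounded :: Int -> Expr -> Either Int Int
> evalBounded fuel0 e = snd (go fuel0 e)
>   where
>     go fuel (Lit n) = (fuel, Right n)
>     go fuel (Add a b)
>       | fuel <= 0 = (fuel, Left (evalAll a + evalAll b))  -- suspend everything
>       | otherwise = case go (fuel - 1) a of
>           (f1, Right va) -> case go f1 b of
>             (f2, Right vb) -> (f2, Right (va + vb))       -- fully eager
>             (f2, Left tb)  -> (f2, Left (va + tb))        -- partly suspended
>           (f1, Left ta)   -> (f1, Left (ta + evalAll b))
>
>     evalAll (Lit n)   = n
>     evalAll (Add a b) = evalAll a + evalAll b

Forcing a Left payload later resumes the suspended work, much as
demanding the result of a fallback thunk resumes eager execution of
the real program.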
If we choose a constant upper bound on new resource consumption,
resource-bounded execution should use at most a constant factor more
space and time than lazy execution. The intuitive argument is simple:
Every time we reach the resource bound, we fall back to laziness, so
our space and time will increase by at most the resource bound. In
practice we must set the resource bound fairly high (several hundred
kilobytes of allocation, or imminent stack overflow): the
fallback mechanism is slow, and code is compiled on the basic
assumption that fallback is rare. This means that uncontrolled unfolding
of infinite lists such as [0..] above will be quite
inefficient in practice. Fortunately, such cases are comparatively
easy to spot. The compiler can add laziness explicitly to
uncontrolled loops, and the programmer is free to annotate the program
for laziness. We expect such annotations will rarely be needed, and
hope that it will be fairly obvious where they must be added when they
are required.
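For example, rather than an explicit annotation (whose syntax I won't
show here), numberList can be restructured so that the index is
generated in step with the finite input, leaving no unbounded
structure to unfold:

> numberList' :: [a] -> [(Int, a)]
> numberList' = go 0
>   where
>     go _ []       = []
>     go i (x : xs) = (i, x) : go (i + 1) xs   -- production driven by the finite list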
We must be a little bit careful about how we re-start suspended
computations if we want the best possible space bounds. Frequently,
fallback gives rise to long chains of dependent computations. When we
require the head of such a chain, we must perform all the computations
in the chain. The scheduler is careful to re-start such chains from
the far end. Thus, if A depends on B depends on C, we first compute
C, then B, then A. By re-starting computations in this order, we
avoid consuming stack space for A and B while computing C. This is
vital, as chains can become too long to fit on the stack. Moreover,
careful scheduling means that the underlying execution mechanism need
not actually be properly tail-recursive; if the stack fills up due to
tail recursion, we will simply move it to the heap and place only the
most recent iteration on the stack once again.
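An ordinary non-tail-recursive function shows where such chains come
from: each stack frame of sumR below waits on the recursive call
beneath it, so suspending a deep stack yields a dependent chain as
long as the rest of the list.

> sumR :: [Integer] -> Integer
> sumR []       = 0
> sumR (x : xs) = x + sumR xs   -- each frame depends on the one below it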
Resource-bounded computation using lazy fallback is also a natural
way to schedule producer-consumer parallelism. This is a tricky
problem as most computations make it difficult to identify producer
and consumer in any kind of meaningful way: results produced in one
place may be consumed in many different places over time, or results
from many sources may be bundled together into a more complex data
structure. Eagerness means that producers can operate undisturbed
(and eventually in parallel). Resource bounds limit the total amount of
data which is produced without requiring a notion of which resources
belong to which producer. Lazy fallback means that consumers are
permitted to run as resources are used up, freeing those resources
again, and that producers are re-started when more data is required.
One final detail must be handled specially: error expressions.
Eager execution may encounter calls to error functions and exception
handlers which would not have been reached during lazy execution.
This code must be treated specially: simply marking it as "lazy" is
insufficient, since eager computations may force lazy thunks.
Instead, we have a special class of bottom thunks which are
forced only if they are found to be necessary by the scheduler when it
resumes execution after fallback. These expressions are identified
statically by the compiler using bottom
extraction, which is also used to simplify a number of program
transformations involving error-handling code.
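A tiny illustration of the hazard: under lazy evaluation the error
below is never demanded, but unchecked eager evaluation of the pair
would raise it; bottom extraction instead leaves it as a bottom thunk.

> example :: Int
> example = fst (1, error "never needed")   -- forced only if the scheduler finds it necessary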
The relationship between Eager Haskell and pH
Before there was an Eager Haskell, there was pH, a parallel
Haskell. Both languages use eager evaluation. Both share the
goal of running unannotated programs with linear speedup on a parallel
machine. Both share the syntax and type system of Haskell, and use
the same compiler front end. However, pH does not have a lazy
fallback mechanism. There are therefore many Haskell programs which
do not run correctly in pH. However, pH provides a number of
mechanisms which do not exist in Eager Haskell. The most important of
these is the barrier mechanism. This permits programs to
detect deep termination, when all eager computations in a particular
program region have executed to completion. Both pH and Eager Haskell
include libraries of I- and M-structure operations, but only pH
integrates them deeply into the language. In Eager Haskell such
side-effecting constructs must be used with care, normally through a
(still-unwritten) monadic interface. Thus the two languages have
complementary, but somewhat different, purposes.