Fresh Breeze: A Novel Multiprocessor Chip

Jack B. Dennis

Introduction

Since its inception, the Computation Structures Group has worked on issues at the boundary between computer architecture and programming methodology. The emphasis has been on supporting the exploitation of parallelism in computations while maintaining, or improving, the ease of developing robust software. Recognizing that the functional programming style provides a medium for expressing parallelism without violating principles of modular software construction, the group has focused on architectural concepts and principles that allow programs expressed in the functional style to be executed efficiently. The Monsoon project [2] was an important milestone: it demonstrated that appropriately designed processors can achieve effective parallel execution of software written without explicit operations for indicating opportunities for parallel execution. Another emphasis of the group's work has been the benefits of implementing a global address space for all information, both data and procedures, shared by the users of a computer system. These benefits were demonstrated in the Project MAC Multics system and have been realized commercially in the successful AS/400 series of computer systems offered by IBM.

The Fresh Breeze project aims to realize these benefits in a new single-chip multiprocessor, attacking what is considered the major problem in contemporary computer architecture: how to organize multiple processors on a chip so that their power can be effectively coordinated and applied.

Approach

The proposed multiprocessor chip [1] will incorporate three ideas that are significant departures from conventional thinking about multiprocessor architecture:

Simultaneous multithreading. Simultaneous multithreading [3] has been shown to have performance advantages relative to contemporary superscalar designs. This advantage can be exploited through use of a programming model that exposes parallelism in the form of multiple threads of computation.

Global shared address space. The value of a shared address space is widely appreciated. Through the use of 64-bit pointers, the conventional distinction between "memory" and the file system can be abolished. This will provide a superior execution environment in support of program modularity and software reuse.

No memory update; cycle-free heap. Data items are created, used, and released, but never modified once created. The allocation, release, and garbage collection of fixed-size chunks of memory will be implemented by efficient hardware mechanisms. A major benefit of this choice is that the multiprocessor cache coherence problem vanishes: any object retrieved from the memory system is immutable. In addition, it is easy to prevent the formation of pointer cycles, simplifying the design of memory management support.
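The write-once discipline can be illustrated with a small software model. The sketch below is not the Fresh Breeze hardware design: the class name ChunkStore, the chunk size of 16 words, and the use of sequentially assigned identifiers are all assumptions made for illustration. It shows why cycles cannot form when a chunk is filled in at creation and never modified: every pointer stored in a new chunk must name a chunk that already exists.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a write-once chunk memory (illustration only). */
final class ChunkStore {
    /** Words per chunk; the size is an assumption, not taken from the text. */
    static final int CHUNK_WORDS = 16;

    private final List<long[]> words = new ArrayList<>();
    private final List<boolean[]> isPointer = new ArrayList<>();

    /**
     * Creates a chunk in a single step and returns its identifier.
     * Pointer words must name chunks that already exist, so a new chunk
     * can never be reachable from itself and the heap stays cycle-free.
     */
    long create(long[] contents, boolean[] pointerFlags) {
        if (contents.length != CHUNK_WORDS || pointerFlags.length != CHUNK_WORDS) {
            throw new IllegalArgumentException("chunk must have " + CHUNK_WORDS + " words");
        }
        long newId = words.size(); // assumed scheme: identifiers assigned in creation order
        for (int i = 0; i < CHUNK_WORDS; i++) {
            if (pointerFlags[i] && (contents[i] < 0 || contents[i] >= newId)) {
                throw new IllegalArgumentException("pointer to nonexistent chunk");
            }
        }
        // Defensive copies: the caller cannot alter the chunk after creation.
        words.add(contents.clone());
        isPointer.add(pointerFlags.clone());
        return newId;
    }

    /** Reads one word. There is deliberately no write operation. */
    long read(long id, int index) {
        return words.get((int) id)[index];
    }

    /** An "update" builds a new chunk that differs from the old one in one word. */
    long withWord(long id, int index, long value, boolean pointer) {
        long[] w = words.get((int) id).clone();
        boolean[] p = isPointer.get((int) id).clone();
        w[index] = value;
        p[index] = pointer;
        return create(w, p);
    }
}
```

In this model a processor holding the identifier of a chunk always sees the same contents, which is the property that removes the need for a coherence protocol.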

The proposed chip architecture is shown in the drawing below. There are several multithread processors (MP) that communicate with Instruction Association Units (IAU) for access to memory chunks containing instructions, and with Data Association Units (DAU) for access to memory chunks containing data. The association units act like cache memories, retrieving a chunk from off-chip memory if its 64-bit global identifier (address) does not match the tag of any on-chip chunk. Each of the multithread processors supports simultaneous execution of instructions from four independent activities that share function units of the MP. The Scheduler Unit maintains records of activities awaiting assignment to processors and makes assignments as activity slots become free. Interfaces are provided for input/output messaging and for operating several chips as part of a coherent multi-chip system.
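The cache-like behavior of an association unit can be sketched in software as follows. This is a rough model under stated assumptions, not the actual unit: the class name AssociationUnit, the capacity parameter, the least-recently-used replacement policy, and the function standing in for off-chip memory are all hypothetical. The point of interest is what is absent: because chunks are immutable, the model needs no invalidation or write-back path.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical software model of an association unit (illustration only). */
final class AssociationUnit {
    private final int capacity;                       // number of on-chip chunk slots (assumed)
    private final LongFunction<long[]> offChipMemory; // stand-in for the off-chip memory
    private final LinkedHashMap<Long, long[]> onChip; // tag (64-bit identifier) -> chunk
    private int misses = 0;

    AssociationUnit(int capacity, LongFunction<long[]> offChipMemory) {
        this.capacity = capacity;
        this.offChipMemory = offChipMemory;
        // An access-ordered map gives least-recently-used replacement (an assumed policy).
        this.onChip = new LinkedHashMap<Long, long[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, long[]> eldest) {
                return size() > AssociationUnit.this.capacity;
            }
        };
    }

    /** Returns the chunk with the given identifier, fetching it off-chip on a tag miss. */
    long[] fetch(long chunkId) {
        long[] chunk = onChip.get(chunkId);
        if (chunk == null) {
            misses++;
            chunk = offChipMemory.apply(chunkId);
            onChip.put(chunkId, chunk);
        }
        // Chunks never change, so a hit can be served without any coherence check.
        return chunk;
    }

    int misses() {
        return misses;
    }
}
```

An evicted chunk is simply dropped in this model, since the off-chip copy can never be stale.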

It is intended that the Fresh Breeze chip provide a sound base for the execution of programs written following principles of modular software construction. These principles include information hiding, context independence, and the use of hierarchy. Experience with encapsulating data representations together with the operations on them, an idea introduced in Simula 67 and in the CLU language of Prof. Barbara Liskov, showed that a garbage-collected heap is essential to achieve the full benefit of support for modular software. These principles have implications for the architecture of computer systems. One must be able to identify a unit of software that enforces well-defined interfaces with other modules and eliminates dependences on the context of use. The system must implement resource management rather than permitting modules to make private resource allocation decisions. To support the passing of arbitrary data objects between program modules, there can be no separate file system, which implies use of a global address space.

Java is a popular programming language that has adopted design principles long favored by language designers. Java is (mostly) type secure, and assumes automatic memory management (garbage collection). The standardized bytecode form for Java classes makes development of an implementation for a novel target machine relatively easy. For these reasons a restricted version of Java will be used for writing Fresh Breeze applications.
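The restrictions on Java are not spelled out here. Assuming they amount to write-once objects and side-effect-free methods, in keeping with the functional style and the no-update memory described above, a program might look like the following sketch. The class ImmutableTree and its methods are hypothetical examples, not part of any Fresh Breeze library.

```java
/** Illustrative only: an immutable binary tree with a side-effect-free sum. */
final class ImmutableTree {
    final long value;
    final ImmutableTree left;  // null represents an empty subtree
    final ImmutableTree right;

    ImmutableTree(long value, ImmutableTree left, ImmutableTree right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    /**
     * Sums all values in the tree. The two recursive calls share no mutable
     * state, so they could be evaluated in either order or concurrently
     * without the program saying so explicitly.
     */
    static long sum(ImmutableTree t) {
        if (t == null) {
            return 0;
        }
        return t.value + sum(t.left) + sum(t.right);
    }

    /** Builds a balanced tree holding the values lo, lo+1, ..., hi-1. */
    static ImmutableTree range(long lo, long hi) {
        if (lo >= hi) {
            return null;
        }
        long mid = lo + (hi - lo) / 2;
        return new ImmutableTree(mid, range(lo, mid), range(mid + 1, hi));
    }
}
```

Every field is final and children are constructed before their parent, so a structure of this kind contains no pointer cycles.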

[Figure: proposed Fresh Breeze chip architecture]

Future

Present plans are to demonstrate the proposed system architecture by developing a cycle-accurate simulator for the multiprocessor chip, and a compiler that translates and optimizes Java bytecode class files to machine code. Once these tools are in place, the merit of the Fresh Breeze ideas will be tested for a variety of applications. Without a conventional file system, different approaches to data backup and recovery, user accounts and security, load balancing, etc., will need to be developed. Success in this work will provide justification for development of an FPGA prototype of the multiprocessor in further research.

References

[1] J. B. Dennis. A parallel program execution model supporting modular software construction. In Third Working Conference on Massively Parallel Programming Models, IEEE Computer Society, 1998, pp. 50-60.

[2] G. M. Papadopoulos and D. E. Culler. Monsoon: an explicit token-store architecture. In Proceedings of the 17th Annual International Symposium on Computer Architecture, IEEE Computer Society, 1990, pp. 82-91.

[3] D. M. Tullsen, S. J. Eggers, and H. M. Levy. Simultaneous multithreading: maximizing on-chip parallelism. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, IEEE Computer Society, 1995, pp. 392-403.