MIT LCS
200 Technology Square
Cambridge, MA 02139
617-253-6837

CSG WWW

Fontsize
Computation Structures Group
Top People Publications Projects Courses Miscellaneous

Commit- Reconcile Fences

The memory model of a microprocessor is the semantics of load and store instructions. Recently, many "weak memory models" have arisen as a consequence architectural optimization, rather than from some grand high-level design. Many of these optimizations are transparent for uniprocessors but can be observed in multiprocessor because they break the Sequential Consistency model in one way or another. As a partial remedy all modern microprocessors now provide memory fences to control instruction reordering and overload their semantics to imply something vague about store completion. CRF, our recently proposed mechanism-oriented memory model, can be used as the universal interface between the compiler writer and the underlying architecture. It also enables the development of sophisticated cache coherence protocols.

Interesting Points

Many people have difficulty understanding why and how a memory model can affect them. Does it have impact on the way they write programs? Does it improve the performance? Does it affect the compiler or microarchitecture optimizations? The simple answer is that if we stick to uniprocessor systems and sequential programming the memory model is irrelevant. But, we rarely do! Even sequential programmers may initiate input/output tasks that run concurrently with the main program, and uniprocessors may have operating systems that are expressed as "cooperating sequential tasks." The fact is that the memory model affects everything but often in insidious and surprising ways. Many architectural mechanisms that are transparent in uniprocessors suddenly become visible in multiprocessors. A difference in implementation of memory related instructions could cause a subtle difference in observable behavior, giving rise to a different memory model. That is, the range of permissible behaviors is affected by just about every uniprocessor optimization. The fact that these range of behaviors have a lot of commonality or that the simple sequential execution of instructions is always a permitted behavior is hardly a source of comfort, because it is the additional behaviors that make parallel programs go wrong in surprising ways. Therefore, there is a need to pin down the exact definitions of memory instructions for multiprocessors. Since the same microprocessors and operating systems are used in uniprocessors and multiprocessors, a microprocessor designer has no option but to pay close attention to these subtle differences.

The design of cache coherence protocols plays an important role for shared memory systems and is intimately tied to the memory model. There have been many attempts to develop adaptive cache coherence protocols, but it is very difficult to verify the correctness of all but the simplest ones. A formal memory model that captures the semantics of a cache makes the verification task easier.


Approach

We investigate a new mechanism-oriented memory model called Commit-Reconcile & Fences (CRF), which exposes both data replication and instruction reordering at the ISA level. There are good reasons to be skeptical of yet another memory model. It is essential to have upward compatibility, i.e. the ability to run existing programs correctly on a machine with the CRF model. Furthermore, if compiler writers adopt CRF as the target model then it is important to have downward compatibility, i.e. the ability to run CRF programs well on existing machines. Indeed, CRF has both of these properties vis-a-vis most other existing memory models. Our model covers the spectrum from uniprocessor to SMP to DSM. CRF semantics permit most, if not all, current architectural optimizations. The semantics are clear and precise so that there are no fuzzy cases. By issuing memory access instructions out-of-order, architectural mechanisms, such as register renaming and speculative execution, can affect the observed program behavior in a multiprocessor setting. To date such properties of instruction issue have not been specified precisely. However, it is often possible to reason about program behaviors by making conservative assumptions, e.g. no speculation about instruction issue.
[an error occurred while processing this directive]