# MASSACHUSETTS INSTITUTE OF TECHNOLOGY Laboratory for Computer Science Computation Structures Group Memo 145 The Synchronizer Problem by Glen Seth Miranker This paper was prepared with the partial support of the National Science Foundation under grant DCR75-84060. January 1977 ### Introduction Most digital systems are synchronous in nature. That is they perform a single step of their task in response to an internal clock tick. Frequently, these systems must respond to external signals as well — for example a computer receiving interrupts. In general such a system, at every clock tick must "decide" whether to continue with its normal stream of operation, or respond to the external signal. This requires deciding whether the external signal is true or false 10 or 1) before continuing operation. Since the machine must execute a processing step at each clock tick there is a bounded amount of time to make this decision. One might argue that it would be sufficient to use some bounded number of clock ticks, or even a potentially unbounded number, provided we had some means of knowing at which tick the decision was finally made. However, these alternate criteria are easily shown to be equivalent [5]. A device called a synchronizer is used by the receiver to aid this decision process. Such a device is depicted schematically in figure 1. The desired operation of the device S is simple. We assume that the synchronizer is originally reset – its output is <u>false</u>. At some time $t_c$ , a clock pulse (which is generated by the receiver) arrives at the synchronizer. If the external signal is <u>true</u>, then the output of the synchronizer will be <u>true</u> at some fixed time $t_d = t_c + \tau_d$ , where $\tau_d$ is the intrinsic delay of the synchronizer. Otherwise the output is <u>false</u>. Using such a device the receiver can examine the output of S at some time $t_0 \geq t_d$ , as shown in figure 2, to determine if the external signal was asserted at the last clock tick. If it was, it may choose to take some action, for example start an interrupt routine at the next clock tick. The question arises as to what occurs if the external signal changes at about the same time as the clock. (Without lose of generality, we will assume that the signal changes only from false to true (8 to 1) asynchronously, and resumes the false state only when instructed to do so by the receiver.) Usually, one only cares that the decision of the synchronizer be made by time $t_{d}$ . If the external signal changes too close to the clock and is "missed" (i.e. the output of the synchronizer remains false) then the next clock pulse will result in the output changing to true by time $t_{d}$ . Unfortunately it is not known how to achieve such behaviour in practice. In this note we will briefly discuss anomolous synchronizer operation, and analyze the behaviour of one common approach to improving synchronizer performance. #### The Problem All known techniques for building synchronizers use a bistable device called a flip flop or its equivalent. It has been proven by Hurtado (4) that for synchronizers built in this fashion - 1) Regions of metastability exist. That is under certain conditions, a flip flop's output will assume a value intermediate between true and false, or will oscillate between these two levels. Which of the two (anomolous) behaviours is exhibited is a function of the technology used to implement the flip flop, and is essentially independent of how it was driven into the metastable region. - 2) If a synchronizer is in its metastable region, it will remain there indefinately in the absense of any external signals or noise. Flip flops tend to enter their regions of metastability when the skew between the arrival of the clock pulse and the external signal changing from faise to true is small. The metastable region is always exited since all real systems contain noise. However, flip flops have been observed to take from two to ten times their nominal delay times to leave the metastable region [2]. The difficulty is compounded by the fact that for any synchronizer constructed of bistable elements, one cannot be certain that the synchronizer will have settled at any time $t \geq t_d$ . In fact it is only determinable probabilistically, and is given by a function of the form P(t) $\triangle$ {probability that the synchronizer has not exited the metastable region by time t > $t_d$ } = $\alpha e^{-t/\tau}$ where $\alpha$ , $\tau_d$ , and $\tau$ , are dependent on the circuit implementation and the clock frequency. We can get a feel for the magnitude of the times involved by considering a concrete example. Suppose не use a schottky TTL D flip flop for the synchronizer element, and a system clock period of 350 nanoseconds. Then $\tau \sim 1.1 \text{ns}, \ \alpha \sim 2. \ \text{and} \ \tau_d \sim 10 \ \text{ns}. \ [7]$ Thus the nominal settling time of the flip flop is 18 ns. If we choose to observe the output of the device 20ns after the clock, and we assume that input signals arrive at the flip flop at a rate of $10^4/\text{second}$ , then we get a failure rate of about 2/second. If we increase $t_0$ to 38ns then the failure rate becomes about 1/hour. And if we walt 40ns - 4 times the nominal delay - then the failure rate drops to once per year. One might ask if some circuit technique might reduce the maiting time for a given reliability. While there are some circuits that do improve reliability substantially, they have large overhead in terms of circuitry [1]. Also they tend to have rather large $\tau_d$ which seems to indicate that there is some sort of inherent trade-off between reliability and time [6]. Here we wish to demonstrate that one common and an intuitively appealing approach to achieving higher reliability for a given delay, in fact provides only a very modest improvement by doing a "gedanken experiment". Theoretical and laboratory studies have shown that the probability that a flip flop is metastable at time t, given that it was metastable at time 0 is $e^{-t/\tau}$ . Thus we can regard P(t) as having two parts: ``` P(t) = P(flip flop is metastable e \cdot \tau_d × P(flip flop is metastable e \cdot t > \tau_d|metastability e \cdot \tau_d| \triangleq P_M \times P_E ``` Thus the constant a (equation top of page 3) is a measure of the synchronizers propensity to enter the metastable region. \* characterizes the synchronizers ability to exit the metastable region, and is sometimes referred to as the synchronizers figure of merit. Clearly, $P_M$ is some function of the skew between the clock and the signal to be synchronized, and it is referred to as the absolute conflict window. However, suppose we only allow inputs to the synchronizer that yield a uniform distribution of skews between the signal and clock. Then if we are only interested in the statistics of the synchronizer, we can regard $P_M$ as having a uniform height 1 distribution, with an area equal to the area of the "real" $P_M$ . This new probability, $E_C$ is called the effective conflict window. It is a unit step from $[-W_C/2,W_C/2]$ , where $W_C$ is the area under the distribution function of $P_M$ . This simplification has an appealing intuitive interpretation. It may be understood to imply that the flip flop never enters the metastable state for skews > $W_C/2$ , and always enters the metastable region for a skew time $\leq W_C/2$ . We can further justify this simplification by considering an experiment consisting of a large number of identical trials on a single flip flop. In each trial we choose a skew time from a uniform distribution over the interval [-T,T] and apply a pair of inputs to the synchronizer consisting of a clock and an input signal with the selected skew. At each trial we observe the flip flop at time $\tau_d$ . At the end of the experiment we compute the ratio of the number of trials in which we observed a flip flop in the metastable state to the total number of trials. From this number R, we compute the width of the effective conflict window as T/R. Notice that there is no way to distinguish by the experiment between a flip flop with behaviours $P_M$ and $W_C$ . Critical to this observation is the fact that the skew times be taken from a uniform distribution, since otherwise we could deduce the shape of $P_M$ by selectively concentrating the skew times in various regions. In order to do our gedanken experiment, first consider an experiment consisting of a large number of trials N. At each trial we choose a skew time between the clock and input signal to the synchronizer from a uniform distribution with width [-T/2, T/2], where T is selected so that the probability of anomalous behaviour is negligible (i.e. T >> $W_c$ ). Now we apply the signals to the synchronizer and test whether its output is metastable at times $\tau_d$ , $2\tau_d$ , $3\tau_d$ , . . . We will obtain a curve as shown in figure 3. From this curve we can measure $\tau_s$ and $W_c$ . We can apply a similar analysis to one likely candidate for an improved synchronizer, which is shown in figure 4. It seems that this device would be superior to a single synchronizer $\mathcal B$ since in order for it to fail, the first synchronizer must fail and <sup>1)</sup> Exit the metastable region within the effective conflict window of the second synchronizer. <sup>2)</sup> Remain in the metastable region through time $t_c+\delta+V_C/2$ , where $t_c$ is the clock time of the first synchronizer. Equivalently, the 2 stage synchronizer fails if the first flip flop is metastable for some time t $\geq$ t<sub>c</sub> + 1 - $W_{\rm C}/2$ . For an experiment consisting of N trials of the kind just described [where we observe the output of the second synchronizer of the two stage synchronizer $\beta$ ), the number of times $\beta$ will fail is simply So the effective conflict window of the two stage synchronizer $\mathcal{B}'$ $\mathbb{W}_{2C} \triangleq \mathbb{W}_{C} e^{-(\xi - (\mathbb{W}_{C}/2) - r_{d})/\tau}$ is improved by a factor $$g = (\delta - (W_C/2) - \tau_d) / \tau$$ It appears that the probability of metastability for the two stage synchronizer $P_2(t)$ , is improved by a factor exponential in §. This conclusion is unfounded however. The root of this confusion is that the increased minimum delay time of $\beta'$ is ignored. The relevant comparison then is the ratio I of $$P(t) = U_{C} e^{-(t-r_d)/r}$$ to $P_2(t) = U_{2C} e^{-(t-(2r_d+8))/r}$ This can be simplified to yield $$I = e^{-(2\tau_d + (\mu_c/2))/\tau}$$ Notice that the improvement is constant and not exponential in 5. For a schottky flip flop and a 350ns clock we find that I ~ 10-9 an improvement of about nine orders of magnitude. Though this improvement may seem awasome, it must be remembered that the same improvement is possible by waiting another 20ns with a single stage flip flop. Thus with schottky technology, the extra flip flop improves things by about 20ns. Until it is discovered how to build synchronizers without bistable elements, this speed/hardware/reliability tradeoff seems likely to remain. This ubiquitous problem is often referred to as the synchronizer problem. Even non-synchronous systems sometimes must come to grips with it — for example a dynamic memory i.e. one which must periodically refresh its contents. Though the memory is truly an asynchronous device, it must decide between granting a read/write request, and refreshing itself. If too long a time elapses between refreshes, the information in the memory is lost. The system is similar to a synchronous one in that a decision must be made between two operations in a bounded amount of time. FIGURE 3 FIGURE 4 # **Acknowledgements** This paper is a highly abstracted version of a tutorial on arbiters and synchronizers presented by the author in a seminar on communicating parallel processes (given fall of 1976, Carl Hewitt supervising). The author wishes to thank professor Hewitt for his comments on an earlier draft of this paper. ## **BIBLIOGRAPHY** - 1. Catt, L. Time Loss Through Gating of Asynchronous Logic Signal Pulses-IEEE Transactions on Electronic Computers, vol. EC-15, February 1966. - 2. Chaney, T. J., Molnar, C. E. Anomolous Behaviour of Synchronizer and arbiter Circuits. IEEE Transactions on Computers, vol. C-22, April 1973. - 3. Couranz, G. R. Theoretical and Experimental Behaviour of Synchronizers Operating in the Metastable Region. IEEE Transactions on Computers, vol. C-24, no. 6, June 1975. - 4. Hurtado. M. Structure and Performance of Asymptomically Bistable Dynamic Systems. D.Sc. Dissertation, Washington University, St. Louis, Missouri, 1974. - 5. Patil, S. S. Bounded and Unbounded Delay Synchronizers and Arhiters. Computation Structures Group Nemo 103, Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA. June 1974. - 6. Pechoucek, M. Anomolous Response Times of Input Synchronizers. IEEE Transactions on Computers, February 1976. - 7. Wann. D. F., et al A Fundamental Problem Associated With The Physical Realization of Certain Classes of Petri Nets. Computer Systems Laboratory. Washington University, St. Louis, Missouri.