Metastability in electronics

From Wikipedia, the free encyclopedia

Metastability in electronics is the ability of a non-equilibrium electronic state to persist for a long (and theoretically unbounded) period of time (see asynchronous circuit). This definition does not guarantee all of the properties that are sometimes demanded of a metastable state in statistical mechanics. In practice, the term usually describes a state that does not settle into equilibrium within the time required for proper operation.

Flip-flops

In electronics, the flip-flop is a device that is susceptible to metastability. It has two well-defined stable states, traditionally designated 0 and 1, but under certain conditions it can hover between them for longer than a clock cycle. This condition is known as metastability. In most cases it is considered a failure of the logic design and timing philosophy, or of their implementation.

The most common cause of metastability is a violation of the flip-flop's setup and hold times. The input of the flip-flop should remain stable throughout the window that opens one setup time before the active clock edge and closes one hold time after it; a change in the input inside that window has some probability of putting the flip-flop into a metastable state.
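The setup-and-hold window can be written down as a simple interval test. The following is a minimal sketch, assuming all times are integers in picoseconds and that a transition exactly on a window boundary counts as safe; the function name, parameter names, and example values are hypothetical and are not taken from any datasheet.

```python
def violates_setup_hold(data_edge_ps, clock_edge_ps, t_setup_ps, t_hold_ps):
    """Return True if a data transition falls inside the setup/hold window.

    Assumption: the window opens t_setup_ps before the active clock edge and
    closes t_hold_ps after it, and the boundaries themselves are safe.
    """
    window_start = clock_edge_ps - t_setup_ps
    window_end = clock_edge_ps + t_hold_ps
    return window_start < data_edge_ps < window_end


# Hypothetical flip-flop: 50 ps setup, 20 ps hold, clock edge at t = 1000 ps.
early_enough = violates_setup_hold(900, 1000, 50, 20)   # data settled well before the edge
too_late = violates_setup_hold(990, 1000, 50, 20)       # data changes 10 ps before the edge
```

A violation reported by this test does not mean the flip-flop will go metastable, only that it may; the check marks the transitions for which the device's normal timing guarantees no longer apply.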

In a typical scenario where data travels from the output of a source flip-flop to the input of a target flip-flop, metastability is caused by either:

(1) the target clock having a different frequency than the source clock, in which case the setup and hold time of the target flip-flop will eventually be violated, or

(2) the target and source clocks having the same frequency, but a phase alignment that causes the data to arrive at the target flip-flop inside its setup-and-hold window. This can be caused by fixed overhead or variations in logic delay times on the worst-case path between the two flip-flops, variations in clock arrival times (clock skew), or other causes.[1][2]
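Both cases can be illustrated with a small idealized model. The sketch below assumes that the source data changes on every source clock edge with zero clock-to-output delay, that both clocks are perfectly periodic, and that all times are integers in picoseconds; the function name and the numeric values are hypothetical.

```python
def count_violations(src_period_ps, dst_period_ps, phase_ps,
                     t_setup_ps, t_hold_ps, n_edges):
    """Count source data edges that land in the target's setup/hold window."""
    violations = 0
    for k in range(n_edges):
        data_edge = k * src_period_ps + phase_ps
        # Position of the data edge relative to the previous target clock edge.
        offset = data_edge % dst_period_ps
        in_hold_window = offset < t_hold_ps                     # just after a target edge
        in_setup_window = offset > dst_period_ps - t_setup_ps   # just before the next one
        if in_hold_window or in_setup_window:
            violations += 1
    return violations


# Case (1): slightly different frequencies, so the data edge drifts through
# the target's window sooner or later, whatever the starting phase.
drifting = count_violations(10_000, 10_007, 5_000, 50, 20, 2_000)

# Case (2): equal frequencies. A safe phase never violates the window,
# while an unlucky fixed phase violates it on every single edge.
safe_phase = count_violations(10_000, 10_000, 5_000, 50, 20, 2_000)
unlucky_phase = count_violations(10_000, 10_000, 9_980, 50, 20, 2_000)
```

In the drifting case the relative phase slips by 7 ps per cycle, so no choice of starting phase can keep the data edge out of the 70 ps window indefinitely.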

Arbiters

In electronics, the arbiter is another device that suffers from metastability. Arbiters are used in asynchronous circuits to order requests for shared resources, so that concurrent operations cannot interfere with one another.

Synchronous circuits

Synchronous circuit design techniques produce digital circuits that are resistant to the failure modes that metastability can cause. A "clock domain" is defined as a group of flip-flops with a common clock. Such architectures can form a circuit that is guaranteed free of metastability below a certain maximum clock frequency; above that frequency, first metastability and then outright failure occur. However, this guarantee holds only in the idealized model: real physical systems, with continuous state transition functions that depend on a continuous input, are provably vulnerable to metastable states.[3]
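The maximum clock frequency of a single clock domain is commonly estimated from a timing budget for the slowest register-to-register path. The sketch below assumes the usual textbook budget (clock-to-output delay, plus worst-case logic delay, plus setup time, plus clock skew); the function name, parameter names, and example numbers are illustrative and do not come from the cited references.

```python
def max_clock_frequency_hz(t_clk_to_q_s, t_logic_max_s, t_setup_s, t_skew_s=0.0):
    """Estimate the highest clock frequency that still meets setup time.

    Assumption: the clock period must cover the source flip-flop's
    clock-to-output delay, the slowest logic path, the target flip-flop's
    setup time, and any clock skew that works against the path.
    """
    min_period_s = t_clk_to_q_s + t_logic_max_s + t_setup_s + t_skew_s
    return 1.0 / min_period_s


# Hypothetical path: 1 ns clock-to-output, 3 ns of logic, 1 ns setup.
f_max = max_clock_frequency_hz(1e-9, 3e-9, 1e-9)   # about 200 MHz
```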

When synchronous design techniques are used, protection against metastable events that cause system failures need only be provided when transferring data between different clock domains or from an unclocked region into the synchronous system. The common style of request-and-acknowledge "mailbox flag" handshaking is one way to accomplish this, at the cost of uncertainty as to whether the data will arrive in time to be transferred on any particular clock tick or will have to wait for a later one.
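A request-and-acknowledge handshake of this kind can be sketched as a small behavioural simulation. The model below is an assumption-laden illustration, not a description of any particular circuit: it uses a four-phase protocol, passes each flag through a two-stage shift register in the receiving domain, and treats every flip-flop as ideal, so it shows the ordering and the variable latency of the transfer rather than metastability itself. All names and clock periods are hypothetical.

```python
def simulate_handshake(words, src_period, dst_period, dst_phase=3, max_time=10_000):
    """Transfer words across two clock domains with a req/ack mailbox flag."""
    pending = list(words)
    received = []          # (receiver tick time, word) pairs
    mailbox = None         # shared data register, written only by the sender
    req, ack = 0, 0        # req is owned by the sender, ack by the receiver
    req_sync = [0, 0]      # receiver-side two-stage synchronizer for req
    ack_sync = [0, 0]      # sender-side two-stage synchronizer for ack
    t_src, t_dst = 0, dst_phase

    while len(received) < len(words) and min(t_src, t_dst) <= max_time:
        if t_src <= t_dst:
            # Sender clock tick: shift ack through the sender's synchronizer.
            ack_sync = [ack, ack_sync[0]]
            seen_ack = ack_sync[1]
            if req == 0 and seen_ack == 0 and pending:
                mailbox = pending.pop(0)   # write the data first, then raise the flag
                req = 1
            elif req == 1 and seen_ack == 1:
                req = 0                    # the receiver has taken the word
            t_src += src_period
        else:
            # Receiver clock tick: shift req through the receiver's synchronizer.
            req_sync = [req, req_sync[0]]
            seen_req = req_sync[1]
            if seen_req == 1 and ack == 0:
                received.append((t_dst, mailbox))   # mailbox is stable while req is high
                ack = 1
            elif seen_req == 0 and ack == 1:
                ack = 0                    # handshake complete, ready for the next word
            t_dst += dst_period
    return received


transfers = simulate_handshake(["a", "b", "c"], src_period=10, dst_period=7)
```

Only the two single-bit flags cross between domains while they may be changing; the mailbox contents are read only after the request flag has been seen high, by which time the data has long been stable. The tick on which each word is taken depends on the relative phase of the two clocks, which is the latency uncertainty described above.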

Synchronous systems with one clock have another reliability advantage, concerning electrical noise, that is separate from metastability and should be distinguished from it. On each clock cycle of the system, before the clock is applied to the flip-flops, enough time must have passed since the last clock for all flip-flop outputs to reach a stable 0 or 1 level, and for all the signals derived from these levels to propagate through the gating and form stable electrical levels at the data inputs of all flip-flops in the system. At the moment the clock pulse arrives at the flip-flops, they read in these stable values. During this brief part of the cycle the flip-flops are sensitive to electrical noise distorting the correct value of the data input. After a slight delay the outputs begin to change to the values just read in. This is followed by a large amount of electrical switching noise as these changes propagate through the gates. Eventually, after the maximum propagation time through the combinational logic, as set by the design, all flip-flop data inputs are stable once again. After a slight delay the next clock tick arrives at the flip-flops and the process repeats.

In effect, the electrical noise is synchronous with the clock, and the flip-flops take their new value at the quietest time in the cycle. When additional clocks not synchronized to the first are introduced, the electrical noise associated with these clocks drifts through time relative to the first clock. By straightforward statistics based on the probability of overlapping in time, this noise will sometimes disturb the data inputs of flip-flops during the vulnerable moment when they are reading in their new values. Metastability is a distinct issue, different from this electrical noise issue, although the two are sometimes confused, as both involve flip-flops loading erroneous values and both point to the need to minimize the number of independent clock sources in a circuit.

Failure modes

Although metastability is well understood and architectural techniques to control it are known, it persists as a failure mode in equipment.

Serious computer and digital hardware bugs caused by metastability have a fascinating social history. Many engineers have refused to believe that a bistable device can enter a state that is neither "true" nor "false" and that has a positive probability of remaining indefinite for any given period of time, albeit a probability that decreases exponentially over time. However, metastability is an inevitable result of any attempt to map a continuous domain to a discrete one. There will always be points in the continuous domain that are equidistant (or nearly so) from the points of the discrete domain, making the decision as to which discrete point to select a difficult and potentially lengthy process.[4] If the inputs to an arbiter or flip-flop arrive almost simultaneously, the circuit will most likely traverse a point of metastability.

Metastability remains poorly understood in some circles, and various engineers have proposed their own "pet circuits" said to "solve" or "filter out" the metastability. Typically these circuits simply shift the occurrence of metastability from one place to another.[5]

Chips using multiple clock sources are often tested with tester clocks that have fixed phase relationships, not the independent clocks drifting past each other that will be experienced during operation. This usually prevents the metastable failure mode that will occur in the field from being seen or reported during testing. Current engineering solutions to this problem are often well-characterized, multistage common-clock shift registers.
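The exponential decay of the probability of remaining undecided is what makes multistage shift-register synchronizers effective: each extra stage gives the first flip-flop roughly one more clock period in which to resolve. The sketch below uses the widely quoted first-order model, in which that probability falls as exp(-t/tau) and the mean time between failures is exp(t/tau) divided by the rate at which the metastability window is hit. The resolution time constant tau, the window width, and all numeric values are hypothetical device parameters chosen only for illustration, and the function names are assumptions.

```python
import math


def unresolved_probability(t_resolve_s, tau_s):
    """Probability that a metastable flip-flop is still undecided after t_resolve_s."""
    return math.exp(-t_resolve_s / tau_s)


def synchronizer_mtbf_s(t_resolve_s, tau_s, t_window_s, f_clk_hz, f_data_hz):
    """Mean time between synchronizer failures under the first-order model."""
    # Rate at which asynchronous data edges land inside the vulnerable window.
    entry_rate_hz = t_window_s * f_clk_hz * f_data_hz
    return math.exp(t_resolve_s / tau_s) / entry_rate_hz


# Hypothetical device: tau = 50 ps, 100 ps window, 100 MHz clock, 1 MHz data.
TAU, WINDOW, F_CLK, F_DATA = 50e-12, 100e-12, 100e6, 1e6
PERIOD = 1.0 / F_CLK

one_stage = synchronizer_mtbf_s(PERIOD - 1e-9, TAU, WINDOW, F_CLK, F_DATA)
two_stage = synchronizer_mtbf_s(2 * PERIOD - 1e-9, TAU, WINDOW, F_CLK, F_DATA)
improvement = two_stage / one_stage   # grows as exp(PERIOD / TAU)
```

Under this model the failure rate never reaches zero; adding stages makes failures rarer and also adds latency, which matches the observation above that such circuits manage metastability rather than eliminate it.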

References

  1. "Interfacing Two Clock Domains". Asic World.
  2. Chuck Benz. "Fifos and Ring Buffers".
  3. Kleeman, L.; Cantoni, A. "Metastable Behavior in Digital Systems". doi:10.1109/MDT.1987.295189.
  4. Leslie Lamport. "Buridan's Principle". December 1984.
  5. Ran Ginosar. "Fourteen Ways to Fool Your Synchronizer". ASYNC 2003.
