Expert Verilog, SystemVerilog & Synthesis Training
Simulation and Synthesis Techniques for Asynchronous
FIFO Design
Clifford E. Cummings, Sunburst Design, Inc.
cliffc@sunburst-design.com
ABSTRACT
FIFOs are often used to safely pass data from one clock domain to another asynchronous clock domain. Using a
FIFO to pass data from one clock domain to another clock domain requires multi-asynchronous clock design
techniques. There are many ways to design a FIFO wrong. There are many ways to design a FIFO right but still
make it difficult to properly synthesize and analyze the design.
This paper will detail one method that is used to design, synthesize and analyze a safe FIFO between different clock
domains using Gray code pointers that are synchronized into a different clock domain before testing for "FIFO full"
or "FIFO empty" conditions. The fully coded, synthesized and analyzed RTL Verilog model (FIFO Style #1) is
included.
Post-SNUG Editorial Comment
A second FIFO paper by the same author was voted “Best Paper - 1st Place” by SNUG attendees, is listed as
reference [3] and is also available for download.
Introduction
1.0
An asynchronous FIFO refers to a FIFO design where data values are written to a FIFO buffer from one clock
domain and the data values are read from the same FIFO buffer from another clock domain, where the two clock
domains are asynchronous to each other.
Asynchronous FIFOs are used to safely pass data from one clock domain to another clock domain.
There are many ways to do asynchronous FIFO design, including many wrong ways. Most incorrectly implemented
FIFO designs still function properly 90% of the time. Most almost-correct FIFO designs function properly 99%+ of
the time. Unfortunately, FIFOs that work properly 99%+ of the time have design flaws that are usually the most
difficult to detect and debug (if you are lucky enough to notice the bug before shipping the product), or the most
costly to diagnose and recall (if the bug is not discovered until the product is in the hands of a dissatisfied
customer).
This paper discusses one FIFO design style and important details that must be considered when doing asynchronous
FIFO design.
The rest of the paper simply refers to an “asynchronous FIFO” as just “FIFO.”
2.0 Passing multiple asynchronous signals
Attempting to synchronize multiple changing signals from one clock domain into a new clock domain and insuring
that all changing signals are synchronized to the same clock cycle in the new clock domain has been shown to be
problematic[1]. FIFOs are used in designs to safely pass multi-bit data words from one clock domain to another.
Data words are placed into a FIFO buffer memory array by control signals in one clock domain, and the data words
are removed from another port of the same FIFO buffer memory array by control signals from a second clock
domain. Conceptually, the task of designing a FIFO with these assumptions seems to be easy.
The difficulty associated with doing FIFO design is related to generating the FIFO pointers and finding a reliable
way to determine full and empty status on the FIFO.
2.1 Synchronous FIFO pointers
For synchronous FIFO design (a FIFO where writes to, and reads from the FIFO buffer are conducted in the same
clock domain), one implementation counts the number of writes to, and reads from the FIFO buffer to increment (on
FIFO write but no read), decrement (on FIFO read but no write) or hold (no writes and reads, or simultaneous write
and read operation) the current fill value of the FIFO buffer. The FIFO is full when the FIFO counter reaches a
predetermined full value and the FIFO is empty when the FIFO counter is zero.
Unfortunately, for asynchronous FIFO design, the increment-decrement FIFO fill counter cannot be used, because
two different and asynchronous clocks would be required to control the counter. To determine full and empty status
for an asynchronous FIFO design, the write and read pointers will have to be compared.
2.2 Asynchronous FIFO pointers
In order to understand FIFO design, one needs to understand how the FIFO pointers work. The write pointer always
points to the next word to be written; therefore, on reset, both pointers are set to zero, which also happens to be the
next FIFO word location to be written. On a FIFO-write operation, the memory location that is pointed to by the
write pointer is written, and then the write pointer is incremented to point to the next location to be written.
Similarly, the read pointer always points to the current FIFO word to be read. Again on reset, both pointers are reset
to zero, the FIFO is empty and the read pointer is pointing to invalid data (because the FIFO is empty and the empty
flag is asserted). As soon as the first data word is written to the FIFO, the write pointer increments, the empty flag is
cleared, and the read pointer that is still addressing the contents of the first FIFO memory word, immediately drives
that first valid word onto the FIFO data output port, to be read by the receiver logic. The fact that the read pointer is
always pointing to the next FIFO word to be read means that the receiver logic does not have to use two clock
periods to read the data word. If the receiver first had to increment the read pointer before reading a FIFO data
SNUG San Jose 2002
Rev 1.2
2
Simulation and Synthesis Techniques for
Asynchronous FIFO Design
word, the receiver would clock once to output the data word from the FIFO, and clock a second time to capture the
data word into the receiver. That would be needlessly inefficient.
The FIFO is empty when the read and write pointers are both equal. This condition happens when both pointers are
reset to zero during a reset operation, or when the read pointer catches up to the write pointer, having read the last
word from the FIFO.
A FIFO is full when the pointers are again equal, that is, when the write pointer has wrapped around and caught up
to the read pointer. This is a problem. The FIFO is either empty or full when the pointers are equal, but which?
One design technique used to distinguish between full and empty is to add an extra bit to each pointer. When the
write pointer increments past the final FIFO address, the write pointer will increment the unused MSB while setting
the rest of the bits back to zero as shown in Figure 1 (the FIFO has wrapped and toggled the pointer MSB). The
same is done with the read pointer. If the MSBs of the two pointers are different, it means that the write pointer has
wrapped one more time that the read pointer. If the MSBs of the two pointers are the same, it means that both
pointers have wrapped the same number of times.
Figure 1 - FIFO full and empty conditions
Using n-bit pointers where (n-1) is the number of address bits required to access the entire FIFO memory buffer, the
FIFO is empty when both pointers, including the MSBs are equal. And the FIFO is full when both pointers, except
the MSBs are equal.
The FIFO design in this paper uses n-bit pointers for a FIFO with 2(n-1) write-able locations to help handle full and
empty conditions. More design details related to the full and empty logic are included in section 5.0.
SNUG San Jose 2002
Rev 1.2
3
Simulation and Synthesis Techniques for
Asynchronous FIFO Design
2.3 Binary FIFO pointer considerations
Trying to synchronize a binary count value from one clock domain to another is problematic because every bit of an
n-bit counter can change simultaneously (example 7->8 in binary numbers is 0111->1000, all bits changed). One
approach to the problem is sample and hold periodic binary count values in a holding register and pass a
synchronized ready signal to the new clock domain. When the ready signal is recognized, the receiving clock
domain sends back a synchronized acknowledge signal to the sending clock domain. A sampled pointer must not
change until an acknowledge signal is received from the receiving clock domain. A count-value with multiple
changing bits can be safely transferred to a new clock domain using this technique. Upon receipt of an acknowledge
signal, the sending clock domain has permission to clear the ready signal and re-sample the binary count value.
Using this technique, the binary counter values are sampled periodically and not all of the binary counter values can
be passed to a new clock domain The question is, do we need to be concerned about the case where a binary counter
might continue to increment and overflow or underflow the FIFO between sampled counter values? The answer is
no[8].
FIFO full occurs when the write pointer catches up to the synchronized and sampled read pointer. The synchronized
and sampled read pointer might not reflect the current value of the actual read pointer but the write pointer will not
try to count beyond the synchronized read pointer value. Overflow will not occur[8].
FIFO empty occurs when the read pointer catches up to the synchronized and sampled write pointer. The
synchronized and sampled write pointer might not reflect the current value of the actual write pointer but the read
pointer will not try to count beyond the synchronized write pointer value. Underflow will not occur[8].More
observations about this technique of sampling binary pointers with a synchronized ready-acknowledge pair of
handshaking signals are detailed in section 7.0, after the discussion of synchronized Gray[6] code pointers.
A common approach to FIFO counter-pointers, is to use Gray code counters. Gray codes only allow one bit to
change for each clock transition, eliminating the problem associated with trying to synchronize multiple changing
signals on the same clock edge.
2.4 FIFO testing troubles
Testing a FIFO design for subtle design problems is nearly impossible to do. The problem is rooted in the fact that
FIFO pointers in an RTL simulation behave ideally, even though, if incorrectly implemented, they can cause
catastrophic failures if used in a real design.
In an RTL simulation, if binary-count FIFO pointers are included in the design all of the FIFO pointer bits will
change simultaneously; there is no chance to observe synchronization and comparison problems. In a gate-level
simulation with no backannotated delays, there is only a slight chance of observing a problem if the gate transitions
are different for rising and falling edge signals, and even then, one would have to get lucky and have the correct
sequence of bits changing just prior to and just after a rising clock edge. For higher speed designs, the delay
differences between rising and falling edge signals diminishes and the probability of detecting problems also
diminishes. Finding actual FIFO design problems is greatest for gate-level designs with backannotated delays, but
even doing this type of simulation, finding problems will be difficult to do and again the odds of observing the
design problems decreases as signal propagation delays diminish.
Clearly the answer is to recognize that there are potential FIFO design problems and to do the design correctly from
the start.
The behavioral model that I sometimes use for testing a FIFO design is a FIFO model that is simple to code, is
accurate for behavioral testing purposes and would be difficult to debug if it were used as an RTL synthesis model.
This FIFO model is only recommended for use in a FIFO testbench. The model accurately determines when FIFO
full and empty status bits should be set and can be used to determine the data values that should have been stored
into a working FIFO. THIS FIFO MODEL IS NOT SAFE FOR SYNTHESIS!
module beh_fifo (rdata, wfull, rempty, wdata,
winc, wclk, wrst_n, rinc, rclk, rrst_n);
SNUG San Jose 2002
Rev 1.2
4
Simulation and Synthesis Techniques for
Asynchronous FIFO Design
parameter DSIZE = 8;
parameter ASIZE = 4;
output [DSIZE-1:0] rdata;
output wfull;
output rempty;
input [DSIZE-1:0] wdata;
input winc, wclk, wrst_n;
input rinc, rclk, rrst_n;
reg [ASIZE:0] wptr, wrptr1, wrptr2, wrptr3;
reg [ASIZE:0] rptr, rwptr1, rwptr2, rwptr3;
parameter MEMDEPTH = 1<
3.0 Gray code counter - Style #1
Gray codes are named for the person who originally patented the code back in 1953, Frank Gray[6]. There are
multiple ways to design a Gray code counter. This section details a simple and straight forward method to do the
design. The technique described in this paper uses just one set of flip-flops for the Gray code counter. A second
method that uses two sets of flip-flops to achieve higher speeds is detailed in shown in section 4.0.
3.1 Gray code patterns
For reasons that will be described later, it is desirable to create both an n-bit Gray code counter and an (n-1)-bit
Gray code counter. It would certainly be easy to create the two counters separately, but it is also easy and efficient
to create a common n-bit Gray code counter and then modify the 2nd MSB to form an (n-1)-bit Gray code counter
with shared LSBs. In this paper, this will be called a “dual n-bit Gray code counter.”
Figure 2 - n-bit Gray code converted to an (n-1)-bit Gray code
To better understand the problem of converting an n-bit Gray code to an (n-1)-bit Gray code, consider the example
of creating a dual 4-bit and 3-bit Gray code counter as shown in Figure 2.
The most common Gray code, as shown in Figure 2, is a reflected code where the bits in any column except the
MSB are symmetrical about the sequence mid-point[6]. This means that the second half of the 4-bit Gray code is a
mirror image of the first half with the MSB inverted.
To convert a 4-bit to a 3-bit Gray code, we do not want the LSBs of the second half of the 4-bit sequence to be a
mirror image of the LSBs of the first half, instead we want the LSBs of the second half to repeat the 4-bit LSB-
sequence of the first half.
Upon closer examination, it is obvious that inverting the second MSB of the second half of the 4-bit Gray code will
produce the desired 3-bit Gray code sequence in the three LSBs of the 4-bit sequence. The only other problem is
that the 3-bit Gray code with extra MSB is no longer a true Gray code because when the sequence changes from 7
(Gray 0100) to 8 (~Gray 1000) and again from 15 (~Gray 1100) to 0 (Gray 0000), two bits are changing instead of
just one bit. A true Gray code only changes one bit between counts.
SNUG San Jose 2002
Rev 1.2
6
Simulation and Synthesis Techniques for
Asynchronous FIFO Design
3.2 Gray code counter basics
The first fact to remember about a Gray code is that the code distance between any two adjacent words is just 1
(only one bit can change from one Gray count to the next). The second fact to remember about a Gray code counter
is that most useful Gray code counters must have power-of-2 counts in the sequence. It is possible to make a Gray
code counter that counts an even number of sequences but conversions to and from these sequences are generally
not as simple to do as the standard Gray code. Also note that there are no odd-count-length Gray code sequences so
one cannot make a 23-deep Gray code. This means that the technique described in this paper is used to make a FIFO
that is 2n deep.
Figure 3 is a block diagram for a style #1 dual n-bit Gray code counter. The style #1 Gray code counter assumes that
the outputs of the register bits are the Gray code value itself (ptr, either wptr or rptr). The Gray code outputs
are then passed to a Gray-to-binary converter (bin), which is passed to a conditional binary-value incrementer to
generate the next-binary-count-value (bnext), which is passed to a binary-to-Gray converter that generates the
next-Gray-count-value (gnext), which is passed to the register inputs. The top half of the Figure 3 block diagram
shows the described logic flow while the bottom half shows logic related to the second Gray code counter as
described in the next section.
Figure 3 - Dual n-bit Gray code counter block diagram - style #1
3.3 Dual n-bit Gray code counter
A dual n-bit Gray code counter is a Gray code counter that generates both an n-bit Gray code sequence (described in
section 3.2) and an (n-1)-bit Gray code sequence.
The (n-1)-bit Gray code is simply generated by doing an exclusive-or operation on the two MSBs of the n-bit Gray
code to generate the MSB for the (n-1)-bit Gray code. This is combined with the (n-2) LSBs of the n-bit Gray code
counter to form the (n-1)-bit Gray code counter[5].
SNUG San Jose 2002
Rev 1.2
7
Simulation and Synthesis Techniques for
Asynchronous FIFO Design
3.4 Additional Gray code counter considerations
The binary-value incrementer is conditioned with either an “if not full” or “if not empty” test as shown in Figure 3,
to insure that the appropriate FIFO pointer will not increment during FIFO-full or FIFO-empty conditions that could
lead to overflow or underflow of the FIFO buffer.
If the logic block that sends data to the FIFO reliably stops sending data when a FIFO full condition is asserted, the
FIFO design might be streamlined by removing the full-testing logic from the FIFO write pointer.
The FIFO pointer itself does not protect the FIFO buffer from being overwritten, but additional conditioning logic
could be added to the FIFO memory buffer to insure that a write_enable signal could not be activated during a FIFO
full condition.
An additional “sticky” status bit, either ovf (overflow) or unf (underflow), could be added to the pointer design to
indicate that an additional FIFO write operation occurred during full or an additional FIFO read operation occurred
during empty to indicate error conditions that could only be cleared during reset.
A safe, general purpose FIFO design will include the above safeguards at the expense of a slightly larger and
perhaps slower implementation. This is a good idea since a future co-worker might try to copy and reuse the code in
another design without understanding all of the important details that were considered for the current design.
4.0 Gray code counter - Style #2
Starting with version 1.2 of this paper, the FIFO implementation uses the Gray code counter style #2, which actually
employs two sets of registers to eliminate the need to translate Gray pointer values to binary values. The second set
of registers (the binary registers) can also be used to address the FIFO memory directly without the need to translate
memory addresses into Gray codes. The n-bit Gray-code pointer is still required to synchronize the pointers into the
opposite clock domains, but the n-1-bit binary pointers can be used to address memory directly. The binary pointers
also make it easier to run calculations to generate "almost-full" and "almost-empty" bits if desired (not shown in this
paper).
Figure 4 - Dual n-bit Gray code counter block diagram - style #2
SNUG San Jose 2002
Rev 1.2
8
Simulation and Synthesis Techniques for
Asynchronous FIFO Design