Wormhole switching
Wormhole flow control, also called wormhole switching or wormhole routing, is a simple flow control scheme in computer networking based on known fixed links. It is a member of the family of flow control methods known as flit-buffer flow control.[1]:Chapter 13.2.1
Switching is a more appropriate term than routing, as "routing" refers to the route or path taken to reach the destination.[2] The wormhole technique does not dictate the route to the destination; it only decides when the packet moves forward from a router. Cut-through switching, commonly called "virtual cut-through," operates in a similar manner; the major difference is that cut-through flow control allocates buffers and channel bandwidth at the packet level, while wormhole flow control does so at the flit level. In most respects, wormhole switching is very similar to ATM or MPLS forwarding, except that the cell does not have to be queued.
Wormhole switching is sometimes called cut-through switching.[3]
Large network packets are broken into small pieces called flits (flow control digits). The first flit, called the head flit (or header flit), holds information about the packet's route (namely the destination address) and sets up the routing behavior for all subsequent flits of the packet. The head flit is followed by zero or more body flits, which contain the actual data payload. The final flit, called the tail flit, performs some bookkeeping to close the connection between the two nodes.
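As an illustration, the short sketch below shows one way a packet could be decomposed into a head flit carrying the destination, body flits carrying the payload, and a tail flit. The names used here (such as FLIT_PAYLOAD_BYTES) are assumptions for the sketch; the sources do not prescribe a particular flit format.

```python
from dataclasses import dataclass
from typing import List, Optional

FLIT_PAYLOAD_BYTES = 4   # assumed flit payload size for this sketch


@dataclass
class Flit:
    kind: str             # "head", "body", or "tail"
    dest: Optional[int]   # destination address, carried only by the head flit
    payload: bytes        # data carried by body and tail flits


def packetize(dest: int, data: bytes) -> List[Flit]:
    """Split one packet into a head flit, zero or more body flits, and a tail flit."""
    chunks = [data[i:i + FLIT_PAYLOAD_BYTES]
              for i in range(0, len(data), FLIT_PAYLOAD_BYTES)] or [b""]
    flits = [Flit("head", dest, b"")]
    flits += [Flit("body", None, c) for c in chunks[:-1]]
    flits.append(Flit("tail", None, chunks[-1]))   # tail flit closes the connection
    return flits


print([f.kind for f in packetize(dest=3, data=b"hello world")])
# ['head', 'body', 'body', 'tail']
```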
A distinctive feature of wormhole flow control is its use of virtual channels:
A virtual channel holds the state needed to coordinate the handling of the flits of a packet over a channel. At a minimum, this state identifies the output channel of the current node for the next hop of the route and the state of the virtual channel (idle, waiting for resources, or active). The virtual channel may also include pointers to the flits of the packet that are buffered on the current node and the number of flit buffers available on the next node.[1]:237
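The following minimal sketch mirrors that description as a data structure. The field names (output_channel, downstream_credits, and so on) are illustrative rather than taken from any particular router implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class VCState(Enum):
    IDLE = "idle"
    WAITING = "waiting"   # waiting for resources (e.g. the chosen output channel)
    ACTIVE = "active"


@dataclass
class VirtualChannel:
    state: VCState = VCState.IDLE
    output_channel: Optional[int] = None                # output channel chosen for the next hop
    flit_buffer: deque = field(default_factory=deque)   # flits of the packet buffered on this node
    downstream_credits: int = 0                         # free flit buffers on the next node
```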
The name "wormhole" plays on the way packets are sent over the links: the address is so short that it can be translated before the message itself arrives. This allows the router to quickly set up the routing of the actual message and then "bow out" of the rest of the conversation. Since a packet is transmitted flit by flit, it may occupy several flit buffers along its path, creating a worm-like image.
Example
A wormhole flow control transmission may work as follows. Each node contains a router that determines which path the packet takes through the network and holds the virtual channel state information (a toy code sketch of these steps follows the list):
- A packet, Pa, at an upstream node, say N1, attempts to allocate an input virtual channel on a downstream node, N2, to reach its destination, N3. The input VC (virtual channel) on each side of each node (call them N, S, E, W) holds flit buffers and, in this case, records whether that input virtual channel is waiting, idle, or active, as well as which output virtual channel it is attempting to acquire. An output VC records only which input virtual channel has reserved it.
- Pa's header flit arrives at N2's West input VC, which happens to be in the idle state, so assuming we can buffer two flits, Pa's header flit and first body flit are buffered.
- Pa wants to use N2's East output VC to reach N3, so it records that in the VC state, but this output VC is currently in use by another packet, Pb, coming from the North. Pa is now blocked, so the West input VC on N2 enters the wait state. Note that N2's East output VC records that it is reserved by the North input VC. N1 cannot send any more flits to N2 now because the flit buffer is full.
- Pb finishes transmitting and N2's East output VC becomes available.
- Pa can now transmit to N3 so the West input VC enters the active state, and the East output VC specifies that it is reserved by W.
- Pa continues this transmission process until it reaches its destination.
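The toy sketch below walks through the blocking and unblocking steps of this example in code. The node and port names (n2, W, N, E) follow the example; the allocate_output helper is hypothetical and only tracks VC states and output reservations, not actual flit movement.

```python
# A toy walk-through of the steps above (hypothetical node/port names).

def allocate_output(router, in_port, out_port):
    """Try to reserve an output VC for a head flit waiting on in_port."""
    if router["out"][out_port]["reserved_by"] is None:
        router["out"][out_port]["reserved_by"] = in_port
        router["in"][in_port]["state"] = "active"
        return True
    router["in"][in_port]["state"] = "waiting"   # blocked; backpressure builds upstream
    return False


# N2 with West/North inputs and an East output; two-flit input buffers
n2 = {
    "in":  {"W": {"state": "idle", "buffer": [], "capacity": 2},
            "N": {"state": "active", "buffer": [], "capacity": 2}},
    "out": {"E": {"reserved_by": "N"}},            # Pb (from the North) already holds E
}

# Pa's head flit arrives on W and asks for E: it blocks
print(allocate_output(n2, "W", "E"))               # False -> W enters the "waiting" state

# Pb's tail flit frees E; Pa retries and succeeds
n2["out"]["E"]["reserved_by"] = None
print(allocate_output(n2, "W", "E"))               # True  -> W is "active", E reserved by W
```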
Note that when Pa was blocked by Pb, the upstream node could not transmit any more packets downstream. This may extend upstream all the way to the source node as flit buffers fill up due to the blocking. This is an example of backpressure.
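One common way to realize this backpressure is credit-based flow control, in which each upstream link keeps a credit count equal to the number of free flit buffers downstream (the "number of flit buffers available on the next node" in the virtual channel state quoted above). The sketch below assumes two-flit buffers per hop; the link names are illustrative.

```python
# A hedged sketch of credit-based backpressure (assumed link names and buffer sizes).
credits = {"src->N1": 2, "N1->N2": 2}    # two-flit buffers per hop


def send_flit(link):
    """Send one flit over `link` if the downstream buffer still has room."""
    if credits[link] > 0:
        credits[link] -= 1
        return True
    return False                          # blocked: no credits, the flit stays put


# While Pa is blocked, N2's West buffer drains no flits, so no credits return
# on N1->N2; once those credits run out, N1's own buffer fills and src->N1
# stalls as well -- the stall ripples back toward the source.
while send_flit("N1->N2"):
    pass
print(credits)   # {'src->N1': 2, 'N1->N2': 0}; next, src->N1 would drain too
```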
Advantages
- Wormhole flow control makes more efficient use of buffers than cut-through. Where cut-through requires buffer space for entire packets at each node, the wormhole method needs only a few flit buffers.
- An entire packet need not be buffered before moving on to the next node, which increases throughput and decreases network latency compared to store-and-forward switching.
- Bandwidth and channel allocation are decoupled.
Wormhole techniques are primarily used in multiprocessor systems, notably hypercubes. In a hypercube computer each CPU is attached to several neighbours in a fixed pattern, which reduces the number of hops from one CPU to another. Each CPU is given a number (typically only 8 to 16 bits) that serves as its network address, and packets are sent with this number in the header. When a packet arrives at an intermediate router for forwarding, the router examines the header (very quickly), sets up a circuit to the next router, and then bows out of the conversation. This noticeably reduces latency (delay) compared to store-and-forward switching, which waits for the whole packet before forwarding.

More recently, wormhole flow control has found applications in network-on-chip (NoC) systems, of which multi-core processors are one flavor. Here many processor cores, or at a lower level even functional units, can be connected in a network on a single IC package. As wire delays and many other non-scalable constraints on linked processing elements become the dominant design factor, engineers are turning to simpler, organized interconnection networks, in which flow control methods play an important role.
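As a concrete illustration of header-based forwarding in a hypercube, the sketch below uses dimension-order (e-cube) routing, in which the next hop is found by flipping the lowest bit in which the current and destination addresses differ. The article does not prescribe a particular routing algorithm, so this is only one possibility.

```python
from typing import Optional


def next_hop(current: int, dest: int) -> Optional[int]:
    """Flip the lowest differing address bit to move one hop closer to dest."""
    diff = current ^ dest
    if diff == 0:
        return None                      # already at the destination
    lowest_bit = diff & -diff            # lowest dimension still to be corrected
    return current ^ lowest_bit


# Route from node 0b000 to node 0b101 in a 3-dimensional hypercube:
node = 0b000
while node != 0b101:
    node = next_hop(node, 0b101)
    print(bin(node))                     # 0b1, then 0b101
```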
An extension of wormhole flow control is virtual-channel flow control, where multiple virtual channels are provided for each input port.
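A minimal sketch of such an input port is shown below. The class and field names are assumptions for illustration; the point is only that a packet blocked on one virtual channel leaves the port's other virtual channels free for other traffic.

```python
from collections import deque
from typing import Optional


class InputPort:
    """An input port with several virtual channels, each with its own flit buffer."""

    def __init__(self, num_vcs: int = 2, depth: int = 2):
        self.vcs = [{"state": "idle", "buffer": deque(maxlen=depth)}
                    for _ in range(num_vcs)]

    def free_vc(self) -> Optional[int]:
        """Return the index of an idle virtual channel, or None if all are in use."""
        for i, vc in enumerate(self.vcs):
            if vc["state"] == "idle":
                return i
        return None


west = InputPort(num_vcs=2)
a = west.free_vc()                 # packet Pa takes VC 0 and later blocks
west.vcs[a]["state"] = "waiting"
b = west.free_vc()                 # another packet can still use VC 1
print(a, b)                        # 0 1
```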
References
- [1] William James Dally; Brian Towles (2004). Principles and Practices of Interconnection Networks. Morgan Kaufmann Publishers. Section 13.2.1. ISBN 978-0-12-200751-4.
- [2] John L. Hennessy; David A. Patterson (2006). Computer Architecture: A Quantitative Approach (4th ed.). Morgan Kaufmann Publishers. Appendix E.5. ISBN 978-0-12-370490-0.
- [3] Stefan Haas (1998). The IEEE 1355 Standard: Developments, Performance and Application in High Energy Physics. p. 59.