Asynchronous array of simple processors
The asynchronous array of simple processors (AsAP) architecture comprises a 2-D array of reduced-complexity programmable processors with small memories, interconnected by a reconfigurable mesh network. AsAP was developed by researchers in the VLSI Computation Laboratory (VCL) at the University of California, Davis, and achieves high performance and high energy efficiency while requiring a relatively small circuit area.
AsAP processors are well suited for implementation in future fabrication technologies and are clocked in a Globally Asynchronous Locally Synchronous (GALS) fashion. Individual oscillators halt fully (leaving only leakage) in 9 cycles when there is no work to do, and restart at full speed in less than one cycle once work is available. The chip requires no crystal oscillators, PLLs, DLLs, or any global frequency or phase-related signals.
The multi-processor architecture exploits the task-level parallelism found in many complex DSP applications, and also computes many large tasks efficiently using fine-grain parallelism.
Key features
AsAP has several novel key features, four of which are listed here.
- Chip multi-processor (CMP) architecture designed to achieve high performance and low power for many DSP applications.
- Small memories and a simple architecture in each processor to achieve high energy efficiency.
- Globally Asynchronous Locally Synchronous (GALS) clocking simplifies the clock design, makes the array much easier to scale, and can be used to further reduce power dissipation.
- Inter-processor communication uses a nearest-neighbor network, which avoids long global wires and improves scalability both to large arrays and to advanced fabrication technologies. Each processor can receive data from any two neighbors and send data to any combination of its four neighbors.
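The connectivity rule in the last item (receive from any two neighbors, send to any combination of the four) can be illustrated with a short Python sketch. This is a minimal model under stated assumptions, not the chip's actual configuration format: the direction names, the `LinkConfig` class, and the `neighbors` and `validate` helpers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical names for the four nearest neighbors and their grid offsets.
OFFSETS = {"north": (-1, 0), "east": (0, 1), "south": (1, 0), "west": (0, -1)}
MAX_INPUTS = 2  # from the text: a processor receives from any two neighbors


@dataclass(frozen=True)
class LinkConfig:
    """Illustrative per-processor link settings (not the real register layout)."""
    inputs: tuple = ()   # directions this processor receives from
    outputs: tuple = ()  # directions this processor sends to


def neighbors(row, col, rows=6, cols=6):
    """Return {direction: (row, col)} for the neighbors that lie inside the array."""
    found = {}
    for direction, (d_row, d_col) in OFFSETS.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < rows and 0 <= c < cols:
            found[direction] = (r, c)
    return found


def validate(config, row, col, rows=6, cols=6):
    """Return True if the configuration respects the stated limits, else raise."""
    present = neighbors(row, col, rows, cols)
    if len(set(config.inputs)) != len(config.inputs):
        raise ValueError("duplicate input direction")
    if len(config.inputs) > MAX_INPUTS:
        raise ValueError("a processor receives from at most two neighbors")
    for direction in (*config.inputs, *config.outputs):
        if direction not in present:
            raise ValueError(f"no {direction} neighbor at ({row}, {col})")
    return True


# A corner processor in a 6x6 array has only two neighbors to talk to.
corner = LinkConfig(inputs=("east",), outputs=("east", "south"))
validate(corner, 0, 0)
```

Because every link ends at an adjacent processor, wire length in this model does not grow with the array size, which is the scalability argument made above.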
AsAP 1.0 Chip
A chip containing 36 (6×6) programmable processors was taped out in May 2005 in 0.18 μm CMOS using a synthesized standard cell technology and is fully functional. Processors on the chip operate at clock rates from 520 MHz to 540 MHz at 1.8 V, and each processor dissipates 32 mW on average while executing applications at 475 MHz. Most processors run at clock rates over 600 MHz at 2.0 V, which makes AsAP among the highest clock rate fabricated processors (programmable or non-programmable) ever designed in a university; it may be the highest ever published. At 0.9 V, the average application power is 2.4 mW at 116 MHz. Each processor occupies only 0.66 mm².
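The figures above can be combined into a few derived quantities. The following Python sketch is a back-of-the-envelope calculation only: it uses the quoted average power, clock rate, and per-processor area, and the function and variable names are hypothetical. The area total covers the 36 processors alone and is not a die size.

```python
PROCESSOR_COUNT = 6 * 6          # 36 processors in the 6x6 array
AREA_PER_PROCESSOR_MM2 = 0.66    # quoted per-processor area


def energy_per_cycle_pj(power_mw, freq_mhz):
    """Average energy per clock cycle in picojoules.

    mW divided by MHz gives nanojoules per cycle; multiply by 1000 for pJ.
    """
    return power_mw / freq_mhz * 1000.0


# Quoted operating points: 32 mW at 475 MHz, and 2.4 mW at 116 MHz (0.9 V).
nominal_pj = energy_per_cycle_pj(32.0, 475.0)
low_voltage_pj = energy_per_cycle_pj(2.4, 116.0)
total_processor_area_mm2 = PROCESSOR_COUNT * AREA_PER_PROCESSOR_MM2

print(f"nominal:     {nominal_pj:.2f} pJ/cycle")
print(f"0.9 V:       {low_voltage_pj:.2f} pJ/cycle")
print(f"ratio:       {nominal_pj / low_voltage_pj:.2f}x")
print(f"array area:  {total_processor_area_mm2:.2f} mm^2 (processors only)")
```

Under these figures, a processor spends roughly 67 pJ per cycle at the 475 MHz operating point and roughly 21 pJ per cycle at 0.9 V, so lowering the supply voltage cuts the energy per cycle by about a factor of three at the cost of a lower clock rate.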
Applications
Many DSP and general tasks have been coded for AsAP. Mapped tasks include filters, convolutional coders, interleavers, sorting, square root, CORDIC sin/cos/arcsin/arccos, matrix multiplication, pseudo-random number generators, Fast Fourier Transforms (FFTs) of lengths 32 to 1024, a complete k=7 Viterbi decoder, a JPEG encoder, and a complete, fully compliant baseband processor for an IEEE 802.11a/11g wireless LAN transmitter. Blocks plug directly together with no modifications required. Power, throughput, and area results are typically many times better than those of existing programmable DSP processors.
The architecture cleanly separates programming from inter-processor timing, which is handled entirely by hardware. A recently completed C compiler and automatic mapping tool further simplify programming.
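The separation of programming from timing can be illustrated in software. The Python sketch below is an analogy only, and it assumes that stages exchange data through bounded blocking queues; the article does not describe the hardware mechanism, and the names `stage`, `run_pipeline`, `END`, and `fifo_depth` are hypothetical. Each thread stands in for one processor running at its own pace, and the task functions contain no timing logic.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker


def stage(func, inbox, outbox):
    """Model one processor: wait for input, apply func, wait for output space."""
    while True:
        item = inbox.get()        # blocks while the input queue is empty
        if item is END:
            outbox.put(END)
            return
        outbox.put(func(item))    # blocks while the output queue is full


def run_pipeline(funcs, data, fifo_depth=2):
    """Chain the task functions with bounded queues and collect the results."""
    fifos = [queue.Queue(maxsize=fifo_depth) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(func, fifos[i], fifos[i + 1]))
        for i, func in enumerate(funcs)
    ]

    def feed():
        for item in data:
            fifos[0].put(item)
        fifos[0].put(END)

    workers.append(threading.Thread(target=feed))
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = fifos[-1].get()
        if item is END:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


# Two independent "blocks" plugged together without modification.
print(run_pipeline([lambda x: x + 1, lambda x: x * 2], range(5)))
```

The result is the same however the threads are scheduled, because each stage simply stalls until data or queue space is available. That property is what lets blocks written separately be connected without adjusting either one.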
References
Baas, Bevan; Yu, Zhiyi; Meeuwsen, Michael; Sattari, Omar; Apperson, Ryan; Work, Eric; Webb, Jeremy; Lai, Michael; Mohsenin, Tinoosh; Truong, Dean; Cheung, Jason (March/April 2007). "AsAP: A Fine-grain Multi-core Platform for DSP Applications". IEEE Micro 27 (2).
Baas, Bevan; Yu, Zhiyi; Meeuwsen, Michael; Sattari, Omar; Apperson, Ryan; Work, Eric; Webb, Jeremy; Lai, Michael; Gurman, Daniel; Chen, Chi; Cheung, Jason; Truong, Dean; Mohsenin, Tinoosh (August 2006). "Hardware and Applications of AsAP: An Asynchronous Array of Simple Processors". In Proceedings of the IEEE HotChips Symposium on High-Performance Chips, (HotChips 2006).
Yu, Zhiyi; Meeuwsen, Michael; Apperson, Ryan; Sattari, Omar; Lai, Michael; Webb, Jeremy; Work, Eric; Mohsenin, Tinoosh; Singh, Mandeep; Baas, Bevan M. (February 2006). "An Asynchronous Array of Simple Processors for DSP Applications". In Proceedings of the IEEE International Solid-State Circuits Conference, (ISSCC '06): 428-429, 663.