Heterogeneous computing
Heterogeneous computing refers to systems that use more than one kind of processor. Such systems gain performance not merely by adding more processors of the same type, but by adding dissimilar processors, usually with specialized processing capabilities to handle particular tasks.[1]
In computing, heterogeneity has usually referred to different instruction set architectures (ISAs): the main processor has one ISA and the other processors have another, usually very different, architecture (possibly more than one), not merely a different microarchitecture. (Floating-point number processing is a special case of this that is not usually referred to as heterogeneous.) ARM big.LITTLE is an exception: its cores share the same ISA, and heterogeneity refers to the differing speeds of different microarchitectures of that ISA, which makes it more like a symmetric multiprocessor (SMP) system.
In the past, heterogeneous computing meant that different ISAs had to be handled differently. A modern example, Heterogeneous System Architecture (HSA),[2] eliminates the difference from the user's point of view. HSA systems use multiple processor types (typically CPUs and GPUs), usually on the same integrated circuit, to combine the strengths of both: the GPU, apart from its well-known 3D graphics rendering capabilities, can also perform mathematically intensive computations on very large data sets, while the CPU runs the operating system and performs traditional serial tasks.
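The division of labor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Task` class, the `data_parallel` flag, and the `choose_device` and `schedule` functions are hypothetical names invented here, not part of HSA or any real runtime, and no work is actually sent to a GPU.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    """Hypothetical unit of work; not an HSA data structure."""
    name: str
    data_parallel: bool  # True if one operation applies independently to many elements


def choose_device(task: Task) -> str:
    """Route data-parallel work to the GPU and serial work to the CPU."""
    return "gpu" if task.data_parallel else "cpu"


def schedule(tasks: List[Task]) -> Dict[str, List[str]]:
    """Group task names by the device they would be placed on."""
    placement: Dict[str, List[str]] = {"cpu": [], "gpu": []}
    for task in tasks:
        placement[choose_device(task)].append(task.name)
    return placement


tasks = [
    Task("parse_config", data_parallel=False),
    Task("matrix_multiply", data_parallel=True),
    Task("handle_syscall", data_parallel=False),
    Task("image_filter", data_parallel=True),
]
placement = schedule(tasks)
print(placement)
```

A real system would base the decision on more than a single flag (data size, transfer cost, device load), but the sketch shows the basic idea of matching each task to the processor type that suits it.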
The level of heterogeneity in modern computing systems is gradually rising as increases in chip area and further scaling of fabrication technologies allow formerly discrete components to become integrated parts of a system-on-chip (SoC). For example, many new processors now include built-in logic for interfacing with other devices (SATA, PCI, Ethernet, USB, RFID, radios, UARTs, and memory controllers), as well as programmable functional units and hardware accelerators (GPUs, cryptography co-processors, programmable network processors, A/V encoders/decoders, etc.).
A 2014 study found that a heterogeneous-ISA chip multiprocessor that exploits the diversity offered by multiple ISAs can outperform the best same-ISA heterogeneous architecture by as much as 21%, with 23% energy savings and a 32% reduction in energy-delay product (EDP).[3] AMD's announcement of its pin-compatible ARM and x86 SoCs, codenamed Project Skybridge, suggests that a heterogeneous-ISA (ARM+x86) chip multiprocessor is in the making.[4]
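The energy-delay product mentioned above is the energy a workload consumes multiplied by the time it takes to run, so a lower value is better. The following is a minimal sketch of how a relative reduction in that metric can be computed. The function names and all input figures are hypothetical, chosen only to keep the arithmetic easy to follow. They are not taken from the cited study and are not meant to reproduce its results.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy consumed times execution time."""
    return energy_joules * delay_seconds


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is lower than `baseline`."""
    return 1.0 - improved / baseline


# Hypothetical measurements for one workload on two designs.
baseline_edp = energy_delay_product(energy_joules=100.0, delay_seconds=10.0)
improved_edp = energy_delay_product(energy_joules=80.0, delay_seconds=9.0)

reduction = relative_reduction(baseline_edp, improved_edp)
print(f"EDP reduction: {reduction:.0%}")
```

Because the metric multiplies the two quantities, a design that saves energy but runs much longer can still end up with a worse energy-delay product.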
Challenges in heterogeneous computing
Heterogeneous computing systems present new challenges not found in typical homogeneous systems.[5] The presence of multiple processing elements raises all of the issues involved with homogeneous parallel processing systems, while the level of heterogeneity in the system can introduce non-uniformity in system development, programming practices, and overall system capability. Areas of heterogeneity can include:[6]
- ISA or instruction set architecture
  - Compute elements may have different instruction set architectures, leading to binary incompatibility.
- ABI or application binary interface
  - Compute elements may interpret memory in different ways. This may include endianness, calling convention, and memory layout, and depends on both the architecture and the compiler being used.
- API or application programming interface
  - Library and OS services may not be uniformly available to all compute elements.[7]
- Low-Level Implementation of Language Features
  - Language features such as functions and threads are often implemented using function pointers, a mechanism that requires additional translation or abstraction when used in heterogeneous environments.
- Memory Interface and Hierarchy
  - Compute elements may have different cache structures and cache coherency protocols, and memory access may be uniform or non-uniform (NUMA). Differences can also be found in the ability to read arbitrary data lengths, as some processors or units can perform only byte, word, or burst accesses.
- Interconnect
  - Compute elements may have differing types of interconnect aside from basic memory/bus interfaces. These may include dedicated network interfaces, direct memory access (DMA) devices, mailboxes, FIFOs, and scratchpad memories. Furthermore, certain portions of a heterogeneous system may be cache-coherent, whereas others may require explicit software involvement to maintain consistency and coherency.
- Performance
  - A heterogeneous system may have CPUs that are identical in terms of architecture but have underlying microarchitectural differences that lead to different levels of performance and power consumption.
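The endianness issue listed under ABI above can be illustrated with Python's standard `struct` module. This is a minimal sketch, not a model of any particular platform: the value and the helper names `to_wire` and `from_wire` are chosen here for illustration. It shows how the same 32-bit integer is laid out in memory under the two byte orders, what happens when one compute element reads the other's bytes without conversion, and how agreeing on a single exchange format avoids the problem.

```python
import struct

value = 0x12345678

# The same 32-bit unsigned integer as stored by a little-endian
# and a big-endian compute element.
little = struct.pack("<I", value)  # bytes 78 56 34 12
big = struct.pack(">I", value)     # bytes 12 34 56 78

# A big-endian element reading little-endian bytes without conversion
# sees a different number.
misread = struct.unpack(">I", little)[0]


def to_wire(v: int) -> bytes:
    """Encode in an agreed exchange format (network byte order, big-endian)."""
    return struct.pack("!I", v)


def from_wire(data: bytes) -> int:
    """Decode from the agreed exchange format."""
    return struct.unpack("!I", data)[0]


print(little.hex(), big.hex(), hex(misread))
print(hex(from_wire(to_wire(value))))
```

Byte order is only one part of an ABI. Differences in calling convention and structure layout need similar explicit agreement or translation between compute elements.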
Example platforms
Heterogeneous computing platforms can be found in every domain of computing, from high-end servers and high-performance computing machines all the way down to low-power embedded devices, including mobile phones and tablets.
- High Performance Computing
  - Cray XD1
  - SRC Computers SRC-6 and SRC-7
- Embedded Systems (DSP and Mobile Platforms)
- Reconfigurable Computing
- Networking
  - Intel IXP Network Processors
  - Netronome NFP Network Processors
- General Purpose Computing, Gaming, and Entertainment Devices
  - Intel Sandy Bridge, Ivy Bridge, and Haswell CPUs
  - AMD APUs
  - IBM Cell, found in the PlayStation 3[8]
  - SpursEngine, a variant of the IBM Cell processor
  - Emotion Engine, found in the PlayStation 2
See also
- GPGPU
References
- Shan, Amar (2006). Heterogeneous Processing: a Strategy for Augmenting Moore's Law. Linux Journal.
- "Heterogeneous System Architecture (HSA) Foundation".
- Venkat, Ashish; Tullsen, Dean M. (2014). Harnessing ISA Diversity: Design of a Heterogeneous-ISA Chip Multiprocessor. Proceedings of the 41st Annual International Symposium on Computer Architecture.
- "AMD Announces Project SkyBridge: Pin-Compatible ARM and x86 SoCs in 2015, Android Support".
- Kunzman, D.M. (2011). Programming Heterogeneous Systems. International Symposium on Parallel and Distributed Processing Workshops.
- Flachs, Brian (2009). Bringing Heterogeneous Processors Into The Mainstream. Symposium on Application Accelerators in High-Performance Computing (SAAHPC).
- Agron, Jason; Andrews, David (2009). Hardware Microkernels for Heterogeneous Manycore Systems. Parallel Processing Workshops, 2009. International Conference on Parallel Processing (ICPPW).
- Gschwind, Michael (2005). A novel SIMD architecture for the Cell heterogeneous chip-multiprocessor. Hot Chips: A Symposium on High Performance Chips.