Puma (microarchitecture)
Produced | From mid-2014 to present |
---|---|
Common manufacturer(s) | |
Max. CPU clock rate | 1.35 GHz to 2.5 GHz |
Min. feature size | 28 nm |
Instruction set | AMD64 (x86-64) |
Cores | 2–4 |
L1 cache | 64 KB per core[1] |
L2 cache | 1 MB to 2 MB shared |
Socket(s) |
|
Predecessor | Jaguar - Family 16h |
GPU | Radeon Rx: 128 cores, 300–800 MHz |
Core name(s) |
|
Brand name(s) |
Puma (Family 16h) is a low-power microarchitecture by AMD for its APUs. It succeeds Jaguar as a second-generation version, targets the same market, and belongs to the same AMD architecture family, Family 16h. The Beema line of processors is aimed at low-power notebooks, and the Mullins line targets the tablet sector.
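Because Jaguar and Puma share Family 16h, software that needs to tell them apart has to look at the model number as well. The sketch below decodes the family and model fields from a CPUID leaf 1 EAX value using the standard x86 extended family/model encoding. The model ranges (00h-0Fh for Jaguar, 30h-3Fh for Puma) are taken from the titles of the AMD revision guides listed under External links; the function names and the sample EAX value are illustrative assumptions, not part of any AMD tool.

```python
def decode_family_model(eax: int) -> tuple[int, int, int]:
    """Decode (family, model, stepping) from a CPUID leaf 1 EAX value."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    if base_family == 0xF:
        # Extended fields only apply when the base family is Fh.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family = base_family
        model = base_model
    return family, model, stepping


def classify_family_16h(family: int, model: int) -> str:
    """Map a family/model pair to a core name (ranges assumed from the revision guide titles)."""
    if family != 0x16:
        return "not Family 16h"
    if 0x00 <= model <= 0x0F:
        return "Jaguar"
    if 0x30 <= model <= 0x3F:
        return "Puma"
    return "unknown Family 16h model"


if __name__ == "__main__":
    sample_eax = 0x00730F01  # hypothetical sample value: family 16h, model 30h, stepping 1
    fam, mod, step = decode_family_model(sample_eax)
    print(f"family {fam:#x}, model {mod:#x}, stepping {step}: {classify_family_16h(fam, mod)}")
```

On Linux the same information is usually visible without issuing CPUID directly, as the decimal `cpu family` and `model` fields in `/proc/cpuinfo` (22 for Family 16h).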
Design
The Puma cores use the same microarchitecture as Jaguar and inherit its design:
- Out-of-order execution and speculative execution, up to 4 CPU cores
- Two-way integer execution
- Two-way 128-bit wide floating-point and packed integer execution
- Integer hardware divider
- Puma does not feature clustered multi-thread (CMT), meaning that there are no "modules"
- Puma does not feature Heterogeneous System Architecture or zero-copy[2]
- 32 KiB instruction + 32 KiB data L1 cache per core
- 1–2 MiB unified L2 cache shared by two or four cores
- Integrated single-channel memory controller supporting 64-bit DDR3L
- 3.1 mm² area per core
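The single-channel, 64-bit memory interface sets a ceiling on memory bandwidth that is easy to work out: transfers per second multiplied by bytes per transfer. The following is a minimal sketch of that arithmetic, assuming the nominal DDR3L transfer rates named in this article (for example DDR3L-1866 taken as 1866 MT/s) and ignoring all protocol overhead; the function name is hypothetical.

```python
def peak_bandwidth_gb_s(transfers_mt_s: float,
                        bus_width_bits: int = 64,
                        channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for a DDR interface."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return transfers_mt_s * bytes_per_transfer * channels / 1000


if __name__ == "__main__":
    # Nominal speed grades mentioned in the article.
    for speed in (1066, 1333, 1600, 1866):
        print(f"DDR3L-{speed}: {peak_bandwidth_gb_s(speed):.1f} GB/s peak")
```

With these assumptions, DDR3L-1866 on one 64-bit channel works out to roughly 14.9 GB/s, which is shared between the CPU cores and the integrated GPU.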
Instruction set support
Like Jaguar, the Puma core has support for the following instruction sets and instructions: MMX, SSE, SSE2, SSE3, SSSE3, SSE4a, SSE4.1, SSE4.2, AVX, F16C, CLMUL, AES, BMI1, MOVBE (Move Big-Endian instruction), XSAVE/XSAVEOPT, ABM (POPCNT/LZCNT), and AMD-V.[1]
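One way to check this list against a running system is to compare it with the feature flags the Linux kernel reports. The sketch below assumes a Linux host and the flag spellings used in `/proc/cpuinfo` (for example `pni` for SSE3, `pclmulqdq` for CLMUL, `abm` for LZCNT, and `svm` for AMD-V); the helper names are illustrative. Note that `svm` may be absent inside a virtual machine or when virtualization is disabled in firmware, so a missing flag is not proof that the silicon lacks the feature.

```python
# Feature name from the article -> flag name as reported in /proc/cpuinfo on Linux.
REQUIRED_FLAGS = {
    "MMX": "mmx",
    "SSE": "sse",
    "SSE2": "sse2",
    "SSE3": "pni",
    "SSSE3": "ssse3",
    "SSE4a": "sse4a",
    "SSE4.1": "sse4_1",
    "SSE4.2": "sse4_2",
    "AVX": "avx",
    "F16C": "f16c",
    "CLMUL": "pclmulqdq",
    "AES": "aes",
    "BMI1": "bmi1",
    "MOVBE": "movbe",
    "XSAVE": "xsave",
    "XSAVEOPT": "xsaveopt",
    "POPCNT": "popcnt",
    "ABM (LZCNT)": "abm",
    "AMD-V": "svm",
}


def missing_features(flags_line: str) -> list[str]:
    """Return the listed features whose flag is absent from a space-separated flags string."""
    present = set(flags_line.split())
    return [name for name, flag in REQUIRED_FLAGS.items() if flag not in present]


def read_cpu_flags(path: str = "/proc/cpuinfo") -> str:
    """Return the flags of the first CPU listed in /proc/cpuinfo (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""


if __name__ == "__main__":
    missing = missing_features(read_cpu_flags())
    print("all listed features present" if not missing else f"missing: {missing}")
```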
Improvements over Jaguar
- 19% CPU core leakage reduction at 1.2 V[3]
- 38% GPU leakage reduction
- 500 mW reduction in memory controller power
- 200 mW reduction in display interface power
- Chassis temperature aware turbo boost[4]
- Selective boosting according to application needs (intelligent boost)
- Support for ARM TrustZone via integrated Cortex-A5 processor
- Support for DDR3L-1866 memory[5]
Puma+
AMD released a revision of the Puma core, Puma+, as part of the Carrizo-L platform in 2015. The differences in the CPU microarchitecture are unclear. Puma+ featured 2 or 4 cores clocked at up to 2.5 GHz and required the newer FP4 socket.[6]
Features and ASICs
Brand | Llano | Trinity | Richland | Kaveri | Carrizo | Bristol Ridge | Raven Ridge | Desna, Ontario, Zacate | Kabini, Temash | Beema, Mullins | Carrizo-L | Stoney Ridge | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Platform | Desktop, Mobile | Mobile | Desktop, Mobile | Ultra-mobile | |||||||||
Released | Aug 2011 | Oct 2012 | Jun 2013 | Jan 2014 | Jun 2015 | Jun 2016 | TBA | Jan 2011 | May 2013 | Q2 2014 | May 2015 | Jun 2016 | |
Fab. (nm) | GlobalFoundries 32 SOI | 28 | 14 | TSMC 40 | 28 | ||||||||
Die size (mm²) | 228 | 246 | 245 | 244.62 | 250.04 | TBA | 75 (+ 28 FCH) | ~107 | TBA | 125 | |||
Socket | FM1, FS1 | FM2, FS1+, FP2 | FM2+, FP3 | FM2+[lower-alpha 1], FP4 | AM4, FP4 | AM4, FP5 | FT1 | AM1, FT3 | FT3b | FP4 | FP4 | ||
CPU architecture | AMD 10h | Piledriver | Steamroller | Excavator | Zen | Bobcat | Jaguar | Puma | Puma+[7] | Excavator | |||
Memory support | DDR3-1866 DDR3-1600 DDR3-1333 | DDR3-2133 DDR3-1866 DDR3-1600 DDR3-1333 | DDR4-2400 DDR4-2133 DDR4-1866 DDR4-1600 | DDR3L-1333 DDR3L-1066 | DDR3L-1866 DDR3L-1600 DDR3L-1333 DDR3L-1066 | DDR3L-1866 DDR3L-1600 DDR3L-1333 | Up to DDR4-2133 | ||||||
3D engine[lower-alpha 2] | TeraScale (VLIW5) | TeraScale (VLIW4) | GCN 2nd Gen (Mantle, HSA) | GCN 3rd Gen (Mantle, HSA) | GCN 5th Gen[8] (Mantle, HSA) | TeraScale (VLIW5) | GCN 2nd Gen | GCN 3rd Gen[8] | |||||
Up to 400:20:8 | Up to 384:24:6 | Up to 512:32:8 | TBA | 80:8:4 | 128:8:4 | Up to 192:?:? | |||||||
IOMMUv1 | IOMMUv2 | IOMMUv1[9] | TBA | TBA | |||||||||
Unified Video Decoder | UVD 3 | UVD 4.2 | UVD 6 | TBA | UVD 3 | UVD 4 | UVD 4.2 | UVD 6 | UVD 6.3 | ||||
Video Coding Engine | N/A | VCE 1.0 | VCE 2.0 | VCE 3.1 | TBA | N/A | VCE 2.0 | VCE 3.1 | |||||
GPU power saving | PowerPlay | PowerTune | N/A | PowerTune[10] | |||||||||
Max. displays[lower-alpha 3] | 2–3 | 2–4 | 2–4 | 3 | 4 | TBA | 2 | TBA | TBA | ||||
TrueAudio | N/A | [12] | N/A[9] | TBA | |||||||||
FreeSync | N/A | N/A | TBA | ||||||||||
/drm/radeon[13][14] | N/A | N/A | |||||||||||
/drm/amdgpu[15] | N/A | [16] | N/A | [16] |
- ↑ No APU models. Athlon X4 845 only.
- ↑ Unified shaders : texture mapping units : render output units
- ↑ To feed more than two displays, the additional panels must have native DisplayPort support.[11] Alternatively, active DisplayPort-to-DVI/HDMI/VGA adapters can be used.
Processors
Desktop/Mobile (Beema)
Family | Model | Socket | Cores | Frequency | Max. Turbo | L2 Cache | GPU Model | GPU Config. | GPU Max. Freq. | TDP | Memory |
---|---|---|---|---|---|---|---|---|---|---|---|
A8 | 6410 | Socket FT3b | 4 | 2.0 GHz | 2.4 GHz | 2 MB | Radeon R5 | 128:?:? | 800 MHz | 15 W | DDR3L-1866 |
A6 | 6310 | Socket FT3b | 4 | 1.8 GHz | 2.4 GHz | 2 MB | Radeon R4 | 128:?:? | 800 MHz | 15 W | DDR3L-1866 |
A4 | 6250J | Socket FT3b | 4 | 2.0 GHz | N/A | 2 MB | Radeon R3 | 128:?:? | 600 MHz | 25 W | DDR3L-1600 |
A4 | 6210 | Socket FT3b | 4 | 1.8 GHz | N/A | 2 MB | Radeon R3 | 128:?:? | 600 MHz | 15 W | DDR3L-1600 |
E2 | 6110 | Socket FT3b | 4 | 1.5 GHz | N/A | 2 MB | Radeon R2 | 128:?:? | 500 MHz | 15 W | DDR3L-1600 |
E1 | 6010 | Socket FT3b | 2 | 1.35 GHz | N/A | 1 MB | Radeon R2 | 128:?:? | 350 MHz | 10 W | DDR3L-1333 |
Tablet (Mullins)
Family | Model | Cores | Frequency | Max. Turbo | L2 Cache | GPU Model | GPU Config. | GPU Max. Freq. | TDP | SDP | Memory |
---|---|---|---|---|---|---|---|---|---|---|---|
A10 Micro | 6700T | 4 | 1.2 GHz | 2.2 GHz | 2 MB | Radeon R6 | 128:?:? | 500 MHz | 4.5 W | 2.8 W | DDR3L-1333 |
A6 Micro | 6500T | 4 | 1.2 GHz | 1.8 GHz | 2 MB | Radeon R4 | 128:?:? | 401 MHz | 4.5 W | 2.8 W | DDR3L-1333 |
A4 Micro | 6400T | 4 | 1.0 GHz | 1.6 GHz | 2 MB | Radeon R3 | 128:?:? | 350 MHz | 4.5 W | 2.8 W | DDR3L-1333 |
E1 Micro | 6200T | 2 | 1.0 GHz | 1.4 GHz | 1 MB | Radeon R2 | 128:?:? | 300 MHz | 3.95 W | 2.8 W | DDR3L-1066 |
References
- [1] "Software Optimization Guide for Family 16h Processors". AMD. Retrieved 2013-08-03.
- [2] "AMD launches new Beema, Mullins SoCs". ExtremeTech. 2014-04-29. Retrieved 2014-05-02.
- [3] Shimpi, Anand. "AMD Beema/Mullins Architecture & Performance Preview". AnandTech. Retrieved 2014-04-29.
- [4] Shimpi, Anand. "New Turbo Boost, The Lineup and Trustzone". AnandTech. Retrieved 2014-04-29.
- [5] Woligroski, Don. "Meet The Mullins And Beema Tablet APUs". Tom's Hardware. Retrieved 2014-04-29.
- [6] Cutress, Ian (2015-05-12). "AMD's Carrizo-L APU Unveiled". AnandTech. Retrieved 2017-01-14.
- [7] "AMD Mobile “Carrizo” Family of APUs Designed to Deliver Significant Leap in Performance, Energy Efficiency in 2015" (Press release). 2014-11-20. Retrieved 2015-02-16.
- [8] "AMD VEGA10 and VEGA11 GPUs spotted in OpenCL driver". VideoCardz.com. Retrieved 2017-06-06.
- [9] De Maesschalck, Thomas (2013-11-14). "AMD teases Mullins and Beema tablet/convertibles APU". Retrieved 2015-02-24.
- [10] Chen, Tony; Greaves, Jason. "AMD's Graphics Core Next (GCN) Architecture" (PDF). AMD. Retrieved 2016-08-13.
- [11] "How do I connect three or More Monitors to an AMD Radeon™ HD 5000, HD 6000, and HD 7000 Series Graphics Card?". AMD. Retrieved 2014-12-08.
- [12] "A technical look at AMD’s Kaveri architecture". SemiAccurate. Retrieved 2014-07-06.
- [13] Airlie, David (2009-11-26). "DisplayPort supported by KMS driver mainlined into Linux kernel 2.6.33". Retrieved 2016-01-16.
- [14] "Radeon feature matrix". freedesktop.org. Retrieved 2016-01-10.
- [15] Deucher, Alexander (2015-09-16). "XDC2015: AMDGPU" (PDF). Retrieved 2016-01-16.
- [16] Dänzer, Michel (2016-11-17). "[ANNOUNCE] xf86-video-amdgpu 1.2.0". lists.x.org.
External links
- Software Optimization Guide for Family 16h Processors
- 2014 AMD Low-Power Mobile APUs
- Jaguar presentation (video) at ISSCC 2013
- Discussion initiated on RWT forums by Jeff Rupley, Chief Architect of the Jaguar core
- BKDG for Family 16h Models 00h-0Fh Processors
- Revision Guide for Family 16h Models 00h-0Fh Processors (Jaguar)
- Revision Guide for Family 16h Models 30h-3Fh Processors (Puma)