AMD K8L
From Wikipedia, the free encyclopedia
The AMD K8L microarchitecture is the immediate successor to AMD's K8 series of processors (Athlon 64, Opteron and Sempron 64, which share technologies with the Socket S1 Turion 64 processors). It implements the AMD64 and IA-32 instruction sets.
Nomenclature
The name K8L is used by the wider IT community as a convenient shorthand along with K10 and Stars,[1] while according to AMD official documents, it is termed "AMD Next Generation Processor Technology".[2] It was reported that the codename K8L actually referred to a low-power version of the K8 chip, later named Turion 64, and that K10 was the official internal codename for the microarchitecture.[3]
Schedule of launch and delivery
Timeline
On April 13, 2006, Henri Richard, AMD executive vice president and chief officer for marketing and sales, acknowledged[4] the existence of the new microarchitecture in an interview.
On July 21, 2006, AMD President and COO Dirk Meyer and Senior VP Marty Seyer confirmed that the launch of the new Revision H microprocessors under the new microarchitecture is slated for mid-2007, and that the lineup will include a quad-core version for servers, workstations and high-end desktops, as well as a dual-core version for consumer desktops. Some of the Revision H Opterons shipped in 2007 will have a thermal design power of 68 W.
On August 15, 2006, at the launch of the first Socket F (also known as Socket 1207) dual core Opterons, AMD announced that the firm has reached the final design stage (tape-out) of quad-core Opteron parts, codenamed Deerhound. The next stages are testing and validation, with sampling to follow after several months.[5]
As of November 2006, leaked reports gave the codenames of the upcoming desktop parts as Agena and Agena FX,[6] with core speeds ranging from 2.4 GHz to 2.9 GHz, 512 KiB of L2 cache per core, 2 MiB of L3 cache, HyperTransport 3.0 and a TDP of 125 W.[7] More recent reports describe single-core variants (codenamed Spica) and dual-core variants with or without L3 cache (codenamed Kuma and Rana respectively)[8] under the same microarchitecture.[9]
During the AMD Analyst Day 2006 on December 14, 2006, AMD announced its official timeline for server, desktop and mobile processors.[10] For the server segment, AMD will unveil two new processors based on the architecture, codenamed "Barcelona" and "Budapest", for multi-way (more than 1-way) and 1-way servers respectively.[10] Desktops will see an overhaul of the entire processor lineup. Lima, a 65 nm update of the single-core Athlon 64, will arrive in Q1 2007, while Sparta, the 65 nm Sempron update, will come in Q2 2007. In the second half of 2007, HyperTransport 3.0 and Socket AM2+ will be unveiled; both are designed specifically for the aforementioned consumer quad-core desktop chip series, whose naming convention changes from city names (up to mid-2007) to stars and constellations (after mid-2007), such as Agena. In addition, the 4x4 platform and its immediate successor will support the high-end enthusiast DP versions of the chip, such as Agena FX.[11] As with the Barcelona server chips, the new desktop quad-core series will feature a shared L3 cache, 128-bit floating-point (FP) units and an enhanced microarchitecture. Agena, the native quad-core processor for the desktop, will become the Athlon 64 X4, and a special version called Agena FX will update the Athlon FX line for the AMD Quad FX platform. Kuma, a dual-core variant, will follow in Q3, while Rana, the dual-core version with no shared L3 cache, is expected at the end of the year.[12]
Processor model information has been reported as follows:[13]
Model | Clock rate | Codename | TDP |
---|---|---|---|
Athlon 64 X2 1900 | 1.9 GHz | Kuma | 65 W |
Athlon 64 X2 2100 | 2.1 GHz | Kuma | 65 W |
Athlon 64 X2 2300 | 2.3 GHz | Kuma | 65 W |
Athlon 64 X2 2500 | 2.5 GHz | Kuma | 89 W |
Athlon 64 X2 2700 | 2.7 GHz | Kuma | 89 W |
Athlon 64 X2 2900 | 2.9 GHz | Kuma | 89 W |
Athlon 64 X4 1900 | 1.9 GHz | Agena | 95 W |
Athlon 64 X4 2100 | 2.1 GHz | Agena | 95 W |
Athlon 64 X4 2300 | 2.3 GHz | Agena | 120 W |
Athlon 64 X4 2500 | 2.5 GHz | Agena | 120 W |
Unknown Athlon 64 FX | 2.5 GHz | Agena FX | 120 W |
Opteron 1266 | 2.1 GHz | Barcelona | 95 W |
Opteron 1268SE | 2.3 GHz | Barcelona | 120 W |
Opteron 1270SE | 2.5 GHz | Barcelona | 120 W |
Live demonstrations
On November 30, 2006, AMD gave the first public live demonstration of the native quad-core chip known as "Barcelona",[14] running Windows Server 2003 64-bit Edition. AMD claims 70% scaling of performance in real-world loads, and better performance than the Intel Xeon 5355 (Clovertown).[15] More details of this first revision of the next-generation AMD microprocessor design, including clock speeds, have since surfaced on the web.[16][17]
On January 24, 2007, AMD Executive VP Randy Allen claimed that in live tests across a wide variety of workloads, "Barcelona" demonstrated a 40% performance advantage over the comparable Intel Clovertown DP quad-core chips.[18] Per-core floating-point performance is expected to be approximately 1.8 times that of the K8 family at the same clock speed.[19]
Sister microarchitecture
Also due in a similar timeframe is a sister microarchitecture, which will focus on lower-power chips for mobile platforms as well as small-form-factor features. This microarchitecture will contain specialized features such as separate power planes for the cores and other on-die components; a mobile-optimized crossbar switch and memory controller; link power management for HyperTransport 3.0; and others. At this time, AMD has simply dubbed it "New Mobile Core", without giving a specific codename. This represents a change in design philosophy away from a single x86 microarchitecture covering everything from servers to laptops, making it easier to differentiate designs between platforms.
At the December 2006 Analyst Day, Executive VP Marty Seyer announced that the new mobile core, code-named "Griffin", is to be launched in 2008.[20]
Iterations of the release
Between late 2007 and Q2 2008, a modified core fabricated on the 45 nm node is due,[21] with enhancements such as FB-DIMM support, Direct Connect Architecture 2.0, enhanced RAS, and probably further changes to the processor die. The K8L platform will also add support for I/O virtualization, PCI Express 2.0, 10 Gigabit NIC, and more.
However, reports have suggested that FB-DIMM support has been dropped from the future roadmaps of the majority of AMD products because of its low popularity.[22][23] FB-DIMM's future as an industry standard has also been called into question.
A recent article published by The Inquirer corroborates the earlier reports of the timeline (as cited in this article). According to this report, there will be three iterations of the core: the first, Barcelona, due in Q2 2007, has new CPU core components and microarchitecture but is built on the old HyperTransport 2.0 infrastructure; the second, Budapest, is for single-socket systems (single sockets are AM2/AM3) and has HyperTransport 3.0; and the third, codenamed Shanghai, is an update of the server chip based on a 45 nm process,[24] probably also with HyperTransport 3.0 and a DDR3 implementation, due in Q1-Q2 2008.[25]
Probable features
Fabrication technology
AMD will introduce the microprocessors manufactured at 65 nm feature width using Silicon-on-insulator (SOI) technology, since the release of K8L coincides with the volume ramp of this manufacturing process.[26] The servers will be produced for Socket F (1207) infrastructure, the only server socket on AMD's near-term roadmap; the desktop parts will come on Socket AM2 or Socket AM2+.
AMD announced during the 2006 Technology Analyst Day that the use of CTT and STI will finally lead to the implementation of Silicon-Germanium-on-Insulator (SGOI) on 65 nm process CPUs.[27]
Supported DRAM standards
The K8 family is particularly sensitive to memory latency, since its design gains performance by minimizing latency through an on-die memory controller (integrated into the CPU); increased latency in the external modules negates the usefulness of this feature. DDR2 introduces some additional latency over traditional DDR, since the DRAM is internally driven by a clock at one quarter of the external data frequency, as opposed to one half for DDR. However, because the command clock rate in DDR2 is doubled relative to DDR and other latency-reducing features (e.g. additive latency) have been introduced, comparisons based on CAS latency alone are not sufficient. For example, Socket AM2 processors using DDR2 SDRAM are known to perform similarly to Socket 939 processors using DDR-400 SDRAM. K8L processors are expected to support DDR2 SDRAM rated up to DDR2-1066.[28]
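The point about CAS latency can be made concrete with a little arithmetic. The sketch below converts a CAS latency given in command-clock cycles into nanoseconds, assuming the command clock is half the data rate. The module timings used (CL2, CL4, CL5) are illustrative assumptions rather than figures from the cited reports, and the function name is hypothetical.

```python
def cas_latency_ns(data_rate_mt_s: float, cas_cycles: float) -> float:
    """Absolute CAS latency in nanoseconds.

    DDR and DDR2 both transfer data twice per command-clock cycle, so the
    command clock in MHz is taken as half the data rate in MT/s.
    """
    command_clock_mhz = data_rate_mt_s / 2
    cycle_time_ns = 1000.0 / command_clock_mhz
    return cas_cycles * cycle_time_ns


# Illustrative timings only (assumed, not taken from the article).
examples = [
    ("DDR-400 CL2", 400, 2),
    ("DDR2-800 CL4", 800, 4),
    ("DDR2-1066 CL5", 1066, 5),
]

for label, data_rate, cl in examples:
    print(f"{label}: {cas_latency_ns(data_rate, cl):.2f} ns")
```

Under these assumptions, a DDR2 module with twice the CAS cycle count of a DDR module, but twice the command clock, ends up with the same absolute latency, which is why the cycle count alone is a poor basis for comparison.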
Higher computational throughput
Several sources (such as AnandTech, The Inquirer and Geek.com) also reported that microprocessors implementing the microarchitecture will feature a doubling in the width of the SSE execution units in the cores. Together with major improvements in the memory subsystem (such as load re-ordering and improved prefetch mechanisms) and the doubled instruction fetch and load width, this is expected to make the processor better suited to scientific and high-performance computing tasks and potentially improve its competitiveness with Intel's Xeon, Core 2, Itanium 2 and other contemporary microprocessors.
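As a rough illustration of what doubling the SSE unit width means, the sketch below counts how many passes a 128-bit packed operation needs through an execution unit of a given width, and how many elements fit in one pass. This is a deliberately simplified model with hypothetical function names: it ignores instruction latency, issue width, scheduling and memory bandwidth, so it does not predict real-world speedups.

```python
def passes_per_op(operand_bits: int, unit_bits: int) -> int:
    """Trips through the execution unit for one packed operation (ceiling division)."""
    return -(-operand_bits // unit_bits)


def elements_per_pass(unit_bits: int, element_bits: int) -> int:
    """Number of elements of the given size processed in a single pass."""
    return unit_bits // element_bits


SSE_OPERAND_BITS = 128  # width of one packed SSE operand

for label, unit_bits in (("64-bit unit", 64), ("128-bit unit", 128)):
    passes = passes_per_op(SSE_OPERAND_BITS, unit_bits)
    doubles = elements_per_pass(unit_bits, 64)
    singles = elements_per_pass(unit_bits, 32)
    print(f"{label}: {passes} pass(es) per 128-bit op, "
          f"{doubles} double(s) or {singles} single(s) per pass")
```

In this model a 64-bit unit needs two passes per 128-bit operation while a 128-bit unit needs one, which is the sense in which the wider units double peak packed throughput per unit.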
Many of the improvements in computational throughput of each core are listed in the section below.
Characteristics of the microarchitecture
- Form factors
- Socket AM2+ for Athlon 64 X2 and Athlon 64 X4 processors as well as single-socket Opterons, and Socket F+ for Athlon 64 FX processors and multi-socket Opterons, supporting HyperTransport 3.0 with the use of DDR2 DIMMs.[30]
- Backward-compatible with existing Socket AM2 and Socket F motherboards.
- Instruction set additions and extensions
- New bit-manipulation instructions: Leading Zero Count (LZCNT) and Pop Count (POPCNT)
- New SSE instructions named SSE4A: combined mask-shift instructions (EXTRQ/INSERTQ) and scalar streaming store instructions (MOVNTSD/MOVNTSS)
- Support for unaligned SSE load-operation instructions (which formerly required 16-byte alignment)[31]
- Execution pipeline enhancements
- 128-bit wide SSE units
- Wider L1-D interface allowing for two 128-bit loads per cycle (as opposed to two 64-bit loads per cycle with K8)
- Lower integer divide latency
- 512-entry indirect branch predictor and a larger return stack (size doubled from K8) and branch target buffer
- Side-Band Stack Optimizer, dedicated to incrementing and decrementing the stack pointer register
- Fastpathed CALL and RET-Imm instructions (formerly microcoded) as well as MOVs from SIMD registers to general purpose registers
- Integration of new technologies onto CPU die:
- Four processor cores (Quad-core)
- Split power planes, first dubbed Dynamic Independent Core Engagement or D.I.C.E. by AMD and now known as Enhanced Cool 'n' Quiet, allowing the cores and northbridge (integrated memory controller) to scale power consumption up or down independently.[32]
- Improvements in the memory subsystem:
- Improvements in access latency:
- Support for re-ordering loads ahead of other loads and stores
- More aggressive instruction prefetching (32 bytes as opposed to 16 bytes in K8)
- DRAM prefetcher for buffering reads
- Buffered burst writeback to RAM in order to reduce contention
- Changes in memory hierarchy:
- Prefetch directly into L1 cache (as opposed to L2 cache with K8)
- 32-way set associative L3 victim cache sized at least 2 MiB, shared between processing cores on a single die (each with 512 KiB of independent exclusive L2 cache), with a sharing-aware replacement policy.
- Extensible L3 cache design, with 6 MiB planned for the 45 nm node in the chips codenamed "Shanghai".
- Changes in address space management:
- Two independent 64-bit memory controllers, each with its own physical address space; this provides an opportunity to better utilize the available bandwidth when random memory accesses occur in heavily multi-threaded environments. This approach is in contrast to the previous "interleaved" design, where the two 64-bit data channels were bound to a single common address space.
- Larger TLBs; support for 1 GiB page entries and a new 128-entry 2 MiB page TLB
- 48-bit memory addressing to allow for 256 TiB memory subsystems
- Memory mirroring, data poisoning support and Enhanced RAS
- Nested page tables for AMD-V, decreasing world switch time by 25%.
- Improvements in system interconnect:
- HyperTransport retry support
- Support for HyperTransport 3.0, with HyperTransport Link unganging which creates 8 point-to-point links per socket.
- Platform-level enhancements with additional functionality:
- Split power planes for CPU core and memory controller/northbridge for more effective power management
- Five p-states allowing for automatic clock rate modulation
- Increased clock gating
- Official support for coprocessors via HTX slots and vacant CPU sockets through HyperTransport (the Torrenza initiative).
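The two bit-manipulation instructions listed under the instruction set additions above, LZCNT and POPCNT, can be illustrated with a small software reference model. The sketch below is not how the hardware implements them; it only shows the results the instructions are defined to produce for a given operand width, and the Python function names and the `width` parameter are hypothetical.

```python
def popcnt(value: int, width: int = 64) -> int:
    """Population count: number of set bits in the low `width` bits of value."""
    mask = (1 << width) - 1
    return bin(value & mask).count("1")


def lzcnt(value: int, width: int = 64) -> int:
    """Leading zero count within a `width`-bit operand.

    A zero operand yields `width`, matching the instruction's defined
    behaviour of returning the operand size for a zero source.
    """
    mask = (1 << width) - 1
    return width - (value & mask).bit_length()


print(popcnt(0b1011))       # three bits set
print(lzcnt(1, width=32))   # 31 leading zeros in a 32-bit operand
print(lzcnt(0, width=64))   # zero source gives the operand size
```

Operations like these are common in bitboard, hashing and compression code, where a single instruction replaces a loop over the bits.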
Media discussions
Note: These media discussions are sorted by publication date in ascending order.
- "AMD CTO speaks about future AMD technologies", AnandTech, October 14, 2005.
- "AMD outlines Future Goals (mostly non-specific at this time)", TechReport, October 17, 2005.
- "AMD eyes Z-RAM for dense caches", CNet News.com, January 20, 2006.
- "AMD licenses Z-RAM", SlashDot, January 21, 2006.
- "AMD's K8L to double FPU units in 2007", Geek.com, February 24, 2006.
- "Rev G. and H. AMD64 chips Preliminary information", The Inquirer, March 3, 2006.
- "Interview with Henri Richard (Part 2)", DigiTimes, March 14, 2006.
- "AMD demonstrates Hardware Coprocessor Offload", LinuxElectrons, March 20, 2006.
- "Implementation of FPGA through coherent HTT", The Inquirer, March 26, 2006.
- "AMD's K8L 65 nm core due H1 07", Reg Hardware, April 4, 2006.
- "An AMD Update: Fab 36 Begins Shipments, Planning for 65 nm and AM2 Performance", AnandTech, April 4, 2006.
- "Fab36 substantially converted to 65 nm by mid-2007", AnandTech, April 4, 2006.
- "AMD shows off details of K8L", The Inquirer, May 16, 2006.
- "AMD's K8L and 4x4 Preview", RealWorldtech, June 2, 2006.
- "AMD K8L and 4X4 Technologies", ArsTechnica, June 2, 2006.
- "AMD Quad-Core K8L & 4x4 Details", Pure OverClock, June 3, 2006.
- "Socket AM2 Forward Compatible With AM3 CPUs", DailyTech, July 6, 2006.
- "K8L on schedule, due for release as early as Q1 07", The Inquirer, July 11, 2006.
- "GNU binutils support for the new K8L instructions", SourceWare.org, July 13, 2006.
- "AMD Executives Confirm K8L to Arrive in Mid-2007", X-bit labs, July 21, 2006.
- "AMD To Demo K8L By Year End", moneycontrol.com, July 23, 2006.
- "AMD intros new Opterons and promises 68 W quad-core CPUs", tgdaily.com, August 15, 2006.
- "Next-Generation AMD Opteron Paves The Way For Quad-Core", crn.com, August 15, 2006.
- "AMD's Next Generation Microarchitecture Preview: from K8 to K8L", X-bit labs, August 21, 2006.
- "AMD quad cores: the whole story unfolded", The Inquirer, September 16, 2006.
- "AMD reinvents the x86", InfoWorld, February 7, 2007.
References
- ^ Valich, Theo. "AMD explains K8L misnomer", The Inquirer. Retrieved on 2007-03-16.
- ^ Official Announcement of "AMD Next Generation Processor Technology"
- ^ The Inquirer report
- ^ Hall, Chris. Re-defining microprocessors: Q&A with AMD’s Henri Richard. DigiTimes.com. Retrieved on 2007-03-18.
- ^ "Next-Generation AMD Opteron Paves The Way For Quad-Core", crn.com, 15 August 2006.
- ^ "AMD processor roadmaps for 2007", Tracking AMD, 31 December 2006.
- ^ "AMD Quad-Core Altair upcoming in 2007 Q3", HKEPC, 3 October 2006.
- ^ "AMD processor roadmaps for 2007", Tracking AMD, 31 December 2006.
- ^ "AMD to enter K8L era in 2H 2007", HKEPC, 4 October 2006.
- ^ a b "06A-DayMartySeyer.pdf 2006 Analyst Day Slides (Roadmaps for server and mobile)", AMD.
- ^ "AMD processor roadmaps for 2007", Tracking AMD, 31 December 2006.
- ^ "AMD processor roadmaps for 2007", Tracking AMD, 31 December 2006.
- ^ Pullen, Dean. "Further AMD next-gen specs roll out", The Inquirer. Retrieved on 2007-03-16.
- ^ "AMD Demonstrates Its Quad Core Server Chips", CNET.com, 30 November 2006.
- ^ "AMD Demonstrates Barcelona; The First True, Native Quad Core Opteron", legitreviews.com, 30 November 2006.
- ^ "Quick Look at AMD Quad Core Barcelona", arstechnica.com.
- ^ The Inquirer article
- ^ "AMD Expects Quad Core Barcelona to Outperform Clovertown by 40%", dailytech.com, 25 January 2007.
- ^ "Go to 'Barcelona' over 'Cloverton'", CNET.com, 23 January 2007.
- ^ "AMD updates Opteron, Turion roadmaps", informationweek.com, 14 December 2006.
- ^ "AMD Outlines Quad Core Computing", www.pcpro.co.uk, 19 September 2006.
- ^ "Intel Pulls Back from FB-DIMM", inquirer.net, 7 September 2006.
- ^ "No Shocker Here", legitreviews.com, 15 September 2006.
- ^ DailyTech report
- ^ "AMD Quad Cores: The Whole Story Unfolded", inquirer.net, 16 September 2006.
- ^ "An AMD Update: Fab 36 Begins Shipments, Planning for 65 nm and AM2 Performance", AnandTech, April 4, 2006.
- ^ Ostrander, Daryl. 2006 Technology Analyst Day. Advanced Micro Devices. Retrieved on 2007-03-19.
- ^ "AMD’s next-generation Star supports DDR2-1066 & SSE4A", HKEPC Hardware. Retrieved on 2007-03-19.
- ^ Shimpi, Anand. "Barcelona Architecture: AMD on the Counterattack", AnandTech. Retrieved on 2007-03-18.
- ^ "AMD Quad-Core Altair upcoming in 2007 Q3", HKEPC, 3 October 2006.
- ^ Case, Loyd. "AMD Unveils Barcelona Quad-Core Details", Ziff Davis. Retrieved on 2007-03-18.
- ^ "AMD Next Generation Processor Technology Slides", HardOCP, 22 August 2006.
External links
- AMD Official Website
- AMD Quad-core processors introduction
- Slides of AMD 2006 Technology Analyst Day: Official Introduction of K8L Microarchitecture (PDF file)
- Next-Generation AMD Opteron™ Processors Introduced with Record OEM Design Wins and Native Quad-Core Upgrade Path (Official AMD press release on 15 August 2006)