Radeon R300
From Wikipedia, the free encyclopedia
The Radeon "R300" architecture, introduced in August 2002, was ATI's first DirectX 9.0 graphics processing unit design. R300 was the world's first fully DirectX 9 capable consumer graphics chip, and it outperformed its initial competitors by a margin not seen since the GeForce 256 in 1999. R300 and its derivatives formed the basis of ATI's consumer and professional product lines for over three years.
Its integrated version is Xpress 200. The main difference is that it uses the motherboard's memory instead of its own video RAM.
Development
ATI had held the lead for a while with the Radeon 8500, but NVIDIA retook the performance crown with the launch of the GeForce4 Ti line. A new high-end refresh part, the 8500XT (R250), was supposedly in the works, ready to compete against NVIDIA’s high-end offerings, particularly the top-of-the-line Ti 4600. Pre-release information listed a 300 MHz core and RAM clock speed for the "R250" chip. ATI, perhaps mindful of what had happened to 3dfx when they took focus off their "Rampage" processor, abandoned it in favor of finishing their next-generation R300 card.
The R3xx chip was designed by ATI's west coast team (formerly ArtX Inc.), and the first product to use it was the Radeon 9700 PRO (internal ATI code name: R300; internal ArtX code name: Khan), launched in August 2002. The architecture of R300 differed from its predecessor, Radeon 8500 ("R200"), in nearly every way. The core of the 9700 PRO was manufactured on a 150 nm chip fabrication process, as was the Radeon 8500. However, refined design and manufacturing techniques enabled a doubling of transistor count and a significant clock speed gain. One major change in the manufacturing of the core was the use of flip chip packaging, a technology not previously used on video cards. Flip chip packaging allows far better cooling of the die by flipping it and exposing it directly to the cooling solution, which makes higher clock speeds easier to achieve. Radeon 9700 PRO launched at 325 MHz, ahead of the originally projected 300 MHz. With a transistor count of 110 million, it was the largest and most complex GPU of the time. A slower chip, the 9700, was launched a few months later, differing only in its lower core and memory speeds. Notably, the Radeon 9700 PRO was clocked significantly higher than the Matrox Parhelia 512, a card released only months before R300 and considered to be the pinnacle of graphics chip manufacturing (with 80 million transistors at 220 MHz) until R300's arrival.
Architecture
The chip adopted an architecture consisting of 8 pixel pipelines, each with 1 texture mapping unit (an 8x1 design). While this differed from the older chips using 2 texture units per pipeline, this did not mean "R300" could not perform multitexturing as efficiently as older chips. Its texture units could perform a new "loopback" operation which allowed them to sample up to 16 textures per geometry pass. The textures can be any combination of one, two, or three dimensions with bilinear, trilinear, or anisotropic filtering. This was part of the new DirectX 9 specification, along with more flexible floating-point-based Shader Model 2.0+ pixel shaders and vertex shaders. Equipped with 4 vertex shader units, "R300" possessed over twice the geometry processing capability of its predecessor and the GeForce4 Ti 4600, in addition to the greater feature-set offered compared to DirectX 8 shaders.
A noteworthy limitation is that all "R300"-generation chips were designed for a maximum floating-point precision of 96 bits per pixel (24 bits per channel, or FP24), instead of DirectX 9's maximum of 128 bits (FP32). DirectX 9.0 specified FP24 as the minimum level for conforming to the specification for full precision. This tradeoff in precision offered the best combination of transistor usage and image quality for the manufacturing process at the time. It did cause a usually imperceptible loss of quality when doing heavy blending. ATI's Radeon chips did not go above FP24 until R520. ATI demonstrated part of what was possible with Pixel Shader 2.0 in their Rendering with Natural Light demo. The demo was a real-time implementation of noted 3D graphics researcher Paul Debevec's paper on the topic of high dynamic range rendering.[1]
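The size of the FP24 versus FP32 precision gap can be illustrated numerically. The following is only a minimal sketch, not a model of the hardware: it assumes an FP24 layout of 1 sign bit, 7 exponent bits, and 16 mantissa bits (a layout not stated in this article), emulates only the shorter mantissa by truncating an IEEE 754 single-precision value, and ignores the difference in exponent range. The function names are hypothetical.

```python
import struct

FP32_MANTISSA_BITS = 23
FP24_MANTISSA_BITS = 16  # ASSUMPTION: 1 sign + 7 exponent + 16 mantissa bits


def truncate_mantissa(value: float, keep_bits: int = FP24_MANTISSA_BITS) -> float:
    """Round a float32 value toward zero so only `keep_bits` mantissa bits remain.

    Only the mantissa width is emulated; the narrower FP24 exponent range
    is deliberately ignored in this sketch.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = FP32_MANTISSA_BITS - keep_bits
    # Clear the lowest `drop` mantissa bits; sign and exponent are untouched.
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF
    (result,) = struct.unpack("<f", struct.pack("<I", bits))
    return result


def relative_step(mantissa_bits: int) -> float:
    """Spacing between adjacent representable values just above 1.0."""
    return 2.0 ** -mantissa_bits


if __name__ == "__main__":
    x = 1.0 / 3.0
    print("value          :", x)
    print("16-bit mantissa:", truncate_mantissa(x))
    print("step near 1.0  : FP24", relative_step(16), "FP32", relative_step(23))
```

Under these assumptions the step between neighbouring values near 1.0 is 128 times coarser with a 16-bit mantissa than with a 23-bit one, which is consistent with errors that only become visible after many blending passes accumulate.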
The "R300" was the first board to truly take advantage of a 256-bit memory bus. Matrox had released their Parhelia 512 several months earlier, but this board did not show great gains with its 256-bit bus. ATI, however, had not only doubled their bus to 256-bit, but also integrated an advanced crossbar memory controller, somewhat similar to NVIDIA's memory technology. Utilizing four individual load-balanced 64-bit memory controllers, ATI's memory implementation was quite capable of achieving high bandwidth efficiency by maintaining adequate granularity of memory transactions and thus working around memory latency limitations. "R300" was also given the latest refinement of ATI's innovative HyperZ memory bandwidth and fillrate saving technology, HyperZ III. The demands of the 8x1 architecture required more bandwidth than the 128-bit bus designs of the previous generation due to having double the texture and pixel fillrate.
Radeon 9700 introduced ATI's multi-sample gamma-corrected anti-aliasing scheme. The chip offered sparse-sampling in modes including 2X, 4X, and 6X. Multi-sampling offered vastly superior performance over the supersampling method on older Radeons, and superior image quality compared to NVIDIA's offerings at the time. Anti-aliasing was, for the first time, a fully usable option even in the newest and most demanding titles of the day. "R300" also offered advanced anisotropic filtering which incurred a much smaller performance hit than the anisotropic solution of the GeForce4 and other competitors' cards, while offering significantly improved quality over Radeon 8500's anisotropic filtering implementation which was highly angle dependent.
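What "gamma-corrected" means for a multi-sample resolve can be sketched as follows: samples are converted to linear light, averaged, and converted back to display space, in contrast to averaging the stored values directly. This is an illustrative sketch only, not ATI's hardware implementation; it assumes a simple power-law gamma of 2.2 and sample values in the range 0 to 1, and the function names are hypothetical.

```python
GAMMA = 2.2  # ASSUMPTION: simple power-law display gamma, chosen for illustration


def resolve_naive(samples):
    """Average the stored (gamma-encoded) sample values directly."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma=GAMMA):
    """Average in linear light, then re-encode for display."""
    linear = [s ** gamma for s in samples]      # decode to linear intensity
    mean_linear = sum(linear) / len(linear)     # blend in linear space
    return mean_linear ** (1.0 / gamma)         # encode back to display space


if __name__ == "__main__":
    # A pixel on a white-on-black edge with half of its 4X samples covered.
    edge = [0.0, 0.0, 1.0, 1.0]
    print("naive resolve          :", resolve_naive(edge))
    print("gamma-corrected resolve:", resolve_gamma_corrected(edge))
```

With these assumptions the half-covered edge pixel resolves to a noticeably brighter stored value under the gamma-corrected path than under the naive one, so that the displayed intensity is closer to half of full brightness and edge gradients look smoother.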
Performance
Radeon 9700's architecture was very efficient and considerably more powerful than its older peers of 2002. Under normal conditions it beat the GeForce4 Ti 4600, the previous top-end card, by 15-20%. However, when anti-aliasing (AA) and/or anisotropic filtering (AF) were enabled, it would beat the Ti 4600 by anywhere from 40-100%. At the time, this was quite astonishing, and resulted in the widespread acceptance of AA and AF as critical, truly usable features.
Besides the advanced architecture, reviewers also took note of ATI's change in strategy. The 9700 would be the first of ATI's chips to be shipped to third-party manufacturers instead of ATI producing all of its graphics cards (ATI would still produce cards based on its highest-end chips). This freed up engineering resources that were channelled towards driver improvements, and the 9700 performed phenomenally well at launch because of this.
The performance and quality increases offered by the R300 GPU are considered to be among the greatest in the history of 3D graphics, alongside the achievements of the GeForce 256 and Voodoo Graphics. Furthermore, NVIDIA’s response in the form of the GeForce FX 5800 was both late to market and somewhat unimpressive, especially when pixel shading was used. R300 would become one of the GPUs with the longest useful lifetime in history, allowing playable performance in new games at least three years after its launch.[2]
Further Releases
A few months later, the 9500 and 9500 PRO were launched. The 9500 PRO had half the memory bus width of the 9700 PRO, and the 9500 "non-PRO" additionally had half the pixel processing units and the hierarchical Z-buffer optimization unit (part of HyperZ III) disabled. With its full 8 pipelines and efficient architecture, the 9500 PRO outperformed all of NVIDIA’s products (save the Ti 4600). Meanwhile, the 9500 also became popular because it could in some cases be modified into the much more powerful 9700 non-PRO. ATI only intended the 9500 series to be a temporary solution to fill the gap for the 2002 Christmas season, prior to the release of the 9600. Since all of the "R300" chips were based on the same physical die, ATI's margins on 9500 products were low. Radeon 9500 was one of ATI's shortest-lived products, later replaced by the Radeon 9600 series. The logo and box packaging of the 9500 were "resurrected" in 2004 to market the unrelated and slower Radeon 9550 (which is a derivative of the 9600).
Refreshed
In early 2003, the 9700 cards were replaced by the 9800 (a.k.a. R350). These were R300s with higher clock speeds, and improvements to the shader units and memory controller which enhanced anti-aliasing performance. They were designed to maintain a performance lead over the newly launched GeForce FX 5800 Ultra, which they managed to do without difficulty. The 9800 still held its own against the revised FX 5900, primarily (and significantly) in tasks involving heavy SM2.0 pixel shading. A later version with 256 MB of memory used GDDR2. The other two variants were the 9800 "non-PRO", which was simply a lower-clocked 9800 PRO, and the 9800 SE, which had half the pixel processing units disabled (they could sometimes be enabled again). Official ATI specs dictate a 256-bit memory bus for the 9800 SE, but most of the manufacturers used a 128-bit bus. Usually, the 9800 SE with a 256-bit memory bus was called "9800 SE Ultra" or "9800 SE Golden Version".
Alongside the 9800, the 9600 (a.k.a. RV350) series was rolled out in early 2003. While the 9600 PRO did not outperform the 9500 PRO that it was supposed to replace, it was much more economical for ATI to produce thanks to a 130 nm process (all of ATI's cards since the 7500/8500 had been 150 nm) and a simplified design. Radeon 9600's "RV350" core was basically a 9800 PRO cut in half, with exactly half of the same functional units, making it a 4x1 architecture with 2 vertex shaders. It also lost part of HyperZ III with the removal of the hierarchical Z-buffer optimization unit, the same as Radeon 9500. The 130 nm process also helped push up the core clock speed. The 9600 series, all with high default clocks, was shown by overclockers to have quite a bit of headroom (achieving over 500 MHz, from 400 MHz on the PRO model). While the 9600 series was less powerful than the 9500 and 9500 PRO it replaced, it did largely manage to maintain the 9500's lead over NVIDIA’s GeForce FX 5600 Ultra, and it was ATI's cost-effective answer to the long-time mainstream performance board, the GeForce4 Ti 4200.
During the summer of 2003, the Mobility Radeon 9600 was launched, based upon the RV350 core. Being the first laptop chip to offer DirectX 9.0 shaders, it enjoyed the same success as the previous Mobility Radeons. The Mobility Radeon 9600 was originally planned to use a RAM technology called GDDR2-M. The company developing that memory went bankrupt and the RAM never arrived, so ATI was forced to use regular DDR SDRAM. GDDR2-M would likely have brought power savings, and perhaps performance gains. In fall 2004, a slightly faster variant, the Mobility Radeon 9700, was launched (still based upon the RV350, and not the older R300 of the desktop Radeon 9700, despite the naming similarity).
Later in 2003, three new cards were launched: the 9800 XT (R360), the 9600 XT (RV360), and the 9600 SE (RV350). The 9800 XT was slightly faster than the 9800 PRO had been, while the 9600 XT competed well with the newly launched GeForce FX 5700 Ultra.[3] The "RV360" chip on the 9600 XT was ATI's first graphics chip to use low-k chip fabrication, which allowed even higher clocking of the 9600 core (500 MHz default). The 9600 SE was ATI's answer to NVIDIA’s GeForce FX 5200 Ultra, managing to outperform the 5200 while also being cheaper. Another "RV350" board followed in early 2004 in the form of the Radeon 9550, which was a Radeon 9600 with a lower core clock (though an identical memory clock and bus width).
Worthy of note regarding the R300-based generation is that the entire lineup utilized single-slot cooling solutions. It was not until the R420 generation's Radeon X850 XT Platinum Edition, in December 2004, that ATI would adopt an official dual-slot cooling design.[4]
New Interface
Also in 2004, ATI released the Radeon X300 and X600 boards. These were based on the "RV370" and "RV380" GPUs respectively, which were nearly identical to the chips used in Radeon 9550 and 9600, only differing in that they were native PCI Express offerings.
Models
Desktop Graphics Boards
Board Name | Core Type | Die Process | Clocks (MHz) Core/RAM | Core Config1 | Fillrate2 (MTex/s) | Geometry3 (MTri/s) | Memory Interface | Memory Bandwidth | Notes | |
---|---|---|---|---|---|---|---|---|---|---|
9500 | R300 | 150 nm | 275/270 | 4:4 | 1100 | 275 | 128-bit | 8.6 GB/s | Hierarchical-Z disabled. Some cards 8p+4v moddable | |
9500 PRO | R300 | 150 nm | 275/270 | 8:4 | 2200 | 275 | 128-bit | 8.6 GB/s | Core identical to 9700. Some cards 256-bit moddable (L-shaped memory layout) | |
9700 | R300 | 150 nm | 275/270 | 8:4 | 2200 | 275 | 256-bit | 17.3 GB/s | ||
9700 PRO | R300 | 150 nm | 325/310 | 8:4 | 2600 | 325 | 256-bit | 19.8 GB/s | ||
9800 SE | R350, R360 | 150 nm | 325/270, 380/300 | 4:4 | 1300 | 325, 380 | 128-bit, 256-bit | 8.6 GB/s | Some cards unlockable to 8 pixel pipelines. 256-bit variants with the R360 core could be flashed to 9800 XTs. | |
9800 | R350 | 150 nm | 325/290 | 8:4 | 2600 | 325 | 256-bit | 18.6 GB/s | ||
9800 PRO | R350, R360 | 150 nm | 380/340 | 8:4 | 3040 | 380 | 256-bit | 21.8+ GB/s | 340 MHz 128 MB DDR or 350 MHz 256 MB GDDR2. R360 variants could be flashed to 9800 XTs. | |
9800 XL | R360 | 110 nm | 350/310 | 8:4 | ?? | ?? | 256-bit | ?? GB/s | Cost-reduced model based on the R360 core, produced for OEMs like Medion. All are 128 MB. | |
9800 XXL | R360 | 110 nm | 390/337.5 | 8:4 | ?? | ?? | 256-bit | ?? GB/s | Higher clocked version of 9800XL. | |
9800 XT | R360 | 150 nm | 412/365 | 8:4 | 3296 | 412 | 256-bit | 23.4 GB/s | 256 MB GDDR2 | |
9550 SE | RV350 | 130 nm | 250/200 | 4:2 | 1000 | 125 | 64-bit | 3.2 GB/s | ||
9550 | RV350 | 130 nm | 250/200 | 4:2 | 1000 | 125 | 128-bit | 6.4 GB/s | ||
9600 SE | RV350 | 130 nm | 325/200 | 4:2 | 1300 | 163 | 64-bit | 3.2 GB/s | ||
9600 | RV350 | 130 nm | 325/200 | 4:2 | 1300 | 163 | 128-bit | 6.4 GB/s | ||
9600 PRO | RV350 | 130 nm | 400/300 | 4:2 | 1600 | 200 | 128-bit | 9.6 GB/s | ||
9600 XT | RV360 | 130 nm | 500/300 | 4:2 | 2000 | 250 | 128-bit | 9.6 GB/s | first Low-K 130 nm. | |
X300 SE | RV370 | 110 nm | 325/300 | 4:2 | 1300 | 163 | 64-bit | 4.8 GB/s | X300/X550/X600 are PCIe variants of Radeon 9600/9550 | |
X300 SE HM | RV370 | 110 nm | 325/300 | 4:2 | 1300 | 163 | 64-bit | 4.8 GB/s | HyperMemory uses system RAM and a local frame buffer; 32-128 MB. | |
X300 LE | RV370 | 110 nm | 325/200 | 4:2 | 1300 | 163 | 128-bit | 6.4 GB/s | ||
X300 | RV370 | 110 nm | 325/200 | 4:2 | 1300 | 163 | 128-bit | 6.4 GB/s | ||
X550 | RV370 | 110 nm | 400/250 | 4:2 | 1600 | 200 | 128-bit | 8.0 GB/s | ||
X600 PRO | RV380 | 130 nm | 400/300 | 4:2 | 1600 | 200 | 128-bit | 9.6 GB/s | ||
X600 XT | RV380 | 130 nm | 500/370 | 4:2 | 2000 | 250 | 128-bit | 11.8 GB/s | ||
Mobility Radeons and Integrated Graphics Processors | ||||||||||
MR9550 32 MB | M10 | 130 nm | 300/200 | 4:2 | 1200 | 150 | 64-bit | 3.2 GB/s | Mobile RV350. Powerplay power management. | |
MR9550 | M10 | 130 nm | 300/200 | 4:2 | 1200 | 150 | 128-bit | 6.4 GB/s | ||
MR9600 32 MB | M10 | 130 nm | 300/200 | 4:2 | 1200 | 150 | 64-bit | 3.2 GB/s | ||
MR9600 | M10 | 130 nm | 300/200 | 4:2 | 1200 | 150 | 128-bit | 6.4 GB/s | ||
MR9600 PRO | M10 | 130 nm | 333/200 | 4:2 | 1332 | 167 | 128-bit | 6.4 GB/s | ||
MR9600 PRO Turbo | M10 | 130 nm | 333/240 | 4:2 | 1332 | 167 | 128-bit | 7.7 GB/s | ||
MR9700 | M11 | 130 nm | 450/260 | 4:2 | 1800 | 225 | 128-bit | 8.3 GB/s | Low-K. RV360-based | |
Xpress 200 | RS480 | 130 nm | 333/NA | 2:0.5 | 666 | ? | 128-bit | NA | Based on X300. Partial vertex shader. HyperMemory-like memory setup. 0 MB - 128 MB local RAM. Uses system RAM as well. SurroundView 3-display (with separate ATI card). |
- Bold rows designate initial showings of the major core types.
- 1 (# of Pixel pipelines) : (# of vertex shaders). All chips of this generation have 1 texture mapping unit (TMU) per pixel pipeline.
- 2 MTex/s = Million Texels per second, a measure of texturing fillrate. All chips of this generation have equal texture and pixel fillrates because of having only a single TMU per pipeline.
- 3 MTri/s = Million triangles per second, a measure of the core's geometric calculation capabilities. Related to core speed and the number of vertex shaders.
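The derived columns of the table can be reproduced from the clock speeds, core configuration, and bus width. The sketch below is an illustration rather than an official formula: the helper names are hypothetical, the geometry rate of one triangle per vertex shader every four clocks is an assumption inferred from the table's own figures, and the bandwidth calculation assumes double data rate memory (two transfers per clock).

```python
def fillrate_mtex(core_mhz, pixel_pipelines, tmus_per_pipeline=1):
    """Texel fillrate in MTex/s: one texel per TMU per core clock."""
    return core_mhz * pixel_pipelines * tmus_per_pipeline


def geometry_mtri(core_mhz, vertex_shaders):
    """Geometry rate in MTri/s.

    ASSUMPTION (inferred from the table, not stated in the article):
    each vertex shader contributes one triangle every four clocks.
    """
    return core_mhz * vertex_shaders / 4


def bandwidth_gbs(mem_mhz, bus_bits, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    ASSUMPTION: DDR-style memory, i.e. two transfers per memory clock.
    """
    bytes_per_transfer = bus_bits / 8
    return mem_mhz * transfers_per_clock * bytes_per_transfer / 1000


if __name__ == "__main__":
    # Radeon 9700 PRO row: 325/310 MHz, 8:4 configuration, 256-bit bus.
    print(fillrate_mtex(325, 8))    # table lists 2600 MTex/s
    print(geometry_mtri(325, 4))    # table lists 325 MTri/s
    print(bandwidth_gbs(310, 256))  # table lists 19.8 GB/s
```

The same helpers match the Radeon 9600 PRO row (400/300 MHz, 4:2, 128-bit): 1600 MTex/s, 200 MTri/s, and 9.6 GB/s.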
References
- ^ Debevec, Paul. Rendering with Natural Light, Author's web page, 1998
- ^ Weinand, Lars. VGA Charts VII: AGP Update Summer 2005, Tom's Hardware, July 5, 2005.
- ^ Gasior, Geoff. NVIDIA's GeForce FX 5700 Ultra GPU: Third time's the charm?, The Tech Report, October 23, 2003.
- ^ Wasson, Scott. ATI's Radeon X850 XT graphics cards: Canadian double-wide?, The Tech Report, December 1, 2004.
- "3D Chip and Board Charts" by Beyond3D, retrieved January 10, 2006
- "ATI’s Radeon 9700 (R300) – Crowning the New King" by Anand Lal Shimpi, Anandtech, July 18, 2002, retrieved January 10, 2006
- "ATI Radeon 9700 PRO Review" by Dave Baumann, Beyond3D, August 19, 2002, retrieved January 10, 2006
- "Matrox's Parhelia - A Performance Paradox" by Anand Lal Shimpi, Anandtech, June 25, 2002, retrieved January 10, 2006
- "A look at the Geforce 6600GT" by the Firing Squad, retrieved November 11, 2006
- "Infos zur ALDI Grafikkarte Radeon 9800 XXL" (in German), retrieved November 21, 2006