GeForce 900 series

GeForce 900 Series
Release date: September 2014
Codename: Maxwell
Models (GeForce Series):
  • GeForce GT Series
  • GeForce GTX Series
Cards
  • Mid-range: GeForce GTX 950M, GeForce GTX 950, GeForce GTX 960M, GeForce GTX 960, GeForce GTX 965M
  • High-end: GeForce GTX 970M, GeForce GTX 970, GeForce GTX 980M
  • Enthusiast: GeForce GTX 980, GeForce GTX 980 "Notebook", GeForce GTX 980 Ti, GeForce GTX Titan X
Rendering support
  • Direct3D: Direct3D 11.3[1] and limited Direct3D 12 (feature level 11_1 and 12_1)[1][2][3][4]
  • OpenCL: 1.2[5]
  • OpenGL: OpenGL 4.5
History
  • Predecessor: GeForce 700 series

The GeForce 900 Series is a family of graphics processing units developed by Nvidia, used in desktop and laptop PCs. It serves as the high-end introduction for the Maxwell architecture (GM-codenamed chips), named after the Scottish theoretical physicist James Clerk Maxwell.

The Maxwell microarchitecture, the successor to the Kepler microarchitecture, will for the first time feature an integrated ARM CPU of its own.[6] According to Nvidia's CEO Jen-Hsun Huang, this will make Maxwell GPUs more independent from the main CPU.[7] Nvidia expects three major things from the Maxwell architecture: improved graphics capabilities, simplified programming, and better energy efficiency compared to the GeForce 700 Series and GeForce 600 Series.[8]

Maxwell was announced in September 2010.[9] The first GeForce consumer-class products based on the Maxwell architecture were released in early 2014.[10] Nvidia is expected to release Maxwell-powered Tesla accelerator cards and Quadro professional graphics cards in late 2014. Eventually, the Maxwell architecture will be used for mobile application processors in the Erista family of Tegra SoCs.

Architecture

First generation Maxwell (GM10x)

The first generation Maxwell chips GM107 and GM108 were released as the GeForce GTX 745, GTX 750/750 Ti and GTX 850M/860M (GM107) and the GT 830M/840M (GM108). These chips provide few additional consumer-facing features; Nvidia instead focused on power efficiency. Nvidia increased the amount of L2 cache from 256 KiB on GK107 to 2 MiB on GM107, reducing the memory bandwidth needed. Accordingly, Nvidia cut the memory bus from 192-bit on GK106 to 128-bit on GM107, further saving power.[11]

Nvidia also changed the streaming multiprocessor design from that of Kepler (SMX), naming the new design SMM. The structure of the warp scheduler is inherited from Kepler: each scheduler can issue up to two instructions that are independent of each other and in order from the same warp. The SMM is partitioned so that each of its 4 warp schedulers controls 1 set of 32 FP32 CUDA cores, 1 set of 8 load/store units, and 1 set of 8 special function units. In Kepler, by contrast, each SMX has 4 schedulers that issue to a shared pool of 6 sets of 32 FP32 CUDA cores, 2 sets of 16 load/store units, and 2 sets of 16 special function units.[12] In Kepler these units are connected by a crossbar, which consumes power, so that the resources can be shared.[12] This crossbar is removed in Maxwell.[12] Texture units and FP64 CUDA cores are still shared.[11]

SMM allows a finer-grained allocation of resources than SMX, saving power when the workload is not well suited to shared resources. Nvidia claims that a 128 CUDA core SMM has 90% of the performance of a 192 CUDA core SMX.[11] In addition, each Graphics Processing Cluster (GPC) contains up to 4 SMX units in Kepler and up to 5 SMM units in first generation Maxwell.[11]
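
The unit counts above can be tabulated in a short sketch. The Python below is only an illustration of the arithmetic: the class and function names are hypothetical, and the relative-performance figure is left as a parameter so that the figure Nvidia quotes can be plugged in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamingMultiprocessor:
    name: str
    schedulers: int
    fp32_cores: int
    load_store_units: int
    special_function_units: int
    partitioned: bool  # True if each scheduler owns a fixed slice of the units


# Unit counts as described in the text: SMX shares one pool, SMM is split in four.
SMX = StreamingMultiprocessor("SMX (Kepler)", 4, 6 * 32, 2 * 16, 2 * 16, partitioned=False)
SMM = StreamingMultiprocessor("SMM (Maxwell)", 4, 4 * 32, 4 * 8, 4 * 8, partitioned=True)


def fp32_cores_per_scheduler(sm: StreamingMultiprocessor) -> float:
    """Average FP32 cores per warp scheduler (a dedicated slice only if partitioned)."""
    return sm.fp32_cores / sm.schedulers


def per_core_performance_ratio(smm_relative_performance: float) -> float:
    """Per-core throughput of an SMM relative to an SMX.

    smm_relative_performance is the throughput of a whole SMM expressed as a
    fraction of a whole SMX (hypothetical parameter; supply the quoted figure).
    """
    return smm_relative_performance * SMX.fp32_cores / SMM.fp32_cores


if __name__ == "__main__":
    for sm in (SMX, SMM):
        print(sm.name, sm.fp32_cores, "FP32 cores,",
              fp32_cores_per_scheduler(sm), "per scheduler")
    # An SMM that merely matched a whole SMX would be 1.5x as fast per core.
    print(per_core_performance_ratio(1.0))
```

Both designs end up with the same number of load/store and special function units per multiprocessor; what changes is the FP32 core count and whether the schedulers share the units.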

GM107 supports CUDA Compute Capability 5.0 compared to 3.5 on GK110/GK208 GPUs and 3.0 on GK10x GPUs. Dynamic Parallelism and HyperQ, two features in GK110/GK208 GPUs, are also supported across the entire Maxwell product line.

Maxwell provides native shared memory atomic operations for 32-bit integers and native shared memory 32-bit and 64-bit compare-and-swap (CAS), which can be used to implement other atomic functions.
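
How a compare-and-swap primitive can be used to build other atomic functions can be sketched with a retry loop. The following is a host-side Python model, not GPU code: `SharedWord`, `atomic_update`, `atomic_max` and `atomic_saturating_add` are hypothetical names, and a lock stands in for the atomicity that the hardware CAS would provide.

```python
import threading

MASK32 = 0xFFFFFFFF


class SharedWord:
    """Toy model of one 32-bit word of shared memory with a native CAS."""

    def __init__(self, value: int = 0):
        self._value = value & MASK32
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def load(self) -> int:
        return self._value

    def compare_and_swap(self, expected: int, new: int) -> int:
        """Store `new` only if the word equals `expected`; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new & MASK32
            return old


def atomic_update(word: SharedWord, fn) -> int:
    """Apply `fn` to the word atomically using a CAS retry loop; return the old value."""
    old = word.load()
    while True:
        seen = word.compare_and_swap(old, fn(old))
        if seen == old:   # nobody changed the word in between: the swap happened
            return old
        old = seen        # another thread won the race: retry with the fresh value


def atomic_max(word: SharedWord, value: int) -> int:
    return atomic_update(word, lambda cur: max(cur, value))


def atomic_saturating_add(word: SharedWord, value: int) -> int:
    return atomic_update(word, lambda cur: min(cur + value, MASK32))


if __name__ == "__main__":
    counter = SharedWord(0)
    workers = [
        threading.Thread(
            target=lambda: [atomic_saturating_add(counter, 1) for _ in range(1000)]
        )
        for _ in range(8)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    print(counter.load())
```

The loop only ever writes a value computed from the value it last observed, so a concurrent update causes a retry instead of a lost update.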

NVENC

Main article: Nvidia NVENC

Maxwell-based GPUs also contain the NVENC SIP block introduced with Kepler. Nvidia's video encoder, NVENC, is 1.5 to 2 times faster than on Kepler-based GPUs, meaning it can encode video at 6 to 8 times playback speed.[11]

PureVideo

Main article: Nvidia PureVideo

Nvidia also claims an 8 to 10 times performance increase in PureVideo Feature Set E video decoding, due to the video decoder cache paired with increases in memory efficiency. However, full hardware decoding of H.265 is not supported; H.265 playback relies on a mix of hardware and software decoding.[11] When decoding video, Maxwell GPUs use a new low power state, "GC5", to conserve power.[11]

Second generation Maxwell (GM20x)

Second generation Maxwell introduced several new technologies: Dynamic Super Resolution,[13] Third Generation Delta Color Compression,[14] Multi-Pixel Programmable Sampling,[15] Nvidia VXGI (Real-Time-Voxel-Global Illumination),[16] VR Direct,[17][18][19] Multi-Projection Acceleration,[14] and Multi-Frame Sampled Anti-Aliasing (MFAA).[20] However, support for Coverage-Sampling Anti-Aliasing (CSAA) was removed.[21] HDMI 2.0 support was also added.[22][23]

Second generation Maxwell also changed the ROP to memory controller ratio from 8:1 to 16:1.[24] However, some of the ROPs in the GTX 970 are generally idle because there are not enough enabled SMMs to give them work, which reduces the card's maximum fill rate.[25]
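
The fill-rate limit can be illustrated with a small calculation. The sketch below rests on two assumptions that are not stated in this article: that each SMM can pass 4 pixels per clock to the ROPs, and that GM204's 256-bit bus is served by four 64-bit memory controllers. Function and constant names are hypothetical; shader counts, ROP counts and clocks are taken from the product table.

```python
SHADERS_PER_SMM = 128            # from the text: a 128 CUDA core SMM
ROPS_PER_MEMORY_CONTROLLER = 16  # from the text: 16:1 on second generation Maxwell
PIXELS_PER_SMM_PER_CLOCK = 4     # assumption, not stated in this article


def rop_count(memory_controllers: int) -> int:
    """ROPs on a fully enabled chip with the given number of memory controllers."""
    return memory_controllers * ROPS_PER_MEMORY_CONTROLLER


def peak_pixel_fillrate(shaders: int, rops: int, clock_mhz: float) -> float:
    """Peak pixel fill rate in GP/s, limited by the SMMs or the ROPs, whichever is smaller."""
    smm_count = shaders // SHADERS_PER_SMM
    pixels_per_clock = min(rops, smm_count * PIXELS_PER_SMM_PER_CLOCK)
    return pixels_per_clock * clock_mhz / 1000.0


if __name__ == "__main__":
    print(rop_count(4))                          # full GM204, assuming 4 controllers
    print(peak_pixel_fillrate(1664, 56, 1050))   # GTX 970: 13 SMMs feed 52 of 56 ROPs
    print(peak_pixel_fillrate(2048, 64, 1126))   # GTX 980: 16 SMMs feed all 64 ROPs
```

Under these assumptions the GTX 970 result agrees with the 54.6 GP/s listed for it in the product table, which is lower than 56 ROPs at 1050 MHz would otherwise suggest.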

Second generation Maxwell also has up to 4 SMM units per GPC, compared to up to 5 SMM units per GPC in first generation Maxwell.[24]

GM204 supports CUDA Compute Capability 5.2 compared to 5.0 on GM107/GM108 GPUs, 3.5 on GK110/GK208 GPUs and 3.0 on GK10x GPUs.[14][24][26]

Second generation Maxwell GM20x GPUs have an upgraded NVENC that supports HEVC encoding and adds H.264 encoding at 1440p/60 FPS and 4K/60 FPS. NVENC on first generation Maxwell GM10x GPUs supported H.264 encoding only up to 1080p/60 FPS.[19]

The Maxwell GM206 GPU supports full fixed-function HEVC hardware decoding.[27][28]

False advertisement

GeForce GTX 970 specifications controversy

Issues with the GeForce GTX 970's performance were first raised by users who found that the cards, while featuring 4 GB of memory, rarely accessed memory above the 3.5 GB boundary. Further testing and investigation eventually led Nvidia to issue a statement that the card's initially announced specifications had been altered without notice before the card was made commercially available, and that the card took a performance hit once memory above the 3.5 GB limit was put into use.[29][30][31]

The card's back-end hardware specifications, initially announced as being identical to those of the GeForce GTX 980, differed in the amount of L2 cache (1.75 MB versus 2 MB in the GeForce GTX 980) and the number of ROPs (56 versus 64 in the 980). Additionally, it was revealed that the card was designed to access its memory as a 3.5 GB section plus a 0.5 GB section, with access to the latter being 7 times slower than access to the former.[32] The company then promised a specific driver modification to alleviate the performance issues produced by the cutbacks.[33] However, Nvidia later clarified that the promise had been a miscommunication and that there would be no specific driver update for the GTX 970.[34] Nvidia claimed that it would assist customers who wanted refunds in obtaining them.[35] On February 26, 2015, Nvidia CEO Jen-Hsun Huang went on record in Nvidia's official blog to apologize for the incident.[36] In February 2015 a class-action lawsuit alleging false advertising was filed against Nvidia and Gigabyte Technology in the U.S. District Court for Northern California.[37][38]

Nvidia revealed that it is able to disable individual units, each containing 256 KB of L2 cache and 8 ROPs, without disabling whole memory controllers.[39] This comes at the cost of dividing the memory bus into a high speed segment and a low speed segment. The two segments cannot be accessed at the same time, unless one is being read while the other is being written, because the L2/ROP unit managing both of the GDDR5 controllers shares the read return channel and the write data bus between the two controllers and itself.[39] This is used in the GeForce GTX 970, which can therefore be described as having 3.5 GB in its high speed segment on a 224-bit bus and 0.5 GB in a low speed segment on a 32-bit bus.[39]
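
The GTX 970 figures follow from simple unit arithmetic, sketched below. The code assumes a fully enabled GM204 has 8 L2/ROP units (2 MB divided by 256 KB) paired with eight 32-bit GDDR5 controllers, that the GTX 970 has exactly one unit disabled, and that memory is spread evenly across the controllers. All names are hypothetical; the 7010 MT/s memory rate comes from the product table.

```python
from dataclasses import dataclass

L2_KB_PER_UNIT = 256   # from the text: each L2/ROP unit holds 256 KB of L2 cache
ROPS_PER_UNIT = 8      # from the text: each L2/ROP unit holds 8 ROPs
TOTAL_UNITS = 8        # assumption: full GM204 = 2 MB L2 / 256 KB per unit
CONTROLLER_BITS = 32   # assumption: 256-bit bus split over 8 GDDR5 controllers
DISABLED_UNITS = 1     # assumption: the GTX 970 disables exactly one L2/ROP unit


@dataclass(frozen=True)
class Segment:
    bus_bits: int
    size_mib: int
    bandwidth_gb_s: float


def make_segment(controllers: int, total_mib: int, memory_mt_s: int) -> Segment:
    """Bus width, capacity and peak bandwidth of a group of memory controllers."""
    bits = controllers * CONTROLLER_BITS
    size = controllers * (total_mib // TOTAL_UNITS)
    bandwidth = memory_mt_s * bits / 8 / 1000  # MT/s * bytes per transfer -> GB/s
    return Segment(bits, size, bandwidth)


def gtx970_layout(total_mib: int = 4096, memory_mt_s: int = 7010) -> dict:
    enabled = TOTAL_UNITS - DISABLED_UNITS
    return {
        "l2_kb": enabled * L2_KB_PER_UNIT,
        "rops": enabled * ROPS_PER_UNIT,
        "fast": make_segment(enabled, total_mib, memory_mt_s),
        "slow": make_segment(DISABLED_UNITS, total_mib, memory_mt_s),
    }


if __name__ == "__main__":
    layout = gtx970_layout()
    print(layout["l2_kb"] / 1024, "MB L2,", layout["rops"], "ROPs")
    print("fast segment:", layout["fast"])
    print("slow segment:", layout["slow"])
```

With these inputs the fast segment has seven times the bus width of the slow one, which is consistent with the sevenfold difference in access speed described for the two sections.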

Limited DirectX 12 support

While the Maxwell series was marketed as fully DirectX 12 compliant,[2][40][41] Oxide Games, developer of Ashes of the Singularity, uncovered that the hardware of Maxwell cards is not fully DirectX 12 compatible.[42][43]

Using the new asynchronous compute and shader pipeline is not possible with Maxwell GPUs, despite Nvidia specifically advertising the feature for the 900 series.[40] Instead, Nvidia partially implemented this core feature through a driver-based shim, coming at a high performance cost.[43] AMD's GCN-based graphics cards include hardware-based asynchronous compute,[44] giving them the edge in certain DirectX 12 benchmarks and games.[43][45][46]

Oxide claims to have been pressured by Nvidia not to include the asynchronous compute feature in their benchmark at all, so that the 900 series would not be at a disadvantage when competing against AMD's DirectX 12 compliant GCN architecture.[42]

On August 4, 2015, Oxide and Nvidia confirmed that Maxwell cards "[haven’t] fully implemented [async compute] yet, but [drivers made it] appeared like it was".[47] Nvidia is working together with Oxide on a full async compute implementation. Unlike AMD's full hardware implementation of the asynchronous pipeline, Nvidia will rely on the driver to implement a software queue and a software distributor to forward asynchronous tasks to the hardware schedulers, which are capable of distributing the workload to the correct units.[48]

Products

GeForce 900 (9xx) series

Model | Launch | Code name | Fab (nm) | Transistors (million) | Die size (mm²) | Bus interface | Core config | Base core clock (MHz) | Boost core clock (MHz) | Memory clock (MT/s) | Pixel fillrate (GP/s) | Texture fillrate (GT/s) | Memory size (MiB) | Memory bandwidth (GB/s) | Memory bus type | Memory bus width (bit) | DirectX | OpenGL | OpenCL | Single precision (GFLOPS) | Double precision (GFLOPS) | TDP (watts) | SLI support | Release price (USD)
GeForce GTX 950 [51] | August 20, 2015 | GM206 | 28 | 2940 | 227 | PCIe 3.0 x16 | 768:48:32 | 1024 | 1188 | 6610 | 32.7 | 49.2 | 2048, 4096 | 106 | GDDR5 | 128 | 12.0[1][4] | 4.5 | 1.2 | 1572 | 49.1 | 90 | 2-way | $159
GeForce GTX 960 [52] | January 22, 2015 | GM206 | 28 | 2940 | 227 | PCIe 3.0 x16 | 1024:64:32 | 1127 | 1178 | 7010 | 39.3 | 72.1 | 2048, 4096 | 112 | GDDR5 | 128 | 12.0[1][4] | 4.5 | 1.2 | 2308 | 72.1 | 120 | 2-way | $199
GeForce GTX 970 [53] | September 18, 2014 | GM204 | 28 | 5200 | 398 | PCIe 3.0 x16 | 1664:104:56[54] | 1050 | 1178 | 7010 | 54.6 | 109.2 | 3584+512[55] | 196+28[56] | GDDR5 | 224+32[39] | 12.0[1][4] | 4.5 | 1.2[57] | 3494 | 109 | 145 | 3-way | $329
GeForce GTX 980 [41] | September 18, 2014 | GM204 | 28 | 5200 | 398 | PCIe 3.0 x16 | 2048:128:64 | 1126 | 1216 | 7010 | 72.1 | 144 | 4096 | 224 | GDDR5 | 256 | 12.0[1][4] | 4.5 | 1.2[5] | 4612 | 144 | 165 | 4-way | $549
GeForce GTX 980 Ti [58] | June 2, 2015 | GM200 | 28 | 8000 | 601 | PCIe 3.0 x16 | 2816:176:96 | 1000 | 1076 | 7010 | 96 | 176 | 6144 | 336 | GDDR5 | 384 | 12.0 | 4.5 | 1.2 | 5632 | 176 | 250 | 4-way | $649
GeForce GTX Titan X [59][60][61] | March 17, 2015 | GM200 | 28 | 8000 | 601 | PCIe 3.0 x16 | 3072:192:96 | 1000 | 1089 | 7010 | 96 | 192 | 12288 | 336 | GDDR5 | 384 | 12.0 | 4.5 | 1.2 | 6144 | 192 | 250 | 4-way[62] | $999
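
The fillrate, bandwidth and GFLOPS columns in the table above appear to be derived from the core configuration and clocks. The sketch below reproduces them under assumptions that are inferred from the listed numbers, not stated in the table: the core config is read as shaders:TMUs:ROPs, rates are computed at the base clock, each shader performs 2 floating-point operations per clock, and double precision runs at 1/32 of the single precision rate. The function name is hypothetical.

```python
def derived_specs(shaders: int, tmus: int, rops: int,
                  base_mhz: float, memory_mt_s: float, bus_bits: int) -> dict:
    """Theoretical peak figures computed from a card's core config and clocks."""
    fp32_gflops = 2 * shaders * base_mhz / 1000       # assumed 2 FLOPs per shader per clock
    return {
        "pixel_gp_s": rops * base_mhz / 1000,         # one pixel per ROP per clock
        "texture_gt_s": tmus * base_mhz / 1000,       # one texel per TMU per clock
        "fp32_gflops": fp32_gflops,
        "fp64_gflops": fp32_gflops / 32,              # assumed 1/32 rate
        "bandwidth_gb_s": memory_mt_s * bus_bits / 8 / 1000,
    }


if __name__ == "__main__":
    # GeForce GTX 980 row: 2048:128:64 at 1126 MHz, 7010 MT/s on a 256-bit bus
    for name, value in derived_specs(2048, 128, 64, 1126, 7010, 256).items():
        print(f"{name}: {value:.1f}")
```

For the GTX 980 row this gives about 72.1 GP/s, 144 GT/s, 4612 GFLOPS single precision, 144 GFLOPS double precision and 224 GB/s, in line with the listed values. The GTX 970 row is an exception for pixel fillrate, where the listed 54.6 GP/s is lower than its ROP count alone would give.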

GeForce 900M (9xxM) series

Some implementations may use different specifications.

Model | Launch | Code name | Fab (nm) | Transistors (million) | Die size (mm²) | Bus interface | Core config | Base core clock (MHz) | Boost core clock (MHz) | Memory clock (MT/s) | Pixel fillrate (GP/s) | Texture fillrate (GT/s) | Memory size (MiB) | Memory bandwidth (GB/s) | Memory bus type | Memory bus width (bit) | DirectX | OpenGL | OpenCL | Single precision (GFLOPS) | Double precision (GFLOPS) | TDP (watts) | SLI support
GeForce 920M [63][64][65] | March 13, 2015 | GF117 | 28 | 585 | 116 | PCIe 3.0 x8 | 384:16:8 | 775 | 1550 | 1800 | 3.1 | 12.4 | 1024, 2048 | 14.4 | DDR3 | 64 | 12.0 | 4.5 | 1.1 | 595.2 | Unknown | 33 | No
GeForce 920M [63][64][65] | March 13, 2015 | GK208 | 28 | Unknown | 87 | PCIe 3.0 x8 | 384:16:8 | 575 | 575 | 1800 | 4.6 | 9.2 | 1024, 2048 | 14.4 | DDR3 | 64 | 12.0 | 4.5 | 1.2 | 441.6 | 18.4 | 33 | No
GeForce 930M [66][67] | March 13, 2015 | GM108 | 28 | Unknown | Unknown | PCIe 3.0 x8 | 384:16:8 | 1029 | 1124 | 2002 | 8.23 | 16.5 | 1024, 2048 | 16 | DDR3 | 64 | 12.0 | 4.5 | 1.2 | 790.3 | 24.7 | 33 | No
GeForce 940M [68][69][70] | March 13, 2015 | GM107 | 28 | 1870 | 148 | PCIe 3.0 x16 | 640:40:16 | 1029 | 1100 | 2002 | 16.5 | 41.2 | 2048 | 16 - 80.2 | GDDR5, DDR3 | 128 | 12.0 | 4.5 | 1.2 | 1317 | 41.1 | 75 | No
GeForce 940M [68][69][70] | March 13, 2015 | GM108 | 28 | Unknown | Unknown | PCIe 3.0 x8 | 384:16:8 | 1029 | 1100 | 2002 | 8.2 | 16.5 | 2048 | 16 - 80.2 | GDDR5, DDR3 | 64 | 12.0 | 4.5 | 1.2 | 790.3 | 24.7 | 33 | No
GeForce GTX 950M [71][72] | March 13, 2015 | GM107 | 28 | 1870 | 148 | PCIe 3.0 x16 | 640:40:16 | 914 | 1085 | 5012 | 14.6 | 36.6 | 2048 | 80 | GDDR5 | 128 | 12.0[1][4] | 4.5 | 1.2[5] | 1170 | 36.56 | 75 | No
GeForce GTX 960M [73][74] | March 13, 2015 | GM107 | 28 | 1870 | 148 | PCIe 3.0 x16 | 640:40:16 | 1029 | 1085 | 5012 | 16.5 | 41.2 | 2048 | 80 | GDDR5 | 128 | 12.0[1][4] | 4.5 | 1.2[5] | 1317 | 41.16 | 65 | No
GeForce GTX 965M [75][76] | January 5, 2015 | GM204 | 28 | 5200 | 398 | PCIe 3.0 x16 | 1024:64:32 | 924 | 950 | 5000 | 30.2 | 60.4 | 2048 | 80 | GDDR5 | 128 | 12.0[1][4] | 4.5 | 1.2[5] | 1945 | 60.78 | 60 [77] | Yes
GeForce GTX 970M [78] | October 7, 2014 | GM204 | 28 | 5200 | 398 | PCIe 3.0 x16 | 1280:80:48 | 924 | 993 | 5012 | 37.0 | 73.9 | 3072, 6144 | 120 | GDDR5 | 192[79] | 12.0[1][4] | 4.5 | 1.2[5] | 2365 | 73.9 | 75 | Yes
GeForce GTX 980M [80] | October 7, 2014 | GM204 | 28 | 5200 | 398 | PCIe 3.0 x16 | 1536:96:64 | 1038 | 1127 | 5012 | 49.8 | 99.6 | 4096, 8192 | 160 | GDDR5 | 256[79] | 12.0[1][4] | 4.5 | 1.2[5] | 3189 | 99.6 | 100 | Yes
GeForce GTX 980 (Notebook) [81] | September 22, 2015 | GM204 | 28 | 5200 | 398 | PCIe 3.0 x16 | 2048:128:64 | 1064 | 1216 | 7010 | 72.1 | 144 | 4096, 8192 | 224 | GDDR5 | 256 | 12.0[1][4] | 4.5 | 1.2[5] | 4612 | 144 | 145 | Yes

Future

After Maxwell, the next architecture is code-named Pascal.[82][83] Nvidia has announced that the Pascal GPU will feature stacked DRAM, unified memory, and NVLink,[82][83] and is expected to be released in 2016.

References

  1. Ryan Smith. "Maxwell 2’s New Features: Direct3D 11.3 & VXGI - The NVIDIA GeForce GTX 980 Review: Maxwell Mark 2". anandtech.com.
  2. "Maxwell and DirectX 12 Delivered". The Official NVIDIA Blog.
  3. "MSDN Blogs". msdn.com. Microsoft.
  4. Ryan Smith. "Microsoft Details Direct3D 11.3 & 12 New Rendering Features". anandtech.com.
  5. "NVIDIA GeForce GTX 980". TechPowerUp.
  6. "Nvidia Maxwell to be first GPU with ARM CPU in 2013". Guru3d.com.
  7. "Nvidia Maxwell Graphics Processors to Have Integrated ARM General-Purpose Cores - X-bit labs". xbitlabs.com.
  8. "Nvidia: Next-Generation Maxwell Architecture Will Break New Grounds - X-bit labs". xbitlabs.com.
  9. Ryan Smith. "GTC 2010 Day 1: NVIDIA Announces Future GPU Families for 2011 And 2013". anandtech.com.
  10. "GeForce GTX 750 Class GPUs: Serious Gaming, Incredible Value". geforce.com.
  11. Smith, Ryan; T S, Ganesh (February 18, 2014). "The NVIDIA GeForce GTX 750 Ti and GTX 750 Review: Maxwell Makes Its Move". AnandTech. Archived from the original on February 18, 2014. Retrieved February 18, 2014.
  12. Ryan Smith, Ganesh T S. "Maxwell: Designed For Energy Efficiency - The NVIDIA GeForce GTX 750 Ti and GTX 750 Review: Maxwell Makes Its Move". anandtech.com.
  13. "Dynamic Super Resolution Improves Your Games With 4K-Quality Graphics On HD Monitors". geforce.com.
  14. http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF
  15. "NVIDIA - Maintenance". geforce.com.
  16. "Maxwell’s Voxel Global Illumination Technology Introduces Gamers To The Next Generation Of Graphics". geforce.com.
  17. "NVIDIA Maxwell GPUs: The Best Graphics Cards For Virtual Reality Gaming". geforce.com.
  18. "How Maxwell’s VR Direct Brings Virtual Reality Gaming Closer to Reality". The Official NVIDIA Blog.
  19. Ryan Smith. "Display Matters: HDMI 2.0, HEVC, & VR Direct - The NVIDIA GeForce GTX 980 Review: Maxwell Mark 2". anandtech.com.
  20. "Multi-Frame Sampled Anti-Aliasing Delivers Better Performance To Maxwell Gamers". geforce.com.
  21. "New nVidia Maxwell chips do not support fast CSAA". realhardwarereviews.com.
  22. "Introducing The Amazing New GeForce GTX 980 & 970". geforce.com.
  23. Ryan Smith. "The NVIDIA GeForce GTX 980 Review: Maxwell Mark 2". anandtech.com.
  24. Ryan Smith. "Maxwell 2 Architecture: Introducing GM204 - The NVIDIA GeForce GTX 980 Review: Maxwell Mark 2". anandtech.com.
  25. "Here's another reason the GeForce GTX 970 is slower than the GTX 980". techreport.com.
  26. "Maxwell: The Most Advanced CUDA GPU Ever Made". Parallel Forall.
  27. Ryan Smith. "NVIDIA Launches GeForce GTX 960". anandtech.com.
  28. Ryan Smith. "NVIDIA Launches GeForce GTX 950; GM206 The Lesser For $159". anandtech.com.
  29. "NVIDIA Discloses Full Memory Structure and Limitations of GTX 970". PCPer.
  30. "GeForce GTX 970 Memory Issue Fully Explained – Nvidia’s Response". WCFTech.
  31. "Why Nvidia's GTX 970 slows down when using more than 3.5GB VRAM". PCGamer.
  32. "GeForce GTX 970: Correcting The Specs & Exploring Memory Allocation". AnandTech.
  33. "NVIDIA Working on New Driver For GeForce GTX 970 To Tune Memory Allocation Problems and Improve Performance". WCFTech.
  34. "NVIDIA clarifies no driver update for GTX 970 specifically". PC World.
  35. "NVIDIA Plans Driver Update for GTX 970 Memory Issue, Help with Returns". pcper.com.
  36. "Nvidia CEO addresses GTX 970 controversy". PCGamer. February 26, 2015.
  37. Chalk, Andy (February 22, 2015). "Nvidia faces false advertising lawsuit over GTX 970 specs". PC Gamer. Retrieved March 27, 2015.
  38. Niccolai, James (February 20, 2015). "Nvidia hit with false advertising suit over GTX 970 performance". PC World. Retrieved March 27, 2015.
  39. Ryan Smith. "Diving Deeper: The Maxwell 2 Memory Crossbar & ROP Partitions - GeForce GTX 970: Correcting The Specs & Exploring Memory Allocation". anandtech.com.
  40. http://international.download.nvidia.com/geforce-com/international/images/nvidia-geforce-gtx-980-ti/nvidia-geforce-gtx-980-ti-directx-12-advanced-api-support.png
  41. "GeForce GTX 980 - Specifications - GeForce". geforce.com.
  42. Hilbert Hagedoorn. "Nvidia Wanted Oxide dev DX12 benchmark to disable certain DX12 Features ? (content updated)". Guru3D.com.
  43. "[Various] Ashes of the Singularity DX12 Benchmarks". Overclock.net. August 17, 2015.
  44. Hilbert Hagedoorn. "AMD Radeon R9 Fury X review". Guru3D.com.
  45. "Lack of Async Compute on Maxwell Makes AMD GCN Better Prepared for DirectX 12". TechPowerUp.
  46. "DX12 GPU and CPU Performance Tested: Ashes of the Singularity Benchmark". pcper.com.
  47. "[Various] Ashes of the Singularity DX12 Benchmarks". Overclock.net. August 17, 2015.
  48. "[Various] Ashes of the Singularity DX12 Benchmarks". Overclock.net. August 17, 2015.
  49. Smith, Ryan (September 18, 2014). "The NVIDIA GeForce GTX 980 Review: Maxwell Mark 2". AnandTech. p. 1. Retrieved September 19, 2014.
  50. Ryan Smith. "The NVIDIA GeForce GTX Titan X Review". anandtech.com.
  51. "GeForce GTX 950 - Specifications - GeForce". geforce.com.
  52. "GeForce GTX 960 - Specifications - GeForce". geforce.com.
  53. "GeForce GTX 970 - Specifications - GeForce". geforce.com.
  54. Ryan Smith. "GeForce GTX 970: Correcting The Specs & Exploring Memory Allocation". anandtech.com.
  55. "NVIDIA Responds to GTX 970 3.5GB Memory Issue". pcper.com.
  56. Ryan Smith. "Practical Performance Possibilities & Closing Thoughts - GeForce GTX 970: Correcting The Specs & Exploring Memory Allocation". anandtech.com.
  57. "NVIDIA GeForce GTX 970". TechPowerUp.
  58. "GeForce GTX 980 Ti - Specifications - GeForce". geforce.com.
  59. "GeForce GTX TITAN X - Specifications - GeForce". geforce.com.
  60. "NVIDIA TITAN X GPU Powers "Thief in the Shadows" VR Experience - NVIDIA Blog". The Official NVIDIA Blog.
  61. "NVIDIA GeForce GTX TITAN X". TechPowerUp.
  62. "Hands On With The NVIDIA GeForce GTX TITAN X 12GB Video Card - Legit Reviews". Legit Reviews.
  63. "GeForce 920M - Specifications - GeForce". geforce.com.
  64. "NVIDIA GeForce 920M". TechPowerUp.
  65. "NVIDIA GeForce 920M". TechPowerUp.
  66. "GeForce 930M - Specifications - GeForce". geforce.com.
  67. "NVIDIA GeForce 930M". TechPowerUp.
  68. "GeForce 940M - Specifications - GeForce". geforce.com.
  69. "NVIDIA GeForce 940M". TechPowerUp.
  70. "NVIDIA GeForce 940M". TechPowerUp.
  71. "GeForce GTX 950M - Specifications - GeForce". geforce.com.
  72. "NVIDIA GeForce GTX 950M". TechPowerUp.
  73. "GeForce GTX 960M - Specifications - GeForce". geforce.com.
  74. "NVIDIA GeForce GTX 960M". TechPowerUp.
  75. "GeForce GTX 965M - Specifications - GeForce". geforce.com.
  76. "NVIDIA GeForce GTX 965M". TechPowerUp.
  77. "Eurocom Configure Model". eurocom.com.
  78. "GeForce GTX 970M - Specifications - GeForce". geforce.com.
  79. "GTX 970: 3.5 Go et 224-bit au lieu de 4 Go et 256-bit ?". hardware.fr.
  80. "GeForce GTX 980M - Specifications - GeForce". geforce.com.
  81. "GeForce GTX 980 (Notebook - Specifications - GeForce". geforce.com.
  82. "NVIDIA Updates GPU Roadmap; Announces Pascal". The Official NVIDIA Blog.
  83. "NVIDIA Pascal GPU Architecture to Provide 10X Speedup for Deep Learning Apps - NVIDIA Blog". The Official NVIDIA Blog.

This article is issued from Wikipedia - version of the Thursday, February 11, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.