3DNow!

3DNow! is an extension to the x86 instruction set developed by Advanced Micro Devices (AMD). It adds single instruction, multiple data (SIMD) instructions to the base x86 instruction set, enabling it to perform simple vector processing, which improves the performance of many graphics-intensive applications. The first microprocessor to implement 3DNow! was the AMD K6-2, introduced in 1998. In suitable applications, this raised speed by a factor of about two to four.[1]

However, the instruction set never gained much popularity, and AMD announced in August 2010 that support for 3DNow would be dropped in future AMD processors, except for two instructions (PREFETCH and PREFETCHW).[2]

History

3DNow was developed at a time when 3D graphics were becoming mainstream in PC multimedia and gaming software. Real-time display of 3D graphics depended heavily on the host CPU's floating-point unit (FPU) to perform floating-point calculations, a task in which AMD's K6 processor was easily outperformed by its competitor, the Intel Pentium II.

As an enhancement to the MMX instruction set, the 3DNow instruction set augmented the MMX SIMD registers to support common arithmetic operations (add, subtract, multiply) on single-precision (32-bit) floating-point data. Software written to use AMD's 3DNow instead of the slower x87 FPU could execute up to four times faster, depending on the instruction mix.

Versions

3DNow

The first implementation of 3DNow technology contains 21 new instructions that support SIMD floating-point operations. The 3DNow data format is packed single-precision floating point. The instruction set also includes SIMD integer operations, data prefetch, and faster MMX-to-floating-point switching. Intel later added similar (but incompatible) instructions to the Pentium III, known as SSE (Streaming SIMD Extensions).
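To make the packed data format concrete, the following Python sketch models a 64-bit MMX register as an integer holding two single-precision values and emulates the element-wise behavior of PFADD. The helper names (pack_pair, unpack_pair, pfadd) are illustrative and not part of any AMD tool or library, and the model ignores details of the real hardware such as its handling of overflow and special values.

```python
import struct
from typing import Tuple

def pack_pair(lo: float, hi: float) -> int:
    """Pack two single-precision floats into one 64-bit integer (low element first)."""
    return int.from_bytes(struct.pack("<ff", lo, hi), "little")

def unpack_pair(reg: int) -> Tuple[float, float]:
    """Split a 64-bit integer back into its two single-precision elements."""
    return struct.unpack("<ff", reg.to_bytes(8, "little"))

def pfadd(dst: int, src: int) -> int:
    """Emulate PFADD: add corresponding elements; packing rounds each sum to single precision."""
    d0, d1 = unpack_pair(dst)
    s0, s1 = unpack_pair(src)
    return pack_pair(d0 + s0, d1 + s1)

# Hypothetical register contents chosen so that every value is exactly representable.
mm0 = pack_pair(1.5, 2.0)
mm1 = pack_pair(0.25, 4.0)
print(unpack_pair(pfadd(mm0, mm1)))  # (1.75, 6.0)
```

A single emulated instruction produces two sums at once, which is where the speedup over scalar x87 code comes from.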

3DNow floating-point instructions

  • PI2FD  Packed 32-bit integer to floating-point conversion
  • PF2ID  Packed floating-point to 32-bit integer conversion
  • PFCMPGE  Packed floating-point comparison, greater or equal
  • PFCMPGT  Packed floating-point comparison, greater
  • PFCMPEQ  Packed floating-point comparison, equal
  • PFACC  Packed floating-point accumulate
  • PFADD  Packed floating-point addition
  • PFSUB  Packed floating-point subtraction
  • PFSUBR  Packed floating-point reverse subtraction
  • PFMIN  Packed floating-point minimum
  • PFMAX  Packed floating-point maximum
  • PFMUL  Packed floating-point multiplication
  • PFRCP  Packed floating-point reciprocal approximation
  • PFRSQRT  Packed floating-point reciprocal square root approximation
  • PFRCPIT1  Packed floating-point reciprocal, first iteration step
  • PFRSQIT1  Packed floating-point reciprocal square root, first iteration step
  • PFRCPIT2  Packed floating-point reciprocal/reciprocal square root, second iteration step
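The reciprocal instructions in the list above pair a fast approximation with separate iteration steps. The mathematical idea behind refining an approximation this way is Newton-Raphson iteration, which the Python sketch below illustrates. It is not a bit-accurate model of PFRCP, PFRCPIT1 or PFRCPIT2: the function rough_reciprocal is a hypothetical stand-in that simply discards mantissa bits, and the real instructions divide the refinement work differently.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def rough_reciprocal(b: float) -> float:
    """Hypothetical low-precision estimate of 1/b (assumes b != 0).

    Clears the low 12 mantissa bits of the single-precision result,
    leaving roughly 11 bits of precision.
    """
    bits = struct.unpack("<I", struct.pack("<f", 1.0 / b))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFF000))[0]

def newton_step(b: float, x: float) -> float:
    """One Newton-Raphson step for f(x) = 1/x - b: x' = x * (2 - b * x)."""
    return to_f32(x * (2.0 - b * x))

b = 3.0
x0 = rough_reciprocal(b)
x1 = newton_step(b, x0)
print(abs(x0 * b - 1.0))  # relative error of the rough estimate
print(abs(x1 * b - 1.0))  # much smaller error after one refinement step
```

Each step roughly doubles the number of correct bits, so a cheap estimate plus a short refinement sequence can approach full single precision without a hardware divide.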

3DNow integer instructions

  • PAVGUSB  Packed 8-bit unsigned integer averaging
  • PMULHRW  Packed 16-bit integer multiply with rounding

3DNow performance-enhancement instructions

  • FEMMS  Faster entry/exit of the MMX or floating-point state
  • PREFETCH/PREFETCHW  Prefetch a cache line into the L1 data cache

3DNow extensions

There is little or no evidence that the second version of 3DNow was ever officially given its own trade name, which has led to some confusion in documentation that refers to this newer instruction set. The most common terms are Extended 3DNow, Enhanced 3DNow and 3DNow+. The phrase "Enhanced 3DNow" can be found in a few places on the AMD website, but the capitalization of "Enhanced" appears to be either purely grammatical or used for emphasis, including on pages about processors that may or may not have these extensions. The most notable example is a benchmark page for the K6-III-P, which does not have these extensions.[3][4]

This extension to the 3DNow instruction set was introduced with the first-generation Athlon processors. The Athlon added 5 new 3DNow instructions and 19 new MMX instructions. Later, the K6-2+ and K6-III+ (both targeted at the mobile market) included the 5 new 3DNow instructions but left out the 19 new MMX instructions. The new 3DNow instructions were added to boost digital signal processing (DSP) performance, and the new MMX instructions to boost streaming media performance.

3DNow or MMX extensions? The 19 new MMX instructions are a subset of Intel's SSE1 instruction set. In its technical manuals, AMD separates these instructions from the 3DNow extensions.[3] In AMD customer product literature, however, this separation is less clear, and the benefits of all 24 new instructions are credited to enhanced 3DNow technology.[5] This has led programmers to come up with their own names for the 19 new MMX instructions. The most common appears to be Integer SSE (ISSE).[6] SSEMMX and MMX2 are also found in video filter documentation from the public domain sector. ISSE can also refer to Internet SSE, an early name for SSE.

3DNow extension DSP instructions

  • PF2IW  Packed floating-point to integer word conversion with sign extend
  • PI2FW  Packed integer word to floating-point conversion
  • PFNACC  Packed floating-point negative accumulate
  • PFPNACC  Packed floating-point mixed positive-negative accumulate
  • PSWAPD  Packed swap doubleword

MMX extension instructions (Integer SSE)

  • MASKMOVQ  Streaming (cache bypass) store using byte mask
  • MOVNTQ  Streaming (cache bypass) store
  • PAVGB  Packed average of unsigned byte
  • PAVGW  Packed average of unsigned word
  • PMAXSW  Packed maximum signed word
  • PMAXUB  Packed maximum unsigned byte
  • PMINSW  Packed minimum signed word
  • PMINUB  Packed minimum unsigned byte
  • PMULHUW  Packed multiply high unsigned word
  • PSADBW  Packed sum of absolute byte differences
  • PSHUFW  Packed shuffle word
  • PEXTRW  Extract word into integer register
  • PINSRW  Insert word from integer register
  • PMOVMSKB  Move byte mask to integer register
  • PREFETCHNTA  Prefetch using the NTA reference
  • PREFETCHT0  Prefetch using the T0 reference
  • PREFETCHT1  Prefetch using the T1 reference
  • PREFETCHT2  Prefetch using the T2 reference
  • SFENCE  Store fence
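Two of the integer instructions listed above, PAVGB and PSADBW, are easy to describe in scalar terms. The Python sketch below models a 64-bit MMX register as eight unsigned bytes and emulates both operations. The function names mirror the mnemonics for readability only, and the input values are made up for illustration.

```python
def pavgb(a: bytes, b: bytes) -> bytes:
    """Emulate PAVGB: rounded average of each pair of unsigned bytes."""
    return bytes((x + y + 1) >> 1 for x, y in zip(a, b))

def psadbw(a: bytes, b: bytes) -> int:
    """Emulate PSADBW: sum of the absolute differences of eight byte pairs."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical pixel rows, eight bytes each.
row_a = bytes([0, 10, 20, 30, 40, 50, 60, 255])
row_b = bytes([1, 10, 25, 20, 40, 55, 50, 0])

print(list(pavgb(row_a, row_b)))  # [1, 10, 23, 25, 40, 53, 55, 128]
print(psadbw(row_a, row_b))       # 286
```

Operations of this kind are common in video processing, for example blending two frames or measuring how closely two blocks of pixels match during motion estimation.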

3DNow Professional

3DNow Professional is a trade name used to indicate processors that combine 3DNow technology with a complete SSE instruction set (such as SSE1, SSE2 or SSE3).[7] The Athlon XP was the first processor to carry the 3DNow Professional trade name, and was the first product in the Athlon family to support the complete SSE1 instruction set. Its total comprised 21 original 3DNow instructions, five 3DNow extension DSP instructions, 19 MMX extension instructions, and 52 additional SSE instructions for complete SSE1 compatibility.[8]

3DNow and the Geode GX/LX

The Geode GX and Geode LX added two new 3DNow instructions that are absent from all other processors.

3DNow Professional instructions unique to the Geode GX/LX

  • PFRSQRTV  Reciprocal square root approximation for a pair of 32-bit floats
  • PFRCPV  Reciprocal approximation for a pair of 32-bit floats

Advantages and disadvantages

One advantage of 3DNow is that it is possible to add or multiply the two numbers that are stored in the same register. With SSE, each number can be combined only with the number in the same position in another register. This capability, known as a horizontal operation in Intel terminology, was the major addition to the SSE3 instruction set.
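The difference between the two styles can be sketched in Python by modelling a register as a (low, high) pair of floats. The functions below emulate the element-wise PFMUL and the horizontal PFACC, and combine them into a two-element dot product. This is a simplified model under stated assumptions: it uses Python floats rather than single precision, and the helper names are illustrative only.

```python
from typing import Tuple

Reg = Tuple[float, float]  # (low element, high element)

def pfmul(dst: Reg, src: Reg) -> Reg:
    """Vertical operation: multiply elements in the same position of two registers."""
    return (dst[0] * src[0], dst[1] * src[1])

def pfacc(dst: Reg, src: Reg) -> Reg:
    """Horizontal operation: the low result sums both elements of dst,
    the high result sums both elements of src."""
    return (dst[0] + dst[1], src[0] + src[1])

def dot2(a: Reg, b: Reg) -> float:
    """Two-element dot product: one vertical multiply, then one horizontal add."""
    products = pfmul(a, b)
    return pfacc(products, products)[0]

print(dot2((1.0, 2.0), (3.0, 4.0)))  # 11.0
```

Without a horizontal add, the final summation would need extra shuffle or move instructions to bring the two products into matching positions first.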

A disadvantage of 3DNow is that 3DNow instructions and MMX instructions share the same register file, whereas SSE adds 8 new independent registers (XMM0 through XMM7).

Because the MMX/3DNow registers are shared with the standard x87 FPU, 3DNow instructions and x87 instructions cannot be executed simultaneously. However, because they are aliased to the x87 FPU registers, the 3DNow and MMX register states can be saved and restored by the traditional x87 F(N)SAVE and F(N)RSTOR instructions. This arrangement allowed operating systems to support 3DNow with no explicit modifications, whereas SSE required explicit operating system support to properly save and restore the new XMM registers (via the added FXSAVE and FXRSTOR instructions).

The FX* instructions are an upgrade to the older x87 save and restore instructions because they can save not only the SSE register state but also the x87 register state, and therefore the MMX and 3DNow registers as well.

On AMD Athlon XP and K8-based cores (e.g. Athlon 64), assembly programmers have noted that it is possible to combine 3DNow and SSE instructions to reduce register pressure, but in practice it is difficult to improve performance because the instructions execute on shared functional units.[9]

Processors supporting 3DNow

References

  1. "Effectively Utilizing 3DNow in Linux". Linux Journal. 1999-12-01. Retrieved 2010-10-03.
  2. "3DNow Instructions are Being Deprecated | AMD Developer Central". Blogs.amd.com. 2010-08-18. Archived from the original on 2010-10-24. Retrieved 2010-10-03.
  3. "AMD Extensions to the 3DNow and MMX Instruction Sets Manual" (PDF). Advanced Micro Devices, Inc. March 2000. Retrieved 2008-06-07.
  4. "Mobile AMD-K6-III-P Processor-Based Notebook: Ziff-Davis CPUmark 99". Retrieved 2008-06-07. Incorrect title on page: Mobile AMD-K6-III+ and Mobile AMD-K6-2+ Processors with Enchanced [sic] 3DNow! Technology
  5. "AMD Athlon Processor Product Brief". Advanced Micro Devices, Inc. Retrieved 2008-06-08.
  6. "ISSE". AviSynth. Retrieved 2008-06-08.
  7. "Explaining the new 3DNow Professional Technology". Advanced Micro Devices, Inc. Retrieved 2008-06-08.
  8. "AMD Athlon XP Architectural Features". Advanced Micro Devices, Inc. Retrieved 2008-06-08.
  9. "3DNow+ vs SSE on Athlon XP - comp.sys.ibm.pc.hardware.chips | Google Groups". Groups.google.com. Retrieved 2010-10-03.
