PCI Express

From Wikipedia, the free encyclopedia

PCI Express

Year Created: 2004
Created By: Intel

Number of Devices: 1 per slot
Style: Serial
Hotplugging? Yes
External? No

PCI Express, officially abbreviated PCIe (PCI-E is also common) and not to be confused with PCI-X, is an implementation of the PCI connection standard that retains existing PCI programming concepts but bases them on a completely different and much faster physical layer: a full-duplex, multi-lane, point-to-point serial communications protocol. PCI Express was formerly known as Arapaho or 3GIO, for 3rd Generation I/O.

PCIe transfers data at 250 MB/s per lane in each direction. With a maximum of 32 lanes, PCIe allows for a total combined transfer rate of 8 GB/s. To put these figures into perspective, a single lane has nearly twice the data rate of conventional PCI, a four-lane slot has a data rate comparable to the fastest version of PCI-X, and an eight-lane slot has a data rate comparable to the fastest version of AGP. The full-duplex, point-to-point nature of PCIe should further improve its advantage over PCI, particularly in systems with many devices.
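The per-link figures above follow directly from the 250 MB/s-per-lane rate. As an illustrative sketch (the function name and structure are this example's own, not from any specification):

```python
# First-generation PCIe bandwidth per link width, using the
# commonly quoted 250 MB/s-per-lane figure.
PER_LANE_MB_S = 250  # MB/s in each direction (PCIe 1.x)

def link_bandwidth_mb_s(lanes: int) -> int:
    """Aggregate one-direction bandwidth for an xN link."""
    # Only the lane counts defined by the specification.
    assert lanes in (1, 2, 4, 8, 12, 16, 32)
    return PER_LANE_MB_S * lanes

print(link_bandwidth_mb_s(1))   # 250 MB/s, nearly 2x conventional PCI
print(link_bandwidth_mb_s(32))  # 8000 MB/s = 8 GB/s maximum
```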


Overview

The PCIe physical layer consists of a network of serial interconnects, much like twisted-pair Ethernet. A single hub with many pins on the mainboard is used, allowing extensive switching and parallelism. This design was chosen because, as clock rates increase, synchronization of parallel connections is hindered by timing skew. PCIe is just one example of a general trend away from parallel buses toward serial interconnects. For other examples, see HyperTransport, Serial ATA, USB, SAS or FireWire.

PCIe is supported primarily by Intel, which started working on the standard as the Arapahoe project after pulling out of the InfiniBand system.

PCIe is intended to be used as a local interconnect only. As it is based on the existing PCI system, cards and systems can be converted to PCI Express by changing the physical layer only — existing systems could be adapted to PCI Express without any change in software. The increased bandwidth on PCI Express allows it to replace almost all existing internal buses, including AGP and PCI, and Intel envisions a single PCI Express controller talking to all external devices, as opposed to the northbridge/southbridge solution in current machines.

Hardware protocol summary

The PCIe link is built around dedicated, unidirectional pairs of serial (1-bit), point-to-point connections known as "lanes". This is in sharp contrast to PCI, a bus-based system in which all devices share the same bidirectional, 32-bit (or 64-bit) parallel bus.

PCI Express is a layered protocol, consisting of a Transaction Layer, a Data Link Layer, and a Physical Layer. The Physical Layer is further divided into a logical sublayer and an electrical sublayer. The logical sublayer is frequently further divided into a Physical Coding Sublayer (PCS) and a Media Access Control (MAC) sublayer (terms borrowed from the IEEE 802 model of networking protocol).

Physical Layer

At the electrical level, each lane utilizes two unidirectional low voltage differential signaling (LVDS) pairs at 2.5 Gbit/s. Transmit and receive are separate differential pairs, for a total of 4 data wires per lane.

PCI Express slots (from top to bottom: x4, x16, x1 and x16), compared to a traditional 32-bit PCI slot (bottom), as seen on DFI's LanParty nF4 Ultra-D
An XFX brand nVidia GeForce 6600GT PCI-Express video adapter card

A connection between any two PCIe devices is known as a "link" and is built up from a collection of one or more lanes. All devices must minimally support a single-lane (x1) link. Devices may optionally support wider links composed of 2, 4, 8, 12, 16, or 32 lanes. This allows for very good compatibility in two ways. A PCIe card will physically fit (and work correctly) in any slot that is at least as large as it is (e.g. an x1 card will work in an x4 or x16 slot), and a slot of a large physical size (e.g. x16) can be wired electrically with fewer lanes (e.g. x1 or x8), as long as it still provides the power and ground connections required by the larger physical slot size. In both cases, the PCIe link negotiates the highest mutually supported number of lanes. It is not, however, possible for a card to fit in a slot that is physically smaller than it (e.g. an x4 card cannot fit in a physically x1 slot, though it could operate in an x4 slot wired with only one lane).
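The negotiation described above can be sketched as picking the widest lane count both ends support. This is a simplified illustration of the outcome, not the actual link-training protocol; the function name and set-based model are this example's assumptions:

```python
def negotiate_link_width(card_widths, slot_widths):
    """Return the widest lane count supported by both ends,
    mirroring the result of PCIe link-width negotiation."""
    common = set(card_widths) & set(slot_widths)
    if not common:
        raise ValueError("no mutually supported width")
    return max(common)

# An x4 card in a physically x16 slot wired with only one lane:
# both ends support x1 (mandatory), so the link trains to x1.
print(negotiate_link_width({1, 2, 4}, {1}))        # -> 1
print(negotiate_link_width({1, 2, 4}, {1, 4, 16})) # -> 4
```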

PCIe sends all control messages, including interrupts, over the same links used for data. The serial protocol can never be blocked, so latency comparable to PCI (which has dedicated interrupt lines) can be maintained.

Data transmitted on multiple-lane links is interleaved, meaning that each successive byte is sent down successive lanes. The PCIe specification refers to this interleaving as "data striping." While requiring significant hardware complexity to synchronize (or deskew) the incoming striped data, striping can significantly increase the throughput of the link. Due to padding requirements, striping may not necessarily reduce the latency of small data packets on a link.
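The byte interleaving ("data striping") described above can be modeled directly: successive bytes go to successive lanes, and the receiver re-interleaves (deskews) the per-lane streams. This is an illustrative sketch of the data movement only, not of the hardware deskew logic:

```python
def stripe(data: bytes, lanes: int):
    """Distribute successive bytes across successive lanes,
    as on a multi-lane PCIe link ('data striping')."""
    return [data[i::lanes] for i in range(lanes)]

def deskew(striped):
    """Receiver side: re-interleave the per-lane byte streams."""
    lanes = len(striped)
    out = bytearray(sum(len(s) for s in striped))
    for i, s in enumerate(striped):
        out[i::lanes] = s
    return bytes(out)

payload = b"ABCDEFGH"
print(stripe(payload, 4))  # [b'AE', b'BF', b'CG', b'DH']
assert deskew(stripe(payload, 4)) == payload
```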

As with all high data rate serial transmission protocols, clocking information must be embedded in the signal. At the physical level, PCI Express utilizes the very common 8B/10B encoding scheme to ensure that strings of consecutive ones or consecutive zeros are limited in length, so that the receiver does not lose track of where the bit edges are. This coding scheme replaces 8 uncoded (payload) bits of data with 10 (encoded) bits of transmitted data, consuming 20% of the overall electrical bandwidth.

Some other protocols (such as SONET) use a different form of encoding known as "scrambling" to embed clock information into data streams. The PCI Express specification also defines a scrambling algorithm, but its form of scrambling is not to be confused with the scrambling included in SONET. Rather than embedding clock information, the scrambling in PCI Express is designed to prevent repeating data patterns in the transmitted data stream from causing RF emission peaks.

First-generation PCIe is constrained to a single signalling rate of 2.5 Gbit/s. The PCI Special Interest Group (the industry organization that maintains and develops the various PCI standards) plans future versions adding signalling rates of 5 and 10 Gbit/s.

Data Link Layer

The Data Link Layer implements sequencing of Transaction Layer Packets (TLPs) that are generated by the Transaction Layer, data protection via a 32-bit cyclic redundancy check code (CRC, known in this context as LCRC), and an acknowledgement protocol (ACK and NAK signaling). TLPs that pass an LCRC check and a sequence number check result in an acknowledgement, or ACK, while those that fail these checks result in a negative acknowledgement, or NAK. TLPs that result in a NAK, or timeouts that occur while waiting for an ACK, result in the TLPs being replayed from a special buffer in the transmit data path of the Data Link Layer. This guarantees delivery of TLPs in spite of electrical noise, barring any malfunction of the device or transmission medium.
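The ACK/NAK replay mechanism can be sketched as a buffer that holds each TLP until it is acknowledged. This is a deliberately simplified model (real sequence numbers are modular and ACK/NAK handling is more involved); the class and method names are this example's own:

```python
class ReplayBuffer:
    """Simplified sketch of the Data Link Layer replay buffer:
    TLPs are held until ACKed; a NAK (or an ACK timeout)
    replays every unacknowledged TLP in sequence order."""

    def __init__(self):
        self.pending = {}   # sequence number -> TLP
        self.next_seq = 0

    def transmit(self, tlp):
        seq = self.next_seq
        self.pending[seq] = tlp   # keep a copy until ACKed
        self.next_seq += 1
        return seq

    def ack(self, seq):
        # ACK is cumulative: everything up to `seq` is delivered.
        for s in list(self.pending):
            if s <= seq:
                del self.pending[s]

    def nak(self):
        # Replay all unacknowledged TLPs, oldest first.
        return [self.pending[s] for s in sorted(self.pending)]

buf = ReplayBuffer()
buf.transmit("TLP-A"); buf.transmit("TLP-B"); buf.transmit("TLP-C")
buf.ack(0)           # TLP-A delivered and released
print(buf.nak())     # ['TLP-B', 'TLP-C'] are replayed
```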

ACK and NAK signals are communicated via a low-level packet known as a data link layer packet, or DLLP. DLLPs are also used to communicate flow control information between the transaction layers of two connected devices, as well as some power management functions.

Transaction Layer

PCI Express implements split transactions (transactions with request and response separated by time), allowing the link to carry other traffic while the target device gathers data for the response.

PCI Express utilizes credit-based flow control. In this scheme, a device advertises an initial amount of credit for each of the receive buffers in its Transaction Layer. The device at the opposite end of the link, when sending transactions to this device, will count the number of credits consumed by each TLP from its account. The sending device may only transmit a TLP when doing so does not result in its consumed credit count exceeding its credit limit. When the receiving device finishes processing the TLP from its buffer, it signals a return of credits to the sending device, which then increases the credit limit by the restored amount. The credit counters are modular counters, and the comparison of consumed credits to credit limit requires modular arithmetic. The advantage of this scheme (compared to other methods such as wait states or handshake-based transfer protocols) is that the latency of credit return does not affect performance, provided that the credit limit is not encountered, an assumption that is generally met if each device is designed with adequate buffer sizes.
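The credit scheme above can be sketched as a gate on the sender's side. This is an illustrative model only: the 12-bit counter width, class name, and method names are assumptions of this example, not spec values:

```python
MOD = 1 << 12  # assumed counter width; real widths vary by credit type

class CreditGate:
    """Credit-based flow control sketch: the sender may transmit
    a TLP only while its consumed-credit count stays within the
    advertised limit, compared with modular arithmetic."""

    def __init__(self, initial_credits):
        self.consumed = 0
        self.limit = initial_credits

    def can_send(self, cost):
        # The modular distance (limit - consumed - cost) must be a
        # small non-negative value, i.e. credits remain available.
        return ((self.limit - self.consumed - cost) % MOD) < MOD // 2

    def send(self, cost):
        if not self.can_send(cost):
            return False            # stall until credits return
        self.consumed = (self.consumed + cost) % MOD
        return True

    def credits_returned(self, amount):
        # Receiver freed buffer space; raise the limit.
        self.limit = (self.limit + amount) % MOD

gate = CreditGate(initial_credits=4)
assert gate.send(3)        # 3 of 4 credits consumed
assert not gate.send(2)    # would exceed the limit: stall
gate.credits_returned(2)   # receiver drained its buffer
assert gate.send(2)        # now it fits
```

Note how returning credits only raises the limit; as the text says, the latency of that return does not stall the sender unless the limit is actually reached.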

First-generation PCIe is often quoted as supporting a data rate of 250 MB/s in each direction, per lane. This figure is calculated from the physical signalling rate (2.5 Gbaud) divided by the encoding overhead (10 bits per byte). A 16-lane (x16) PCIe card would then be theoretically capable of 250 × 16 = 4 GB/s in each direction. While this is correct in terms of data bytes, more meaningful calculations are based on the usable data payload rate, which depends on the profile of the traffic, itself a function of the high-level (software) application and intermediate protocol levels. Like other high-data-rate serial interconnect systems, PCIe has protocol and processing overhead due to its additional transfer robustness (CRC and acknowledgements). Long continuous unidirectional transfers (such as those typical of high-performance storage controllers) can approach more than 95% of PCIe's raw (lane) data rate. Such transfers also benefit the most from an increased number of lanes (x2, x4, etc.). But in more typical applications (such as a USB or Ethernet controller), the traffic profile consists of short data packets with frequent enforced acknowledgements. This type of traffic reduces the efficiency of the link, due to overhead from packet parsing and forced interrupts (either in the device's host interface or the PC's CPU). This loss of efficiency is not particular to PCIe.
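The effect of traffic profile on efficiency can be illustrated with a rough per-packet overhead model. The 20-byte overhead figure here is an assumption chosen for illustration, not an exact per-TLP value from the specification:

```python
# Rough payload efficiency of a PCIe 1.x lane versus TLP payload
# size. The fixed overhead (framing, sequence number, header,
# LCRC) is an assumed illustrative value.
OVERHEAD_BYTES = 20

def efficiency(payload_bytes: int) -> float:
    """Fraction of link bytes that carry payload for one TLP size."""
    return payload_bytes / (payload_bytes + OVERHEAD_BYTES)

lane_mb_s = 250  # post-8B/10B data rate per lane
for size in (16, 128, 4096):
    print(size, round(efficiency(size) * lane_mb_s, 1))
# Long transfers with large payloads approach the raw 250 MB/s,
# while short packets lose much of it to per-packet overhead.
```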

Form factors

  • Low height card
  • Mini Card: a replacement for the Mini PCI form factor (with x1 PCIe, USB 2.0 and SMBus buses on the connector)
  • ExpressCard: similar to the PCMCIA form factor (with x1 PCIe and USB 2.0; hot-pluggable)
  • XMC: similar to the CMC/PMC form factor (with x4 PCIe or Serial RapidI/O)
  • AdvancedTCA: a complement to CompactPCI for larger applications; supports serial based backplane topologies
  • AMC: a complement to the AdvancedTCA specification; supports processor and I/O modules on ATCA boards (x1,x2,x4 or x8 PCIe).
  • Cable Specification: the PCI SIG is continuing (as of September 13, 2006) to work on a cable specification. This would permit a PCIe card to be connected via a cable of varying length, in the tens-of-meters to low-hundreds-of-meters range (v0.3 specification, 2004), with the same connection bandwidth as if plugged into the processor mainboard. The cable specification is of particular use for disaggregated PCs and space-constrained devices such as laptops and blades, with other applications including interconnect and general I/O expansion.
  • Mobile PCI Express Module: an industry-standard format for laptops, created by NVIDIA.

Competing protocols

Several communications standards have emerged based on high bandwidth serial architectures. These include but are not limited to HyperTransport, InfiniBand, RapidIO, and StarFabric. There are industry proponents of each, and because significant funds have been invested in their development, each consortium tends to emphasize the advantages of its variant over others.

Essentially the differences are based on the tradeoffs between flexibility and extensibility vs. latency and overhead. An example of such a tradeoff is adding complex header information to a transmitted packet to allow for complex routing (PCI Express is not capable of this). This additional overhead reduces the effective bandwidth of the interface and complicates bus discovery and initialization software. Also making the system hot-pluggable requires that software track network topology changes. Examples of buses suited for this purpose are InfiniBand and StarFabric.

Another example is making the packets shorter to decrease latency (as is required if a bus is to be operated as a memory interface). Smaller packets mean that the packet headers consume a higher percentage of the packet, thus decreasing the effective bandwidth. Examples of bus protocols designed for this purpose are RapidIO and HyperTransport.

PCI Express falls somewhere in the middle, targeted by design as a system interconnect (local bus) rather than a device interconnect or routed network protocol. Additionally, its design goal of software transparency constrains the protocol and raises its latency somewhat.

Outlook

As of 2006, PCI Express appears to be well on its way to becoming the new backplane standard in personal computers. There are several explanations for this, but the principal reason is that it was designed to be completely transparent to software developers — an operating system designed for PCI can boot in a PCI Express system without any code modification. Other secondary reasons include its enhanced performance and strong brand recognition.

Almost all of the high end graphics cards being released today (2006) from ATi and NVIDIA use PCI Express. NVIDIA uses the high bandwidth data transfer of PCIe for its newly developed Scalable Link Interface (SLI) technology, which allows two graphics cards of the same chipset and model number to be run at the same time, allowing increased performance. ATi has also developed a dual-GPU system based on PCIe called CrossFire.

Most new Gigabit Ethernet chips and some 802.11 wireless chips also use PCI Express. Other hardware such as RAID controllers and network cards are also starting to make the switch.

ExpressCard is just starting to emerge on laptops. The problem is that many laptops have only one card slot, and it is difficult to give that up for a new ExpressCard slot. Desktops do not have this problem: they have multiple slots and can more easily support PCI Express and legacy PCI slots concurrently.

PCI Express 2.0

PCI Express version 2.0 is nearing completion and should be available by early 2007. PCIe 2.0 doubles the per-lane signalling rate to 5 Gbit/s but remains compatible with the current specification, PCIe 1.1. The PCI-SIG also said that PCIe 2.0 features improvements to the point-to-point data transfer protocol and its software architecture.[1]

PCI Express 2.0 will be first introduced by Intel in the "Intel 3-series" chipsets such as P35 and G33 that will be introduced in the third quarter of 2007.

References

  1. ^ PCI Express 2.0 final draft spec published

Preceding: ISA, PCI, PCI-X, AGP