CAS latency
CAS is an abbreviation for column address strobe, or sometimes column address select, both referring to the column of a physical memory location in an array (composed of rows and columns) of capacitors used in dynamic random access memory modules. CAS latency (CL) is therefore the time, in clock cycles, that elapses between the memory controller telling the memory module to access a particular column in the current row and the data from that column being read from the module's output pins.
Data is stored in individual memory cells, each uniquely identified by a memory bank, row, and column. To access DRAM, the memory controller first selects a bank, then a row (using the row address strobe, RAS), then a column (using the CAS), and finally requests the data from that physical location. The CAS latency is the number of clock cycles that elapse between the request for data being sent to the memory location and the data being transmitted from the module. The data is organized bitwise and is only assembled into bytes to match the processor interface; sometimes this happens on the chip, and sometimes on the memory module. Note that all modern memory delivers many bits per read: a single DDR read, for example, produces 64 bits of data. When discussing latencies, the time "between bits" refers to the time from the appearance of one group of bits to the appearance of the next group. The following example describes what happens at the bit level.
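As a rough illustration of this sequence, the following sketch (Python; the 133 MHz clock, row-to-column delay, and CAS latency figures are illustrative assumptions, not values from any particular module) tallies the delay before the first data appears, depending on whether the requested row is already open.

```python
# Rough model of an SDRAM read: if the requested row is not yet open it must
# first be activated, then the column read is issued and the module waits
# CAS-latency cycles before data appears. All figures below are illustrative.
CYCLE_NS = 7.5            # assumed clock period (133 MHz)
ROW_TO_COLUMN_CYCLES = 3  # assumed delay between row activation and column access
CAS_LATENCY_CYCLES = 3    # assumed CAS latency

def first_data_ns(row_already_open: bool) -> float:
    """Delay from the read request until the first data appears on the pins."""
    cycles = CAS_LATENCY_CYCLES
    if not row_already_open:
        cycles += ROW_TO_COLUMN_CYCLES
    return cycles * CYCLE_NS

print(first_data_ns(row_already_open=True))   # 22.5 ns: column access only
print(first_data_ns(row_already_open=False))  # 45.0 ns: row activation plus column access
```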
The lower the CAS latency (at the same clock speed), the less time it takes to fetch data from the module; in simple terms, a lower CAS latency is better. The latency affects memory operations such as fetching the next instruction for the CPU to execute and read, write, and compare operations. The larger the timing, the longer the CPU has to wait for the memory to respond to a request, so various techniques exist to work around this delay where possible, such as interleaving (spreading accesses across several memory banks) and caching (keeping temporarily needed data close to the processor and, sometimes, synchronizing it intelligently with the memory modules).
Comparing modules with different clock speeds is trickier. The CAS latency only specifies the delay before the first bit; the clock speed determines the time between subsequent bits. Thus, when reading bursts of data, a higher clock speed can be faster in practice even with a numerically worse CAS latency. For example, consider a 133 MHz CL3 device (7.5 ns per cycle, 3 cycles of request latency) versus a 100 MHz CL2 device (10.0 ns per cycle, 2 cycles of request latency). The first bit is available after 22.5 ns (7.5 ns × 3) on the CL3 device and after 20.0 ns (10.0 ns × 2) on the CL2 device, demonstrating the benefit of the lower CAS latency. However, when reading a burst of even 4 bits, the higher clock speed wins: 45.0 ns (7.5 × 3 for the latency + 7.5 × 3 for the bits after the first) versus 50.0 ns (10.0 × 2 for the latency + 10.0 × 3 for the bits after the first).
In other words, on the CL2 device running at 100 MHz, accessing the first bit costs 2 cycles, or 20 ns, and accessing the remaining three bits costs one cycle per bit, another 30 ns, giving a total of 50 ns. On the CL3 device running at 133 MHz, accessing the first bit costs 3 cycles, or 22.5 ns, and the remaining three bits cost another 22.5 ns, giving a total of 45 ns. The higher clock speed therefore has the greater effect in this situation.
Time to first bit
MHz | CL | ns/cycle | Total time (ns) |
---|---|---|---|
100 | 2 | 10 | 20 |
133 | 3 | 7.5 | 22.5 |
333 | 2.5 | 3 | 7.5 |
400 | 3 | 2.5 | 7.5 |
800 | 5 | 1.25 | 6.25 |
Time for 5 bits
MHz | CL | ns/cycle | Total time (ns) |
---|---|---|---|
100 | 2 | 10 | 60 |
133 | 3 | 7.5 | 52.5 |
333 | 2.5 | 3 | 19.5 |
400 | 3 | 2.5 | 17.5 |
800 | 5 | 1.25 | 11.25 |
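The example and the tables above all follow from the same arithmetic: the time to deliver a burst is (CAS latency + bits after the first) × cycle time, assuming one bit per clock cycle after the first. The short sketch below (purely illustrative) reproduces those figures.

```python
# Time to deliver a burst, assuming one bit per clock cycle after the first:
#   time_ns = (CAS latency + bits - 1) * clock period in ns
def burst_time_ns(cycle_ns: float, cas_latency: float, bits: int) -> float:
    return (cas_latency + (bits - 1)) * cycle_ns

# (MHz, CL, ns/cycle) rows from the tables above
rows = [(100, 2, 10.0), (133, 3, 7.5), (333, 2.5, 3.0), (400, 3, 2.5), (800, 5, 1.25)]

for bits in (1, 4, 5):
    print(f"Time for {bits} bit(s):")
    for mhz, cl, cycle_ns in rows:
        print(f"  {mhz} MHz CL{cl}: {burst_time_ns(cycle_ns, cl, bits):.2f} ns")
```

With bits = 4, the first two rows give the 45.0 ns and 50.0 ns of the earlier burst example; bits = 1 and bits = 5 reproduce the two tables.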
Note, however, that block copy and compare (search) operations usually operate on whole bytes rather than individual bits.
When a timing is specified for a particular CAS latency (e.g. CL3 = 5.0 ns per cycle, CL2.5 = 6.0 ns per cycle), it indicates the clock speed at which that CL is supported. In this example, the RAM could support CL3 at 200 MHz or CL2.5 at about 166 MHz. Most RAM supports multiple clock speeds, with varying performance, hence this notation.
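As an illustration, a 5.0 ns cycle time corresponds to 1 ÷ 5.0 ns = 200 MHz and a 6.0 ns cycle time to roughly 166 MHz; the minimal sketch below performs that conversion for the two ratings in the example above.

```python
# Convert a supported cycle time at a given CAS latency into a clock speed
# and an absolute time to first data. Figures are taken from the example above.
def clock_mhz(cycle_ns: float) -> float:
    return 1000.0 / cycle_ns  # a 5 ns cycle corresponds to 200 MHz

for cl, cycle_ns in [(3.0, 5.0), (2.5, 6.0)]:
    print(f"CL{cl} at {cycle_ns} ns/cycle -> {clock_mhz(cycle_ns):.1f} MHz, "
          f"first data after {cl * cycle_ns:.1f} ns")
```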
RAM with a higher rating can be installed into a system with a lower rating. For instance: 200 MHz rated RAM can be installed in a system with a 133 MHz memory bus; however, the RAM will subsequently run at 133 MHz. The ratings are best understood as speed limits, rather than running speeds. Therefore, installing faster RAM into a system will not necessarily result in a performance gain.
RAM Timing
RAM manufacturers typically list the recommended timings for their RAM as a series of four numbers separated by dashes (e.g. 2-2-2-6, 3-3-3-9, or 4-4-4-12). While there are many other settings related to RAM, these four numbers refer to the following settings, typically listed in this order: CL - TRCD - TRP - TRAS (an illustrative conversion of these cycle counts to nanoseconds follows the list).
CL = CAS latency: The time, in clock cycles, between a read command being sent to the memory and the start of the data being returned; in other words, the delay between the processor asking for data and the memory delivering it.
TRCD = DRAM RAS# to CAS# delay: The number of clock cycles between activation of the row address strobe (RAS) and the column address strobe (CAS). This parameter relates to the time it takes to reach stored data in a row that is not yet open.
TRP = DRAM RAS# precharge: The number of clock cycles between the 'precharge' command and the next 'active' command. The 'precharge' command closes the row that was being accessed, and the 'active' command opens a new row so that a new read/write cycle can begin.
TRAS = Active to precharge delay: The minimum number of clock cycles that must elapse between an 'active' command and the following 'precharge' command, i.e. how long a row must remain open. As a rule of thumb, it is at least the sum of CL and TRCD.
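Because each of these values is a count of clock cycles, the corresponding times depend on the clock speed. The sketch below converts a dash-separated rating into nanoseconds; the 200 MHz clock (5 ns per cycle) and the 3-3-3-9 rating are illustrative assumptions only.

```python
# Convert a "CL-TRCD-TRP-TRAS" rating from clock cycles to nanoseconds.
# The 200 MHz clock (5 ns per cycle) and the 3-3-3-9 rating are illustrative only.
CYCLE_NS = 5.0

def timings_ns(rating: str) -> dict:
    names = ("CL", "TRCD", "TRP", "TRAS")
    cycles = (float(x) for x in rating.split("-"))
    return {name: c * CYCLE_NS for name, c in zip(names, cycles)}

print(timings_ns("3-3-3-9"))
# {'CL': 15.0, 'TRCD': 15.0, 'TRP': 15.0, 'TRAS': 45.0}
```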
The BIOS on a PC may allow the user to make adjustments to RAM Timing in an effort to increase speed (with possible risk of decreased stability) or, in some cases, increase stability (by lowering the speed).
See dynamic random access memory, specifically the Synchronous Dynamic RAM (SDRAM) section.