Decoupled architecture

In computer architecture, a decoupled architecture is a processor with out-of-order execution: the instructions of the program it runs may be executed out of program order, as long as the end result is the same as that of in-order execution. It separates the fetch and decode stages from the execute stage of a pipelined processor by placing a buffer between them.

The buffer's purpose is to partition the memory access and execute functions of a computer program and to achieve high performance by exploiting the fine-grained parallelism between the two.[1] In doing so, it can hide much of the memory latency from the processor's perspective.
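The effect of the buffer can be illustrated with a small cycle-counting sketch. It is a toy model, not a description of any real processor: the function names, the single-issue access unit, the fixed memory latency and the fixed compute time per item are all assumptions made for illustration. An access unit issues loads ahead of time into a bounded queue, while an execute unit consumes values as they arrive.

```python
from collections import deque


def coupled_cycles(n_items, mem_latency, compute_cycles):
    """Baseline (assumed): every load stalls the processor, nothing overlaps."""
    return n_items * (mem_latency + compute_cycles)


def decoupled_cycles(n_items, mem_latency, compute_cycles, queue_depth):
    """Toy model of an access unit and an execute unit joined by a queue.

    Assumptions (hypothetical): the access unit issues at most one load
    per cycle, and only while outstanding plus queued values fit within
    queue_depth; every load takes mem_latency cycles; the execute unit
    needs compute_cycles per value.
    """
    in_flight = deque()  # arrival cycles of loads still in the memory system
    ready = 0            # values waiting in the queue
    issued = 0           # loads issued so far
    started = 0          # items the execute unit has started
    busy_until = 0       # cycle at which the execute unit becomes idle
    cycle = 0
    while started < n_items:
        # Loads whose latency has elapsed become available in the queue.
        while in_flight and in_flight[0] <= cycle:
            in_flight.popleft()
            ready += 1
        # Execute unit: start the next item if idle and a value is ready.
        if cycle >= busy_until and ready > 0:
            ready -= 1
            started += 1
            busy_until = cycle + compute_cycles
        # Access unit: run ahead while the buffer has room.
        if issued < n_items and len(in_flight) + ready < queue_depth:
            in_flight.append(cycle + mem_latency)
            issued += 1
        cycle += 1
    return busy_until  # cycle at which the last item finishes


if __name__ == "__main__":
    n, latency, compute = 100, 10, 2
    print("coupled:", coupled_cycles(n, latency, compute))
    for depth in (1, 4, 16):
        print("decoupled, depth", depth, ":",
              decoupled_cycles(n, latency, compute, depth))
```

In this model, a deeper queue lets the access unit run further ahead, so the execute unit waits for memory less often. The total time approaches one initial memory latency plus the compute time, which is the sense in which the latency is hidden.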

A larger buffer can, in theory, increase throughput. However, if the processor mispredicts a branch, the entire buffer may need to be flushed, wasting many clock cycles and reducing the buffer's effectiveness. Furthermore, larger buffers dissipate more heat and use more die space. For this reason, processor designers today favour a multi-threaded design approach.
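The trade-off between buffer size and flush cost can be made concrete with a back-of-the-envelope sketch. The model below is purely illustrative and rests on stated assumptions: the function name and parameters are hypothetical, the buffer is assumed to be full at every misprediction, and mispredictions are assumed to occur at a fixed interval.

```python
def flush_overhead(n_instructions, buffer_depth, mispredict_every):
    """Fraction of fetched instructions that are discarded by flushes.

    Assumptions (hypothetical): every `mispredict_every`-th instruction
    is a mispredicted branch, the buffer holds `buffer_depth` wrong-path
    instructions at that moment, and all of them are thrown away.
    """
    flushes = n_instructions // mispredict_every
    wasted = flushes * buffer_depth
    return wasted / (n_instructions + wasted)


if __name__ == "__main__":
    for depth in (8, 32, 128):
        print("depth", depth, "overhead",
              round(flush_overhead(10_000, depth, 100), 3))
```

Under these assumptions the wasted fraction grows with buffer depth, which is why a deeper buffer pays off only when branches are predicted well.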

Decoupled architectures are generally considered unsuitable for general-purpose computing because they do not handle control-intensive code well.[2] Control-intensive code includes constructs such as nested branches, which occur frequently in operating system kernels. Decoupled architectures play an important role in scheduling in very long instruction word (VLIW) architectures.[3]

References

  1. Smith, J.E. "Decoupled access/execute computer architectures". ACM Transactions on Computer Systems, Volume 2, Issue 4, November 1984, pp. 289-308.
  2. Kurian, L.; Hulina, P.T.; Coraor, L.D. "Memory latency effects in decoupled architectures". IEEE Transactions on Computers, Volume 43, Issue 10, October 1994, pp. 1129-1139.
  3. Dorojevets, M.N.; Oklobdzija, V. "Multithreaded decoupled architecture". International Journal of High Speed Computing, Volume 7, Issue 3, 1995, pp. 465-480.