Decoupled architecture
From Wikipedia, the free encyclopedia
In computer science, a decoupled architecture is a pipelined processor with out-of-order execution that uses a buffer to separate the fetch and decode stages from the execute stage.
The buffer's purpose is to partition the memory access and execute functions in a computer program and achieve high performance by exploiting the fine-grain parallelism between the two.[1] In doing so, it effectively hides memory latency from the processor's perspective.
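The latency-hiding effect can be illustrated with a toy cycle count. The following is a minimal sketch and is not taken from the cited papers: the function names, the fixed memory latency, the one-load-per-cycle issue rate, and the assumption that loads never depend on execute results are all hypothetical simplifications.

```python
from collections import deque


def coupled_cycles(n_ops, mem_latency, compute_cycles=1):
    """Coupled baseline: each operation waits for its own load, then computes."""
    return n_ops * (mem_latency + compute_cycles)


def decoupled_cycles(n_ops, mem_latency, queue_size, compute_cycles=1):
    """Toy decoupled model (assumed, simplified).

    The access unit issues at most one load per cycle and reserves a queue
    slot for it. The execute unit consumes operands in order, once the
    operand at the head of the queue has arrived.
    """
    if queue_size < 1:
        raise ValueError("queue_size must be at least 1")
    queue = deque()   # arrival cycle of each reserved slot
    issued = 0        # loads issued by the access unit
    started = 0       # operations started by the execute unit
    busy_until = 0    # cycle at which the execute unit is free again
    cycle = 0
    while started < n_ops:
        # Execute unit: start the next operation if it is free and the
        # operand at the head of the queue has arrived from memory.
        if queue and cycle >= busy_until and queue[0] <= cycle:
            queue.popleft()
            busy_until = cycle + compute_cycles
            started += 1
        # Access unit: run ahead while a queue slot is free.
        if issued < n_ops and len(queue) < queue_size:
            queue.append(cycle + mem_latency)
            issued += 1
        cycle += 1
    return max(cycle, busy_until)


if __name__ == "__main__":
    print("coupled:           ", coupled_cycles(100, 10))
    print("decoupled, queue 1:", decoupled_cycles(100, 10, queue_size=1))
    print("decoupled, queue 16:", decoupled_cycles(100, 10, queue_size=16))
```

In this model, once the queue is deep enough to cover the memory latency, the latency is paid only once at start-up instead of once per operation. A queue with a single slot gains almost nothing over the coupled baseline.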
A larger buffer can, in theory, increase throughput. However, if the processor mispredicts a branch, the entire buffer may need to be flushed, wasting many clock cycles and reducing the buffer's effectiveness. Furthermore, larger buffers generate more heat and use more die space. For this reason, processor designers today favour a multi-threaded design approach.
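The cost of a flush can be sketched in the same spirit. This is a hypothetical toy model rather than a description of any real processor: it assumes the front end fetches two entries per cycle, the back end executes one per cycle, and every `mispredict_every`-th executed instruction is a mispredicted branch that discards everything still buffered.

```python
def discarded_on_flush(buffer_size, n_executed, mispredict_every, fetch_per_cycle=2):
    """Count buffered entries thrown away by flushes (assumed toy model)."""
    if buffer_size < 1 or fetch_per_cycle < 1:
        raise ValueError("buffer_size and fetch_per_cycle must be at least 1")
    buffered = 0
    discarded = 0
    for executed in range(1, n_executed + 1):
        # Front end runs ahead until the buffer is full.
        buffered = min(buffer_size, buffered + fetch_per_cycle)
        # Back end executes one buffered entry per cycle.
        buffered -= 1
        # Assumed misprediction: everything still buffered is wrong-path work.
        if executed % mispredict_every == 0:
            discarded += buffered
            buffered = 0
    return discarded


if __name__ == "__main__":
    for size in (4, 8, 32):
        print(size, discarded_on_flush(size, n_executed=100, mispredict_every=10))
```

Under these assumptions, a deeper buffer lets the front end run further ahead between mispredictions, so each flush throws away more fetched work. The loss stops growing once the buffer is deeper than the distance the front end can cover between two mispredictions.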
Decoupled architectures are generally considered unsuitable for general-purpose computing because they do not handle control-intensive code well.[2] Control-intensive code includes constructs such as nested branches, which occur frequently in operating system kernels.
They do, however, play an important role in scheduling in VLIW architectures.
References
[1] Smith, J.E. "Decoupled access/execute computer architectures". ACM Transactions on Computer Systems, Volume 2, Issue 4, November 1984, Pages 289-308.
[2] Kurian, L.; Hulina, P.T.; Coraor, L.D. "Memory latency effects in decoupled architectures". IEEE Transactions on Computers, Volume 43, Issue 10, October 1994, Pages 1129-1139.
[3] Dorojevets, M.N.; Oklobdzija, V. "Multithreaded decoupled architecture". International Journal of High Speed Computing, Volume 7, Issue 3, 1995, Pages 465-480.