Computer architecture

In computer science and engineering, computer architecture is the practical art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals and the formal modelling of those systems.

The noun computer architecture or digital computer organization is a blueprint, a description of the requirements and basic design for the various parts of a computer. It is usually most concerned with how the central processing unit (CPU) acts and how it accesses computer memory. Some currently (2011) fashionable computer architectures include cluster computing and Non-Uniform Memory Access.

The art of computer architecture has three main subcategories:^[1]

Instruction set architecture, or ISA. The ISA is the code that a central processor reads and acts upon. It is the machine language (or assembly language), including the instruction set, word size, memory address modes, processor registers, and address and data formats.

Microarchitecture, also known as Computer organization describes the data paths, data processing elements and data storage elements, and describes how they should implement the ISA.^[2] The size of a computer's cache for instance, is an organizational issue that generally has nothing to do with the ISA.

System Design includes all of the other hardware components within a computing system. These include:

Data paths, such as computer buses and switches
Memory controllers and hierarchies
Data processing other than the CPU, such as direct memory access (DMA)
Miscellaneous issues such as virtualization or multiprocessing.

The second step of designing a new architecture is often to design a software simulator, and write representative programs in the ISA, to test and adjust the architectural elements. At this stage, it is now commonplace for compiler designers to collaborate, suggesting improvements in the ISA. Modern simulators normally measure time in clock cycles, and give energy use estimates in watts.

Once the instruction set and microarchitecture are described, a practical machine needs to be designed. This design process is called the implementation. Implementation is usually not considered architectural definition, but rather hardware design engineering.

Implementation can be further broken down into several (not fully distinct) steps:

Logic Implementation — design of blocks defined in the microarchitecture at (primarily) the register-transfer level and logic gate level.
Circuit Implementation — transistor-level design of basic elements (gates, multiplexers, latches etc.) as well as of some larger blocks (ALUs, caches etc.) that may be implemented at this level, or even (partly) at the physical level, for performance reasons.
Physical Implementation — physical circuits are drawn out, the different circuit components are placed in a chip floorplan or on a board and the wires connecting them are routed.
Design Validation — The computer as a whole is tested to see if it works in all situations and all timings. Once implementation starts, the first design validations are simulations using logic emulators. However, this is usually too slow to run realistic programs. So, after making corrections, next, prototypes are constructed using field-programmable gate-arrays FPGAs. Many hobby projects stop at this stage. The final step is to test prototype integrated circuits. Integrated circuits may have to be redesigned several times to fix problems.

For CPUs, the entire implementation process is often called CPU design.

1 History
2 Computer architectures
3 Computer architecture topics
4 See also
5 Notes
6 References
7 External links

History

The term “architecture” in computer literature can be traced to the work of Lyle R. Johnson, Muhammad Usman Khan and Frederick P. Brooks, Jr., members in 1959 of the Machine Organization department in IBM’s main research center.

Johnson had the opportunity to write a proprietary research communication about Stretch, an IBM-developed supercomputer for Los Alamos Scientific Laboratory. In attempting to characterize his chosen level of detail for discussing the luxuriously embellished computer, he noted that his description of formats, instruction types, hardware parameters, and speed enhancements was at the level of “system architecture” – a term that seemed more useful than “machine organization.”

Subsequently, Brooks, one of the Stretch designers, started Chapter 2 of a book (Planning a Computer System: Project Stretch, ed. W. Buchholz, 1962) by writing, “Computer architecture, like other architecture, is the art of determining the needs of the user of a structure and then designing to meet those needs as effectively as possible within economic and technological constraints.”

Brooks went on to play a major role in the development of the IBM System/360 (now called the IBM zSeries) line of computers, where “architecture” gained currency as a noun with the definition as “what the user needs to know”. Later the computer world would employ the term in many less-explicit ways.

Computer architectures

There are many types of computer architectures:

Quantum computer vs Chemical computer
Scalar processor vs Vector processor
Non-Uniform Memory Access (NUMA) computers
Register machine vs Stack machine
Harvard architecture vs von Neumann architecture
Cellular architecture

The quantum computer architecture holds the most promise to revolutionize computing.^[3]

Computer architecture topics

Sub-definitions

Some practitioners of computer architecture at companies such as Intel and AMD use more fine distinctions:

Macroarchitecture — architectural layers that are more abstract than microarchitecture, e.g. ISA
Instruction Set Architecture (ISA) — as defined above minus
Assembly ISA — a smart assembler may convert an abstract assembly language common to a group of machines into slightly different machine language for different implementations
Programmer Visible Macroarchitecture — higher level language tools such as compilers may define a consistent interface or contract to programmers using them, abstracting differences between underlying ISA, UISA, and microarchitectures. E.g. the C, C++, or Java standards define different Programmer Visible Macroarchitecture — although in practice the C microarchitecture for a particular computer includes
UISA (Microcode Instruction Set Architecture) — a family of machines with different hardware level microarchitectures may share a common microcode architecture, and hence a UISA.
Pin Architecture — the set of functions that a microprocessor is expected to provide, from the point of view of a hardware platform. E.g. the x86 A20M, FERR/IGNNE or FLUSH pins, and the messages that the processor is expected to emit after completing a cache invalidation so that external caches can be invalidated. Pin architecture functions are more flexible than ISA functions - external hardware can adapt to changing encodings, or changing from a pin to a message - but the functions are expected to be provided in successive implementations even if the manner of encoding them changes.

The role of computer architecture

Computer architecture: the definition

The coordination of abstract levels of a processor under changing forces, involving design, measurement and evaluation. It also includes the overall fundamental working principle of the internal logical structure of a computer system.

It can also be defined as the design of the task-performing part of computers, i.e. how various gates and transistors are interconnected and are caused to function per the instructions given by an assembly language programmer.

Instruction set architecture

Main article: Instruction set architecture

The ISA is the interface between the software and hardware.
It is the set of instructions that bridges the gap between high level languages and the hardware.
For a processor to understand a command, it should be in binary and not in High Level Language. The ISA encodes these values.
The ISA also defines the items in the computer that are available to a programmer. For example, it defines data types, registers, addressing modes, memory organization etc.
Register are high Addressing modes are the ways in which the instructions locate their operands.

Memory organization defines how instructions interact with the memory.

Computer organization

Main article: Microarchitecture

Computer organization helps optimize performance-based products. For example, software engineers need to know the processing ability of processors. They may need to optimize software in order to gain the most performance at the least expense. This can require quite detailed analysis of the computer organization. For example, in a multimedia decoder, the designers might need to arrange for most data to be processed in the fastest data path and the various components are assumed to be in place and task is to investigate the organisational structure to verify the computer parts operates.

Computer organization also helps plan the selection of a processor for a particular project. Multimedia projects may need very rapid data access, while supervisory software may need fast interrupts.

Sometimes certain tasks need additional components as well. For example, a computer capable of virtualization needs virtual memory hardware so that the memory of different simulated computers can be kept separated.

The computer organization and features also affect the power consumption and the cost of the processor.

Design goals

The exact form of a computer system depends on the constraints and goals for which it was optimized. Computer architectures usually trade off standards, cost, memory capacity, latency and throughput. Sometimes other considerations, such as features, size, weight, reliability, expandability and power consumption are factors as well.

The most common scheme carefully chooses the bottleneck that most reduces the computer's speed. Ideally, the cost is allocated proportionally to assure that the data rate is nearly the same for all parts of the computer, with the most costly part being the slowest. This is how skillful commercial integrators optimize personal computers.

Performance

Modern computer architectural performance is often described as MIPS per MHz (millions of instructions per second per millions of cycles per second of clock speed). This metric explicitly measures the efficiency of the architecture at any clock speed. Since a faster clock can make a faster computer, this is a useful, widely applicable measurement. Historic complex instruction set computers had MIPs/MHz as low as 0.1 (See instructions per second). Simple modern processors easily reach near 1. Superscalar processors may reach three to five by executing several instructions per clock cycle. Multicore and vector processing CPUs can multiply this further by acting on a lot of data per instruction, and have several CPUs executing in parallel. Counting machine language instructions would be misleading because they can do varying amounts of work in different ISAs. The "instruction" in the standard measurements is not a count of the ISA's actual machine language instructions, but a historical unit of measurement, usually based on the speed of the VAX computer architecture. Historically, many people measured the speed by the clock rate (usually in MHz or GHz). This refers to the cycles per second of the main clock of the CPU. However, this metric is somewhat misleading, as a machine with a higher clock rate may not necessarily have higher performance. As a result manufacturers have moved away from clock speed as a measure of performance.

Computer performance can also be measured with the amount of cache a processor has. If the speed, MHz or GHz, were to be a car then the cache is like the gas tank. No matter how fast the car goes, it will still need to get gas. The higher the speed, and the greater the cache, the faster a processor runs.

Other factors influence speed, such as the mix of functional units, bus speeds, available memory, and the type and order of instructions in the programs being run.

In a typical home computer, the simplest, most reliable way to speed performance is usually to add random access memory (RAM). More RAM increases the likelihood that needed data or a program will be in RAM. So, the system is less likely to need to move memory data from the disk. The disk is often ten thousand times slower than RAM because it has mechanical parts that must move to access its data.

There are two main types of speed, latency and throughput. Latency is the time between the start of a process and its completion. Throughput is the amount of work done per unit time. Interrupt latency is the guaranteed maximum response time of the system to an electronic event (e.g. when the disk drive finishes moving some data).

Performance is affected by a very wide range of design choices — for example, pipelining a processor usually makes latency worse (slower) but makes throughput better. Computers that control machinery usually need low interrupt latencies. These computers operate in a real-time environment and fail if an operation is not completed in a specified amount of time. For example, computer-controlled anti-lock brakes must begin braking within a predictable, short time after the brake pedal is sensed.

The performance of a computer can be measured using other metrics, depending upon its application domain. A system may be CPU bound (as in numerical calculation), I/O bound (as in a webserving application) or memory bound (as in video editing). Power consumption has become important in servers and portable devices like laptops.

Benchmarking tries to take all these factors into account by measuring the time a computer takes to run through a series of test programs. Although benchmarking shows strengths, it may not help one to choose a computer. Often the measured machines split on different measures. For example, one system might handle scientific applications quickly, while another might play popular video games more smoothly. Furthermore, designers have been known to add special features to their products, whether in hardware or software, which permit a specific benchmark to execute quickly but which do not offer similar advantages to other, more general tasks.

Power consumption

Main article: low-power electronics

Power consumption is another design criterion that factors in the design of modern computers. Power efficiency can often be traded for performance or cost benefits. the typical measurement in this case is MIPS/W (millions of instructions per watt).

With the increasing power density of modern circuits as the number of transistors per chip scales (Moore's law), power efficiency has increased in importance. Recent processor designs such as the Intel Core 2 put more emphasis on increasing power efficiency. Also, in the world of embedded computing, power efficiency has long been and remains an important goal next to throughput and latency.

Notes

John L. Hennessy and David Patterson (2006). Computer Architecture: A Quantitative Approach (Fourth Edition ed.). Morgan Kaufmann. ISBN 978-0-12-370490-0. http://www.elsevierdirect.com/product.jsp?isbn=9780123704900.
Barton, Robert S., "Functional Design of Computers", Communications of the ACM 4(9): 405 (1961).
Barton, Robert S., "A New Approach to the Functional Design of a Digital Computer", Proceedings of the Western Joint Computer Conference, May 1961, pp. 393–396. About the design of the Burroughs B5000 computer.
Bell, C. Gordon; and Newell, Allen (1971). "Computer Structures: Readings and Examples", McGraw-Hill.
Blaauw, G.A., and Brooks, F.P., Jr., "The Structure of System/360, Part I-Outline of the Logical Structure", IBM Systems Journal, vol. 3, no. 2, pp. 119–135, 1964.
Tanenbaum, Andrew S. (1979). Structured Computer Organization. Englewood Cliffs, New Jersey: Prentice-Hall. ISBN 0-13-148521-0.

References

^ John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach (Third Edition ed.). Morgan Kaufmann Publishers.
^ Laplante, Phillip A. (2001). Dictionary of Computer Science, Engineering, and Technology. CRC Press. pp. 94–95. ISBN 0849326915.
^ "Computer architecture: fundamentals and principles of computer design" by Joseph D. Dumas 2006. page 340.

External links

ISCA: Proceedings of the International Symposium on Computer Architecture
Micro: IEEE/ACM International Symposium on Microarchitecture
HPCA: International Symposium on High Performance Computer Architecture
ASPLOS: International Conference on Architectural Support for Programming Languages and Operating Systems
ACM Transactions on Computer Systems
ACM Transactions on Architecture and Code Optimization
IEEE Computer Society
http://www.cs.wisc.edu/~arch/www
http://www.codeproject.com/useritems/System_Design.asp - This approach allows beginners to easily break and design complex software systems.
The von Neumann Architecture of Computer Systems
A study guide for computer architects

Digital systems

Components	Logic gate Combinational logic Sequential logic Digital circuit Integrated circuit (IC)

Theory	Boolean logic Logic design Digital signal processing Computer architecture

Applications	Digital audio Digital photography Digital video Electronic literature