Speedup
In parallel computing, speedup refers to how much faster a parallel algorithm is than a corresponding sequential algorithm.
Definition
Speedup is defined by the following formula:

$$S_p = \frac{T_1}{T_p}$$

where:
- $p$ is the number of processors
- $T_1$ is the execution time of the sequential algorithm
- $T_p$ is the execution time of the parallel algorithm with $p$ processors
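As a worked illustration with assumed timings: if a job takes $T_1 = 80$ seconds sequentially and $T_4 = 25$ seconds on four processors, then $S_4 = \frac{80}{25} = 3.2$, somewhat short of the linear speedup of 4.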
Linear speedup or ideal speedup is obtained when $S_p = p$. When running an algorithm with linear speedup, doubling the number of processors doubles the speed. As this is ideal, it is considered very good scalability.
The speedup is called absolute speedup when $T_1$ is the execution time of the best sequential algorithm, and relative speedup when $T_1$ is the execution time of the same parallel algorithm on one processor. Relative speedup is usually implied if the type of speedup is not specified, because it does not require implementing a separate sequential algorithm.
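To illustrate the distinction with assumed timings: if the best sequential algorithm takes 60 seconds while the parallel algorithm takes 100 seconds on one processor and 20 seconds on eight, the absolute speedup is $60/20 = 3$ but the relative speedup is $100/20 = 5$.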
Efficiency is a performance metric defined as $E_p = \frac{S_p}{p} = \frac{T_1}{p T_p}$. It is a value, typically between zero and one, estimating how well-utilized the processors are in solving the problem, compared to how much effort is wasted in communication and synchronization. Algorithms with linear speedup and algorithms running on a single processor have an efficiency of 1, while many difficult-to-parallelize algorithms have efficiency such as $\frac{1}{\log p}$ that approaches zero as the number of processors increases.
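As a rough illustration of how these quantities might be measured, the sketch below times the same CPU-bound workload run serially and then on a pool of worker processes. It is a minimal sketch, not a rigorous benchmark: the task function, task sizes, and processor counts are assumptions chosen for illustration, and the measured values will vary from machine to machine.

    import math
    import time
    from multiprocessing import Pool

    def work(n):
        # Any CPU-bound function would do; this one just burns cycles.
        return sum(math.sqrt(i) for i in range(n))

    def measure(p, tasks):
        # Wall-clock time to process all tasks on p worker processes
        # (p == 1 runs the same work serially as the baseline).
        start = time.perf_counter()
        if p == 1:
            for n in tasks:
                work(n)
        else:
            with Pool(p) as pool:
                pool.map(work, tasks)
        return time.perf_counter() - start

    if __name__ == "__main__":
        tasks = [2_000_000] * 16
        t1 = measure(1, tasks)            # baseline time T1
        for p in (2, 4):
            tp = measure(p, tasks)        # parallel time Tp
            s = t1 / tp                   # speedup S_p = T1 / Tp
            e = s / p                     # efficiency E_p = S_p / p
            print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")

Because the baseline is the same code run serially, the printed figure is a relative speedup; replacing the serial loop with the best known sequential implementation would give an absolute speedup instead.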
Super-linear speedup
Sometimes a speedup of more than $p$ when using $p$ processors is observed in parallel computing; this is called super-linear speedup. Super-linear speedup rarely happens and often confuses beginners, who believe the theoretical maximum speedup should be $p$ when $p$ processors are used.
One possible reason for super-linear speedup is the cache effect that results from the memory hierarchy of modern computers: in parallel computing, not only does the number of processors change, but so does the total size of cache accumulated across the processors. With the larger accumulated cache size, more of the working set, or even all of it, can fit into caches, and memory access time drops dramatically, which causes extra speedup beyond that from the actual computation.
Super-linear speedups can also occur when performing backtracking in parallel: one thread can prune a branch of the exhaustive search that another thread would otherwise have taken.
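The sketch below illustrates this pruning mechanism on a toy branch-and-bound search. All numbers are made up for illustration, and CPython threads do not execute CPU-bound code in parallel, so the point is only to show how a bound found in one branch prunes another, not to demonstrate an actual speedup.

    import threading

    # Toy branch-and-bound: find the minimum-cost path through a small
    # decision tree. A bound found by one thread lets the other thread
    # skip subtrees it has not visited yet -- the mechanism behind
    # super-linear speedup in parallel backtracking.

    COSTS = [[3, 7], [2, 9], [4, 1], [8, 5]]  # illustrative step costs per level

    best = float("inf")   # best complete solution found so far (shared bound)
    lock = threading.Lock()

    def search(level, partial_cost):
        global best
        if partial_cost >= best:   # prune: this branch cannot beat the bound
            return                 # (unlocked read is a benign race in CPython)
        if level == len(COSTS):
            with lock:
                best = min(best, partial_cost)
            return
        for step in COSTS[level]:
            search(level + 1, partial_cost + step)

    # Each thread explores one branch of the root in parallel.
    threads = [threading.Thread(target=search, args=(1, c)) for c in COSTS[0]]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("minimum cost:", best)   # 3 + 2 + 1 + 5 = 11

When one thread quickly finds a good bound, the other threads expand fewer nodes than a sequential search would, so the total work can shrink as processors are added, which is how the speedup can exceed the processor count.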