Chapter 3 (continued)
II. Examples
- Slides
- Discussed boundary value problem: partitioning, agglomeration.
- Calculated parallel complexity of boundary value problem.
- Complexity analysis allows asymptotic comparison of two algorithms. It
points out differences that are independent of implementation details.
- Complexity analysis also allows predictions of performance as various
parameters scale. It can predict parallel performance:
  - ... as n grows, everything else held constant.
  - ... as p grows, everything else held constant.
  - ... as p and n grow, with n/p held constant.
All three cases may be of interest, but only the third one matters if
you want to predict "scalability".
- Define parallel speedup (execution time on one processor divided by
execution time on p processors) and efficiency (speedup divided by p).
- Variations on speedup: isoefficiency.
- Introduce reductions, with "find max" example.
- Parallel reduction done naively is slower than sequential reduction.
- Look at binomial trees. They enable reduction in log(p) steps.
- Formally defined, a binomial tree of order 0 has 1 node. A binomial
tree of order k has a root with k children. The children are the roots of
binomial trees of orders k-1, k-2, ..., 0. A binomial tree of order k
therefore has 2^k nodes.
- Binomial tree maps nicely to hypercube topology.
- Introduce N-body problem. N-body problem requires collective operations.
- Started discussion on gather and scatter, and implementations on full graphs
and hypercubes.
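
As a worked illustration of the speedup and efficiency definitions and of the third scaling case (p and n grow, n/p held constant), here is a minimal sketch. The cost model in `model_time` (n/p units of computation plus a fixed cost `lam` for each of ceil(log2 p) communication steps) and its parameter values are assumptions chosen for illustration, not the model derived in lecture.

```python
def speedup(t_one: float, t_p: float) -> float:
    """Execution time on one processor over execution time on p processors."""
    return t_one / t_p


def efficiency(t_one: float, t_p: float, p: int) -> float:
    """Speedup divided by the number of processors p."""
    return speedup(t_one, t_p) / p


def model_time(n: int, p: int, lam: float = 10.0) -> float:
    """Hypothetical cost model (an assumption, not from the notes):
    n/p units of computation plus lam per communication step,
    with ceil(log2(p)) communication steps."""
    comm_steps = (p - 1).bit_length()  # equals ceil(log2(p)) for p >= 1
    return n / p + lam * comm_steps


if __name__ == "__main__":
    # Third scaling case: p and n grow together with n/p held constant.
    work_per_proc = 1024
    for p in (1, 2, 4, 8, 16):
        n = work_per_proc * p
        e = efficiency(model_time(n, 1), model_time(n, p), p)
        print(f"p={p:2d} n={n:6d} efficiency={e:.3f}")
```

Under this assumed model, efficiency drops only slowly as p grows with n/p fixed, because the computation term stays constant and only the logarithmic communication term increases.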
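
The "find max" reduction in log(p) steps can be sketched as a sequential simulation of p processes, one value per process. The function name `tree_reduce_max` and the choice of rank 0 as the root are assumptions for illustration; a real implementation would exchange messages rather than read a shared list.

```python
def tree_reduce_max(values):
    """Simulate a binomial-tree max-reduction over len(values) processes.

    Returns (maximum, number_of_communication_steps).
    """
    vals = list(values)
    p = len(vals)
    if p == 0:
        raise ValueError("need at least one process")
    steps = 0
    stride = 1
    while stride < p:
        # In each step, every receiver combines its value with the value
        # held by the process `stride` ranks away; all pairs in one step
        # could proceed in parallel.
        for receiver in range(0, p, 2 * stride):
            sender = receiver + stride
            if sender < p:
                vals[receiver] = max(vals[receiver], vals[sender])
        stride *= 2
        steps += 1
    return vals[0], steps


if __name__ == "__main__":
    print(tree_reduce_max([3, 9, 2, 7, 5, 1, 8, 4]))
```

With p = 8, rank 0 receives from ranks 1, 2, and 4 in successive steps, which is the root of a binomial tree of order 3 collecting from its three children.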
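
To make the binomial tree definition and its mapping to the hypercube concrete, the sketch below labels the 2^k nodes with the integers 0 to 2^k - 1. The labeling convention (a node's children are obtained by setting one bit below its lowest set bit) and the helper names are assumptions for illustration; other labelings are possible.

```python
def binomial_children(rank: int, k: int) -> list:
    """Children of `rank` in a binomial tree of order k rooted at rank 0.

    Assumed labeling: a child sets one bit below the parent's lowest
    set bit; the root (rank 0) may set any of its k bits.
    """
    low = (rank & -rank) if rank else (1 << k)  # lowest set bit, 2**k for root
    return [rank | (1 << j) for j in range(k) if (1 << j) < low]


def subtree_size(rank: int, k: int) -> int:
    """Number of nodes in the subtree rooted at `rank`."""
    return 1 + sum(subtree_size(c, k) for c in binomial_children(rank, k))


def edges_are_hypercube_links(k: int) -> bool:
    """True if every parent-child pair differs in exactly one bit,
    i.e. every tree edge is also an edge of the k-dimensional hypercube."""
    return all(
        bin(parent ^ child).count("1") == 1
        for parent in range(1 << k)
        for child in binomial_children(parent, k)
    )


if __name__ == "__main__":
    k = 3
    for rank in range(1 << k):
        print(rank, binomial_children(rank, k))
    print(subtree_size(0, k), edges_are_hypercube_links(k))
```

For k = 3 the root has three children whose subtrees have 1, 2, and 4 nodes, matching binomial trees of orders 0, 1, and 2.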
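
One common way to implement gather on a hypercube is sketched below as a sequential simulation. This is not necessarily the variant presented in lecture; the function name `hypercube_gather`, the choice of rank 0 as the destination, and the power-of-two restriction are assumptions.

```python
def hypercube_gather(items):
    """Simulate a gather to rank 0 on a hypercube with len(items) nodes.

    Returns (gathered_list, steps). The node count must be a power of two.
    """
    p = len(items)
    if p == 0 or p & (p - 1):
        raise ValueError("number of nodes must be a power of two")
    # buffers[rank] maps original owner rank -> item currently held by rank.
    buffers = {rank: {rank: item} for rank, item in enumerate(items)}
    steps = 0
    bit = 1
    while bit < p:
        # Nodes whose lowest set bit is `bit` send everything they hold
        # to their neighbor across that dimension (same rank, bit cleared).
        for sender in range(bit, p, 2 * bit):
            receiver = sender ^ bit
            buffers[receiver].update(buffers.pop(sender))
        bit *= 2
        steps += 1
    return [buffers[0][rank] for rank in range(p)], steps


if __name__ == "__main__":
    print(hypercube_gather(list("abcdefgh")))
```

The gather finishes in log2(p) steps, but the amount of data forwarded doubles at each step. Running the same pattern in reverse gives a scatter from rank 0.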
CS 4234,
Dimitris Nikolopoulos,
latest update: