2D Heat Distribution
Assignment
Sample Heat Distribution
Report
Speedup Results
==================
Parameters: 10000 iterations (on a 128 by 128 grid)
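Speedups

====== ================ ============= =======
slaves total time (sec) barrier (sec) speedup
====== ================ ============= =======
0      86               n/a           n/a
2      177              91            0.49
4      147              65            0.59
8      140              56            0.61
16     165              65            0.52
====== ================ ============= =======

Note: The machine was shared with 42 other users, so the measured times are
distorted.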
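The speedup column can be recomputed from the measured times as serial time
divided by parallel time. The following Python sketch does this for the
timings listed above. The efficiency figure (speedup per slave) is an extra
quantity that is not part of the original measurements, and the function
names are made up for illustration.

```python
# Total solving times in seconds, copied from the speedup table.
SERIAL_TIME = 86
PARALLEL_TIMES = {2: 177, 4: 147, 8: 140, 16: 165}  # slaves -> total time


def speedup(serial_time, parallel_time):
    """Speedup = serial run time / parallel run time."""
    return serial_time / parallel_time


def summarize(serial_time, parallel_times):
    """Return (slaves, speedup, efficiency) tuples, sorted by slave count."""
    rows = []
    for slaves in sorted(parallel_times):
        s = speedup(serial_time, parallel_times[slaves])
        # Efficiency (speedup per slave) is an added, hypothetical metric.
        rows.append((slaves, round(s, 2), round(s / slaves, 3)))
    return rows


for slaves, s, eff in summarize(SERIAL_TIME, PARALLEL_TIMES):
    print(f"{slaves:2d} slaves: speedup {s:.2f}, efficiency {eff:.3f}")
```

A speedup below 1 means the parallel run took longer than the serial one,
which is the case for every row of the table.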
Why is this approach not optimal?
=================================
Simple answer: A maximum speedup of 0.61, which drops again to 0.52 at 16
slaves, is not desirable. Every parallel run is slower than the serial program.
Long answer:
First, a lot of data has to be copied to and from the slaves. This is done via
shared memory, which is not too bad. But since all slaves work on the same
data, it would be better to keep it in shared memory the whole time instead of
copying it. Alternatively, with PVM, each slave should be sent only the data
it requires.
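To illustrate what "only the required data" could mean: assuming the grid is
split row-wise into contiguous blocks and each point is updated from its four
direct neighbours (both are assumptions, the report does not state them), a
slave needs only its own rows plus one boundary row on each side. The helper
names below are hypothetical.

```python
def block_bounds(n_rows, n_slaves, rank):
    """Half-open row range [start, stop) owned by slave `rank`."""
    base, extra = divmod(n_rows, n_slaves)
    # The first `extra` slaves get one additional row each.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def rows_to_send(n_rows, n_slaves, rank):
    """Owned rows plus one neighbouring row per side, clipped to the grid."""
    start, stop = block_bounds(n_rows, n_slaves, rank)
    return max(start - 1, 0), min(stop + 1, n_rows)


N_ROWS = 128  # grid size from the report
for n_slaves in (2, 4, 8, 16):
    start, stop = rows_to_send(N_ROWS, n_slaves, rank=1)
    print(f"{n_slaves:2d} slaves: slave 1 needs rows {start}..{stop - 1} "
          f"({stop - start} of {N_ROWS})")
```

Under these assumptions, an interior slave in a 16-slave run would receive 10
of the 128 rows instead of the whole grid.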
A static decomposition is used. With a small number of slaves, the chunk for
each slave is quite large, so the slowest slave holds up all the others.
That is why the barrier waiting time in the master is so large. As the
number of slaves increases and the chunk size decreases, the results get
better (up to 8 slaves in the measurements above).
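The effect of the slowest slave can be seen in a small model: with a barrier
at the end of every iteration, the iteration takes as long as the slowest
slave, and every other slave idles for the difference. The per-slave times
below are invented for illustration, not measured.

```python
def iteration_time(slave_times):
    """With a barrier, one iteration lasts as long as the slowest slave."""
    return max(slave_times)


def idle_times(slave_times):
    """Time each slave spends waiting at the barrier for the slowest one."""
    slowest = max(slave_times)
    return [slowest - t for t in slave_times]


# Hypothetical per-iteration compute times (seconds) for four slaves;
# slave 2 is slowed down, e.g. by other users on the shared machine.
times = [1.0, 1.0, 1.5, 1.0]
print("iteration time:", iteration_time(times))
print("idle per slave:", idle_times(times))
```

In this made-up case, three of the four slaves wait half a second in every
iteration, and that waiting shows up as barrier time.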
With a large number of slaves, one would expect the total time to increase
again, because sending the whole grid to every slave and gathering the
results is expensive. If this is the case, the barrier time should stay
about the same.
Why this program is worth some extra points
===========================================
* Used a master/slave approach
* Implemented several different approaches

  - *broadcast* and single send of the current heat distribution
  - *gather* and single receive of the results

* MPEG movie of the heat spread

Included files
==============
Program source

heat0.tga,jpg
    Initial heat distribution (0 iterations)
heat10000.tga,jpg
    After 10000 iterations
heat40000.tga,jpg
    After 40000 iterations
heat50000.mpg
    Animation of 50000 iterations in 1001 frames

Test Results
============
Serial Program
--------------

::

    slaves=0
    t_inc=0.000100
    t_final=1.000000
    Total solving time: 86 sec

Parallel Programs
-----------------

::

    slaves=2
    t_inc=0.000100
    t_final=1.000000
    Actual number of slaves: 2
    Total solving time: 177 sec
    Total barrier time: 91 sec
    barrier/solve ratio: 51.665626%

    slaves=4
    t_inc=0.000100
    t_final=1.000000
    Actual number of slaves: 4
    Total solving time: 147 sec
    Total barrier time: 65 sec
    barrier/solve ratio: 44.759598%

    slaves=8
    t_inc=0.000100
    t_final=1.000000
    Actual number of slaves: 8
    Total solving time: 140 sec
    Total barrier time: 56 sec
    barrier/solve ratio: 40.050311%

    slaves=16
    t_inc=0.000100
    t_final=1.000000
    Actual number of slaves: 16
    Total solving time: 165 sec
    Total barrier time: 65 sec
    barrier/solve ratio: 39.503814%