Speedup Results
===============

Parameters: 10000 iterations on a 128 by 128 grid

Speedups

  slaves   total time (sec)   barrier (sec)   speedup
       0                 86               -         -
       2                177              91      0.49
       4                147              65      0.59
       8                140              56      0.61
      16                165              65      0.52

Notes: The machine was shared with 42 other people, so the measured times
are distorted.

Why is this approach not optimal?
=================================

Simple answer: A maximum speedup of 0.61, which even drops again at 16
slaves, is not desirable. Every parallel run is slower than the serial
program.

Long answer: First, a lot of data needs to be copied to and from the slaves.
This is done via shared memory, which is not too bad. But since all slaves
share the data, it would be best to keep it in shared memory all the time.
Or, with PVM, to send each slave only the data it requires.

A static decomposition approach is used. With a small number of slaves, the
chunk for each slave is quite large, so the slowest slave holds everyone
else up. That is why the barrier waiting time in the master is quite large.
As the number of slaves increases (and the chunk size decreases), the
results get better. With a large number of slaves, one would expect the
total time to increase again, because sending the whole data set and
gathering the results is expensive. If this is the case, the barrier time
should stay about the same.

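
The static decomposition argument above can be illustrated with a minimal
sketch. It assumes a row-wise split of the 128-row grid into contiguous
chunks and a made-up per-slave speed figure. Neither the split direction nor
the names `static_chunks` and `step_time` come from the program source; they
are hypothetical.

```python
def static_chunks(rows, slaves):
    """Split `rows` grid rows into `slaves` contiguous chunks.

    Assumption: the grid is decomposed row-wise; the report does not say.
    Returns a list of (begin, end) row ranges, end exclusive.
    """
    base, extra = divmod(rows, slaves)
    chunks = []
    start = 0
    for i in range(slaves):
        # The first `extra` slaves get one additional row each.
        size = base + (1 if i < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


def step_time(chunks, speeds):
    """Time for one iteration under a barrier: the slowest slave decides.

    `speeds` are hypothetical rows-per-time-unit figures, one per slave.
    """
    return max((end - begin) / speed
               for (begin, end), speed in zip(chunks, speeds))


if __name__ == "__main__":
    rows = 128
    for slaves in (2, 4, 8, 16):
        chunks = static_chunks(rows, slaves)
        # Hypothetical load: the first slave runs at half speed.
        speeds = [0.5] + [1.0] * (slaves - 1)
        print(slaves, chunks[0], step_time(chunks, speeds))
```

The sketch models computation only. The cost of copying the whole grid to
every slave and gathering the results, which the text expects to dominate
for many slaves, is deliberately left out.
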
Why this program is worth some extra points
===========================================

* Used master/slave approach
* Implemented several different approaches
  - *broadcast* and single send of the current heat distribution
  - *gather* and single receive of results
* MPEG movie of heat spread

Included files
==============

Program source
heat0.tga,jpg       Initial heat distribution (0 iterations)
heat10000.tga,jpg   After 10000 iterations
heat40000.tga,jpg   After 40000 iterations
heat50000.mpg       1001-frame animation of 50000 iterations

Test Results
============

Serial Program
--------------

slaves=0 t_inc=0.000100 t_final=1.000000
Total solving time: 86 sec

Parallel Programs
-----------------

slaves=2 t_inc=0.000100 t_final=1.000000
Actual number of slaves: 2
Total solving time: 177 sec
Total barrier time: 91 sec
barrier/solve ratio: 51.665626%

slaves=4 t_inc=0.000100 t_final=1.000000
Actual number of slaves: 4
Total solving time: 147 sec
Total barrier time: 65 sec
barrier/solve ratio: 44.759598%

slaves=8 t_inc=0.000100 t_final=1.000000
Actual number of slaves: 8
Total solving time: 140 sec
Total barrier time: 56 sec
barrier/solve ratio: 40.050311%

slaves=16 t_inc=0.000100 t_final=1.000000
Actual number of slaves: 16
Total solving time: 165 sec
Total barrier time: 65 sec
barrier/solve ratio: 39.503814%
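
The speedup figures reported for these runs can be recomputed from the
totals above as serial time divided by parallel time. The sketch below does
that and also recomputes the barrier/solve ratio from the whole seconds in
the log. Those ratios come out slightly lower than the logged percentages,
presumably because the program divided unrounded times (an assumption; the
log does not say). The function names are illustrative only.

```python
SERIAL_SECONDS = 86

# slaves -> (total solving time, total barrier time), whole seconds as logged
PARALLEL_RUNS = {
    2: (177, 91),
    4: (147, 65),
    8: (140, 56),
    16: (165, 65),
}


def speedup(serial, parallel):
    """Speedup = serial run time / parallel run time."""
    return serial / parallel


def barrier_ratio_percent(total, barrier):
    """Share of the total solving time spent waiting at the barrier."""
    return 100.0 * barrier / total


if __name__ == "__main__":
    for slaves, (total, barrier) in PARALLEL_RUNS.items():
        print(f"{slaves:2d} slaves: "
              f"speedup {speedup(SERIAL_SECONDS, total):.2f}, "
              f"barrier {barrier_ratio_percent(total, barrier):.1f}%")
```
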