
Timing behaviour of PVM

About

This is the result of programming assignment #2.

Exercise I

Creating a hostfile is simple. Just add the hosts you would like to include in your virtual machine to a file, one entry per line. For example:

sci-002.bgsu.edu
sci-003.bgsu.edu
sci-004.bgsu.edu
alpha.bgsu.edu

Then start PVM with pvm <filename>.

See the logfile of the exercise session. Note that sci-003 failed to start for some reason.

Exercise II

This exercise contains lots of tedious work. The first thing I did was to write some scripts to support it. First I renamed timing.c to timing.c.in.

tim_mod.pl
This script reads timing.c.in, modifies the test parameters and writes timing.c.
runtest.sh
This script runs a full test on a machine.
eval_log.pl
Most of the output of runtest.sh is useless; this script creates a summary of the useful facts.

Usage is simple:

  • Compile timing_slave on all hosts (only once).
  • Log into the master host and set up PVM (adding hosts, etc.).
  • Run runtest.sh <slave_host> > <logfile>
  • Run eval_log.pl <logfile> to get the summary.
  • Halt PVM.

Table I-A. Return trip times. Homogeneous Architecture A (SGI O2)

Master/Slave on same host
msg size   Raw     Default   InPlace
100        969     1068      1070
1000       1094    1050      1215
10000      2320    2671      1660
Logfile

Master/Slave on different hosts
msg size   Raw     Default   InPlace
100        2501    2513      2658
1000       3292    3308      3344
10000      14262   14263     14306
Logfile

2 users logged in

Table I-B. Return trip times. Homogeneous Architecture A (ALPHA)

Master/Slave on same host
msg size   Raw            Default   InPlace
100        2150           2541      2346
1000       2048           2097      2048
10000      (19502) 2926   3030      failed! (terminated)
Logfile

Master/Slave on different hosts
msg size   Raw     Default   InPlace
100
1000
10000

53 users logged in!!

Table I-C. Return trip times. Homogeneous Architecture A (SGI MP)

Master/Slave on same host
msg size   Raw     Default   InPlace
100        929     1261      1344
1000       1047    1017      1257
10000      1388    1691      1451
Logfile

Master/Slave on different hosts
msg size   Raw     Default   InPlace
100
1000
10000

6 users logged in

Table I-D. Return trip times. Heterogeneous Architecture SGI O2 - ALPHA

Master/Slave on same host
msg size   Raw     Default   InPlace
100
1000
10000

Master/Slave on different hosts
msg size   Raw     Default   InPlace
100        2956    2873      2960
1000       4748    4710      12231
10000      31384   22473     19458
Logfile

Table II. Return trip times - Short message

Master/Slave on same host
Arch         Time         (w / w/o first sample)
SGI O2       1686 uSec    (1782/993)
SGI MP       1822 uSec    (1783/957)
ALPHA MP     3798 uSec    (3861/2452)

Master/Slave on different hosts
Archs        Time         (w / w/o first sample)
O2 - O2      4635 uSec    (16095/2478)
O2 - Sigma   3257 uSec    (3261/2470)
O2 - Alpha   5736 uSec    (6860/2934)

Data contained in Table I logfiles.

Time
Average over all tests which give this time, with first sample.
with first sample
Result of first test with first sample.
without first sample
Result of first test without first sample.

Table III. Data Packing times SGI O2

Master/Slave on same host
msg size   Raw     Default   InPlace
100        43      45        26
1000       49      134       29
10000      212     264       35

Master/Slave on different hosts
msg size   Raw     Default   InPlace
100
1000
10000

Data contained in Table I logfiles.

My observations

First, look at Table I-B, Master/Slave on same host, 10000 Byte message size. The Raw data test result seemed strange (value in parentheses), so I redid the test. The InPlace test failed because the process was terminated. Running the program under gdb gives:

Program received signal SIGTERM, Terminated.
main (argc=1, argv=0x11ffffae8) at timing.c:118
118         if (pvm_recv (-1, -1) < 0) {

If you look at one of the logfiles, you will see that the send time for the first message is much higher than for the others. As mentioned in class, this is because the master starts sending before the slave has been spawned. This is worth mentioning because it affects the test results. For example, take the SGI O2, homogeneous architecture, same host, short message round trip time: the resulting average is 1782 uSec with the first sample and 993 uSec without it!

Getting realistic results is very difficult because other users are working on the systems, in particular on Alpha, so there is a large spread in the measured times.

Exercise III

Changes

I changed psum.c and spsum.c, and wrote the wrappers uni_psum.c and uni_spsum.c to fit the new source in the compile scheme.

Changes in psum.c

The following will, depending on nproc, spawn the slaves as required by the assignment. SGI6 is the architecture of the SGI O2s, ALPHAMP is Alpha and SGIMP64 is Sigma.

  numt = 0;   /* cases 6, 4 and 2 accumulate with += */
  switch(nproc) {
  case 10:
    /* no architecture constraint: let PVM place all slaves */
    numt = pvm_spawn(SLAVENAME, (char**)0, 0, "", nproc, tids);
    break;
  case 6:
    numt += pvm_spawn(SLAVENAME, (char**)0, PvmTaskArch, "SGIMP64", 2, tids+4);
    /* fall through */
  case 4:
    numt += pvm_spawn(SLAVENAME, (char**)0, PvmTaskArch, "ALPHAMP", 2, tids+2);
    /* fall through */
  case 2:
    numt += pvm_spawn(SLAVENAME, (char**)0, PvmTaskArch, "SGI6", 2, tids);
    break;
  default:
    printf("Requested number of slaves not implemented\n");
  }

This sends each slave its portion to sum up.

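#ifdef UNICAST
  printf("Using unicast send\n");
  for (i=0; i<nproc; i++) {
    /* chunk size is rounded up so that no element is left out */
    low = i *((n/nproc)+1);
    if (low > n) {
      low = n;            /* no data left for this slave */
    }
    high = low +((n/nproc)+1);
    if (high > n) {
      high = n;
    }
    send_len = high-low;

    /* message layout: slave index, chunk length, chunk data */
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&i, 1, 1);
    pvm_pkint(&send_len, 1, 1);
    pvm_pkint(data+low, send_len, 1);
    pvm_send(tids[i], 0);
  }
#else
  ... broadcast code ...
#endif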

Note that the chunk size is calculated differently from the original source. The original would miss some numbers if n is not divisible by nproc.

Changes in spsum.c

Receive the data and sum it up.

#ifdef UNICAST
  printf("Using unicast send\n");
  /* unpack in the order the master packed: index, length, data */
  pvm_upkint(&me, 1, 1);
  pvm_upkint(&n, 1, 1);
  pvm_upkint(data, n, 1);

  /* calculate sum */
  result = 0;
  for (i=0; i<n; i++) {
    result += data[i];
  }
#else
  ... broadcast code ...
#endif

Table IV. Timing values for psum, spsum application.

number of slaves   Multicast     Individual sends
2                  15184 uSec    16536 uSec
4                  31450 uSec    29488 uSec
6                  35478 uSec    28167 uSec

Questions

  1. What is the purpose of the hostfile?
  2. What conclusions can you draw from the data you gathered in exercise 2?
  3. What conclusions can you draw from the data you gathered in exercise 3?
  4. What issues do you think need to be taken into consideration in analyzing the above data sets?

Answers

Q1

The purpose of the hostfile is to configure a virtual machine. In its simplest form it just lists the member hosts of the PVM; those are added on startup. Options that configure the hosts, e.g. the startup or working directory, may also be supplied.

It may also be used to specify options for hosts without adding them. Those options will be used if the host is added later.

Q2

Interpreting the data is very difficult because of the huge spread. In general, for large data sets Raw or even InPlace should be used if possible. For small amounts of data it makes almost no difference.

Starting processes is slow, so spawn them as early as possible and give them enough work.

Q3

Using more slaves only slows the computation down, because most of what the system does is spawn processes and send messages, both very costly operations!

The results in Table IV suggest that there is not much difference between a multicast and several unicasts. This is not surprising, because the amount of data to transmit stays almost the same.

Q4

These tests were run on a shared network, i.e. with other network traffic and busy computers. The results are therefore distorted.

At first I trusted the short message times; after we discussed the topic in class I took a second look at the numbers (see Table II). My conclusion: make sure you know what you are measuring.