MatPthread.doc Class hand-out
MatPthread.c [.txt file] Code for the problem being discussed. It should open in a new browser window.
MatPthread.xls Spreadsheet with run results
Discussion of segments
Initial declarations:
// POSIX thread stuff
#include <pthread.h>
// Used in LaunchThreads()
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
// Shared memory area
typedef struct
{
double *A, *B, *C;
int NextRow, N1, N2, N3;
} *ProbPtr;
// Each POSIX thread runs this function
void *run (void* arg);
// after being created by this one
void LaunchThreads (int Nproc, ProbPtr Job);
The main program invokes LaunchThreads to accomplish the parallel processing. First, though, it needs to allocate the shared memory area that the threads will use for communication. Note that the parallel code segment begins and ends with calls that capture the system clock as wall-clock time, not processor time.
ProbPtr Prob = (ProbPtr) calloc (1, sizeof *Prob);
. . .
// Set up the problem structure
Prob->A = A;
Prob->B = B;
Prob->C = C1;
Prob->N1 = N1;
Prob->N2 = N2;
Prob->N3 = N3;
. . .
getTimes ( &Mid1, &dmy );
srand(Seed);
for ( run = 0; run < Nruns; run++ )
{ RandFill (A, N1*N2);
RandFill (B, N2*N3);
LaunchThreads (nSlaves, Prob);
}
#ifdef DEBUG
puts ("Finished with parallel."); fflush(stdout);
#endif
getTimes ( &Finish, &dmy );
Note that the run function receives a generic pointer (void*) as its parameter and returns one as its value. These can be cast to the appropriate struct pointer to access the data. The LaunchThreads function passes each thread the pointer to a struct that provides access to the necessary arrays as well as other required data. If thread creation succeeds, the function then waits for all threads to terminate before returning to the main program.
void LaunchThreads (int Nproc, ProbPtr Job)
{ int proc;
pthread_t *thread_id = NULL; // Will be used by pthread_join
thread_id = (pthread_t*) calloc( Nproc, sizeof *thread_id );
Job->NextRow = 0; // New batch
for ( proc = 0; proc < Nproc; proc++ )
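{ int rc = pthread_create ( &thread_id[proc], NULL, run, (void*) Job );
if ( rc != 0 ) // pthread_create returns the error number; it does not set errno
{ fprintf (stderr, "Thread creation: %s\n", strerror(rc)); exit(-1); } // strerror needs <string.h>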
#ifdef DEBUG
printf ("Creation of thread %d succeeded.\n", proc); fflush(stdout);
#endif
}
// Wait for termination of all threads before exiting
for ( proc = 0; proc < Nproc; proc++ )
{ pthread_join ( thread_id[proc], NULL);
#ifdef DEBUG
printf ("Join to thread %d succeeded.\n", proc); fflush(stdout);
#endif
}
free (thread_id); // Release the array of thread handles
}
All threads share the same Job structure in the run function. It contains the current state of the calculation (NextRow, the next row to be handed out). Consequently, access to it needs to be protected by a mutex, a lock for "MUTual EXclusion": the global variable pthread_mutex_t mutex1.
void* run ( void *arg )
{ int Row;
ProbPtr Job = (ProbPtr) arg; // Cast over to a Problem Pointer
double *A = Job->A, *B = Job->B, *C = Job->C;
int N1 = Job->N1, N2 = Job->N2, N3 = Job->N3;
while ( 1 ) // Will break out when all rows are done
{//ONE AT A TIME, get the next row to be processed.
pthread_mutex_lock( &mutex1 );
Row = Job->NextRow++;
pthread_mutex_unlock( &mutex1 );
if ( Row >= N1 )
break;
#ifdef DEBUG
printf ("Thread computing row %d of %d\n", Row, N1); fflush(stdout);
#endif
MatMult( A + Row*N2, B, C + Row*N3, 1, N2, N3 );
}
#ifdef DEBUG
puts ("Thread completing"); fflush(stdout);
#endif
return NULL; // We must return SOME kind of pointer
}
One can also examine page thrashing. Although the matrix multiplication code is written to maximize localized referencing of memory, that locality can be defeated simply by changing the threads so that they compute the result one column at a time rather than one row at a time.
MatPthreadCol.c [.txt file] Code to compute in a column-wise fashion.
