CS 4234 Homework 4

Due December 4, 2006

  1. Write a blocked version of matrix-matrix multiplication in OpenMP on ojibwa. Use a recursive block partitioning scheme, following the guidelines in the book. Test your code with a single input matrix size of 1024 by 1024. Use square blocks with minimum block sizes of 8 by 8, 16 by 16 and 32 by 32. For each block size, plot the performance of the code in Mflops (millions of floating point operations per second) against the number of processors, using 1, 2, 4 and 8 processors. Assume that the total number of floating point operations is known a priori to be the cube of the matrix dimension. Write down your observations and attempt to explain the results.

    To compile OpenMP code on ojibwa, use the Intel C compiler, invoked with the -openmp flag: icc -openmp file.c. To control the number of threads (and processors) that execute your code, set the OMP_NUM_THREADS environment variable before running it: export OMP_NUM_THREADS=N in sh or bash, or setenv OMP_NUM_THREADS N in csh or tcsh.
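
The recursive partitioning and the Mflops calculation described above might be sketched as follows. This is only one possible structure, not the book's scheme. All function names (base_multiply, rec_multiply, blocked_multiply, wall_seconds, mflops, thread_count) are hypothetical. The sketch assumes row-major storage, a matrix size that is the minimum block size times a power of two, and a compiler that supports OpenMP tasks (OpenMP 3.0 or later), which the compiler on ojibwa may not provide. Without OpenMP enabled, the pragmas are ignored and the code runs serially.

```c
#include <assert.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Base case: C += A * B on one bs x bs block; ld is the row stride. */
void base_multiply(const double *A, const double *B, double *C, int bs, int ld)
{
    for (int i = 0; i < bs; i++)
        for (int k = 0; k < bs; k++) {
            double a = A[i * ld + k];
            for (int j = 0; j < bs; j++)
                C[i * ld + j] += a * B[k * ld + j];
        }
}

/* Recursive step: split the n x n operands into quadrants until n <= min_bs. */
void rec_multiply(const double *A, const double *B, double *C,
                  int n, int ld, int min_bs)
{
    if (n <= min_bs) {
        base_multiply(A, B, C, n, ld);
        return;
    }
    int h = n / 2;
    const double *A11 = A, *A12 = A + h, *A21 = A + h * ld, *A22 = A21 + h;
    const double *B11 = B, *B12 = B + h, *B21 = B + h * ld, *B22 = B21 + h;
    double *C11 = C, *C12 = C + h, *C21 = C + h * ld, *C22 = C21 + h;

    /* Phase 1: the four quadrants of C are disjoint, so these may run concurrently. */
    #pragma omp task
    rec_multiply(A11, B11, C11, h, ld, min_bs);
    #pragma omp task
    rec_multiply(A11, B12, C12, h, ld, min_bs);
    #pragma omp task
    rec_multiply(A21, B11, C21, h, ld, min_bs);
    #pragma omp task
    rec_multiply(A21, B12, C22, h, ld, min_bs);
    #pragma omp taskwait

    /* Phase 2: the second product for each quadrant, once phase 1 has finished. */
    #pragma omp task
    rec_multiply(A12, B21, C11, h, ld, min_bs);
    #pragma omp task
    rec_multiply(A12, B22, C12, h, ld, min_bs);
    #pragma omp task
    rec_multiply(A22, B21, C21, h, ld, min_bs);
    #pragma omp task
    rec_multiply(A22, B22, C22, h, ld, min_bs);
    #pragma omp taskwait
}

/* Entry point: computes C += A * B, so the caller must zero C beforehand. */
void blocked_multiply(const double *A, const double *B, double *C,
                      int n, int min_bs)
{
    /* Assumed precondition: n is min_bs times a power of two. */
    assert(n > 0 && min_bs > 0);
    for (int m = n; m > min_bs; m /= 2)
        assert(m % 2 == 0);

    /* One thread starts the recursion; the team executes the generated tasks. */
    #pragma omp parallel
    #pragma omp single
    rec_multiply(A, B, C, n, n, min_bs);
}

/* Wall-clock time in seconds; falls back to CPU time without OpenMP. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Mflops using the assignment's stated operation count of n cubed. */
double mflops(int n, double seconds)
{
    return ((double)n * n * n) / (seconds * 1.0e6);
}

/* Upper bound on the number of threads a parallel region would use,
   normally the value of OMP_NUM_THREADS when it is set. */
int thread_count(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

A driver would zero C, call wall_seconds() immediately before and after blocked_multiply(A, B, C, 1024, bs) for each minimum block size bs, and pass the elapsed time to mflops() to obtain one point of the plot.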


CS 4234, Dimitris Nikolopoulos, Latest update: