Project 2 - OpenMP Application
Introduction
The primary purpose of this assignment is to gain experience with multicore parallelism using the OpenMP library. The secondary purpose is to gain experience parallelizing dense matrix operations.
Description
You will develop an optimized, parallel version of the following serial
program: gauss.c
Your parallel version
should have the filename par_gauss.c
.
The program generates a random system of linear equations in the form of a coefficient matrix (A) and a right-hand-side column vector (b). The size of the system is given as a command-line parameter to the program. It then solves the system using Gaussian elimination and backwards substitution. The system is generated in such a way the solution (x) will be a column vector with all values equal to 1.0, making it easy to check the answers. This assignment is similar to programming assignments 5.4 and 5.5 in your textbook.
The serial code contains several functions as well as a few preprocessor directives to toggle program features on/off. For this assignment, you must efficiently parallelize (using OpenMP) all functions that contribute significantly to the running time of the program. You should not use Pthreads, semaphores, or any other concurrency/parallelization system. You may perform minor re-writes within functions in order to be able to more effectively parallelize them, but you should avoid any rewrites that cut across multiple functions or that fundamentally change the nature of the function.
In your code documentation, you should include a discussion of your changes and why you made them. This assignment is far more open-ended than the first project, and the amount of code you actually write could be rather small in the end. Thus, you must include documentation explaining your experiments and the reasoning behind your final submission. You should include some performance results and an argument as to why your version is the optimal way to parallelize the original code. IMPORTANT: "I tried it and it didn't run faster" is not on its own a sufficient explanation! If you were unable to parallelize a particular function, explain what you tried and discuss reasons why it may not have worked well. If possible, suggest ways to improve the code to increase parallelism.
For your final submission, DEBUG
and
GENERATE_TRIANGULAR
must be disabled, and REAL
must be set to "double
"!
Use the DEBUG
directive to toggle debug output. This output
will likely be unhelpful for matrices larger than 10x10. Use the
GENERATE_TRIANGULAR
directive to generate the coefficient matrix
already in upper-triangular form, removing the need to do Gaussian elimination.
Use the USE_COLUMN_BACKSUB
directive to enable the alternative back
substitution method (column-oriented vs. the standard row-oriented). Finally,
you can change the REAL
macro if you wish to experiment with using
single precision instead of double precision.
You should use omp_get_wtime
instead of clock()
to
time your OpenMP version.
You should use compiler directives to disable all OpenMP code if the
compiler does not support it. (HINT: check for "_OPENMP
") I suggest
creating a Makefile that builds three different programs: 1) the original serial
version, 2) the new parallel version, and 3) the new parallel version with
OpenMP disabled.
Your program should respect the OMP_NUM_THREADS
environment
variable; there is no need to accept the number of threads as a command-line
parameter.
You may use the cluster to do development and testing. However, please be considerate and do not run your tests on the login node directly. I suggest writing a test script that runs your program using various thread counts, then converting this script into a SLURM job script so that it can run on the compute nodes.
If you would like to test with some custom (non-randomized) matrices, you
can pass a file to the program instead of a system size. I have supplied a
function (read_system
) that handles the file I/O. The files should
contain a line at the top with a single integer n (the size of the system).
Subsequent lines should have rows of the augmented matrix [A][b].
For example, here is the representation of this system (taken from the Wikipedia article):
3 3.0 2.0 -1.0 1.0 2.0 -2.0 4.0 -2.0 -1.0 0.5 -1.0 0.0
The solution to this system is [ 1.0 -2.0 -2.0 ].
HINT: Test with larger matrices! You won't begin to see true performance trends until you start testing on matrices larger than 500x500. I suggest testing with even larger matrices (2000x2000 or 5000x5000 or more).
Grading Criteria
Your submission will be graded on a combination of correctness, functionality, elegance, style, and documentation. Here is a tentative grading rubric:
Grade | Description | Requirements |
---|---|---|
A | Exceptional |
|
B | Good |
|
C | Satisfactory |
|
D | Deficient |
|
F | Unacceptable |
Submission
You should copy gauss.c
into a new file called
par_gauss.c
and modify it to implement the system described above.
Your modified program should use the OMP_NUM_THREADS
environment
variable for scaling. Your program must compile on compliant systems using the
following Makefile:
default: gauss par_gauss gauss: gauss.c gcc -g -O2 --std=c99 -Wall -o gauss gauss.c par_gauss: par_gauss.c gcc -g -O2 --std=c99 -fopenmp -Wall -o par_gauss par_gauss.c clean: rm -f gauss par_gauss
You should submit your modified program as par_gauss.c
on
Canvas by the due date. You should not submit your Makefile or anything else.
All submitted code should be elegant, clean, consistent, and well-documented. The code I have provided uses 4-space indentation and the 1TBS style. If you choose to reformat the file to match your own style preferences, make sure you apply the changes throughout.
Peer Reviews
One of the goals of this course is to encourage good software development practices, especially when building a parallel or distributed software system. For this project, you will be assigned two other random students in the course. You must review their code and offer constructive feedback according to the given rubric.
After the submission deadline, I will assign you two other submissions to review on Canvas. I expect you to review two distinct submissions that are not your own; I will do my best to ensure that the assignments work out but sometimes Canvas does not cooperate. It is your responsibility to inform me immediately if you have been assigned yourself, your partner, or two other same-team partners to review.
To complete the peer review, submit a Canvas comment on the submission with your assessment of the submission according to the provided code review guidelines. I expect at least two paragraphs per review with detailed observations and an assigned overall numeric score. You will be graded on the quality, thoroughness, and professional tone of your review. The peer reviews are due one week after the original project due date.