CS 4234 Project 2
Project report due 5:00 p.m., December 6, 2004
Project demos to be scheduled December 6-8, 2004
Updated 11/10/2004
Notes:
1. Undergraduates may work on this project in teams of two; graduate students
must work alone.
2. As an alternative to the project described below, a few of you may want
to do something that you propose. Discuss this with Dr. Ribbens as soon as possible.
The goal of this project is to explore the issues involved with implementing
distributed shared memory (DSM). The idea of DSM is to give the programmer
the illusion of shared memory, even though the memory is physically a distributed
memory system, e.g., on a cluster. You will be implementing a very simple prototype
of a DSM programming model on top of MPI.
- (70 pts) Implement the following five functions, callable from the programming
language of your choice, using MPI "under the hood."
- int dsm_init(int length). Initializes a globally
shared virtual memory holding length integers,
evenly distributed by blocks, across the p processes
of the current MPI job. Initializes all entries to zero.
Assumes each process will hold k integers, where
k = length/p. Returns an error code if length is
less than p, or if the function is called more than once.
You may assume that all processes call dsm_init with the
same value of length.
- int dsm_put(int i, int n). Stores the value n at index
i of the DSM. Returns an error code if i
is illegal. This function returns when it is safe for the
calling program to modify the variable n.
Note that on return there is no guarantee that the data has
actually been stored in the DSM at that point.
This is similar to the semantics of the MPI_Send call.
- int dsm_get(int i, int *n). Retrieves a value into n,
corresponding to the value stored in the DSM at index i.
Returns an error code if i is illegal.
This is a blocking call; it returns when the value is
retrieved from the DSM.
- int dsm_sync(). A global call that blocks all processes until
all preceding DSM operations have completed. Returns an error code
if no DSM has been initialized.
- int dsm_dump(char *filename). Prints a readable dump of the
current contents of the DSM to the indicated file. Returns an
error code if no DSM has been initialized. It should be possible to call
this function more than once. If you want to skip the argument and just
write to stdout, that's fine (see note below about changing the API).
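As a starting point for the block distribution described under dsm_init, the index arithmetic can be sketched as follows. This is only a sketch, not a required design: the helper names block_size, owner_rank, and local_offset are hypothetical, and the code assumes that any indices left over when length is not a multiple of p are treated as illegal, which the description above does not specify.

```c
/* Block distribution sketch: process r owns global indices [r*k, (r+1)*k),
   where k = length / p. All names here are hypothetical helpers, not part
   of the required API. */

/* Returns k = length / p, or -1 if length < p or p is not positive. */
int block_size(int length, int p) {
    if (p <= 0 || length < p) return -1;
    return length / p;
}

/* Returns the rank of the process that holds global index i,
   or -1 if i is illegal. Assumption: indices at or beyond k*p are
   illegal (relevant only when length is not a multiple of p). */
int owner_rank(int i, int length, int p) {
    int k = block_size(length, p);
    if (k < 0 || i < 0 || i >= k * p) return -1;
    return i / k;
}

/* Returns the position of global index i inside its owner's local
   array of k integers, or -1 if i is illegal. */
int local_offset(int i, int length, int p) {
    int k = block_size(length, p);
    if (k < 0 || i < 0 || i >= k * p) return -1;
    return i % k;
}
```

For example, with length = 12 and p = 4, each process holds k = 3 integers, and global index 7 lives on rank 2 at local offset 1.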
Important notes:
- You will need to turn in a simple (hardcopy) report describing the
API as you implemented it. The description should include the definition
of the functions from a user's point of view, and a brief description of
how each function is implemented; pseudo-code might be a good way to do the
latter. Also, include a hardcopy of your code
implementing these functions.
- In this model nothing is said about synchronization or race conditions
between processes. For example, if two processes update the same location of the DSM,
then unless a dsm_sync call is made between the two updates, there is no guarantee
about which update takes effect first.
- You may make slight modifications to the API described above if you
think it would make things easier to use or debug. The only
requirement is that you have get, put, and
sync functions with the semantics described. So if you
want to change the arguments for init or dump,
or introduce a new function such as dsm_finalize(), that's fine.
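The ordering issue in the note on synchronization can be illustrated with a small single-process mock. This is not an MPI implementation and not a suggested design: it simply assumes that puts are buffered until a sync, and uses a flag to model the fact that buffered updates may be applied in either order. All names (mock_put, mock_sync, mock_peek, MOCK_LEN, MOCK_MAXQ) are hypothetical.

```c
/* Single-process mock of buffered puts. Assumption: puts are queued and
   only applied at sync time; a real implementation would use MPI and may
   behave differently. */

#define MOCK_LEN  8    /* hypothetical DSM size for the mock */
#define MOCK_MAXQ 16   /* hypothetical cap on buffered puts */

static int mock_mem[MOCK_LEN];                   /* stands in for the DSM */
static struct { int i, n; } mock_q[MOCK_MAXQ];   /* buffered (index, value) */
static int mock_qlen = 0;

/* Buffers a put. The value is copied, so the caller may reuse n at once,
   but nothing has been stored in mock_mem yet. Returns -1 on an illegal
   index or a full buffer. */
int mock_put(int i, int n) {
    if (i < 0 || i >= MOCK_LEN || mock_qlen == MOCK_MAXQ) return -1;
    mock_q[mock_qlen].i = i;
    mock_q[mock_qlen].n = n;
    mock_qlen++;
    return 0;
}

/* Applies all buffered puts and empties the buffer. If reverse is nonzero
   the puts are applied newest-first, modelling updates from different
   processes arriving in an unpredictable order. */
void mock_sync(int reverse) {
    for (int s = 0; s < mock_qlen; s++) {
        int k = reverse ? mock_qlen - 1 - s : s;
        mock_mem[mock_q[k].i] = mock_q[k].n;
    }
    mock_qlen = 0;
}

/* Reads the mock memory directly (debugging aid, hypothetical). */
int mock_peek(int i) { return mock_mem[i]; }
```

In this mock, two puts to the same index followed by a single sync leave a value that depends on the order in which the buffer is applied. Placing a sync between the two puts makes the second value win regardless of that order.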
- (20 pts) Describe and implement at least one noticeable improvement to the simple API, or
to its implementation, as described in part 1.
By "noticeable" we mean something that improves performance on realistic
memory access patterns. Present experimental results demonstrating the improvement from your
proposed changes to the model and/or its implementation.
- (10 pts) Describe, but do not implement, an additional improvement to the model and/or
its implementation. Give sufficient details to convince the reader that you know what you are
talking about. For example, you may explain the motivation for your proposed change or extension,
describe the kinds of memory accesses that will benefit, describe how you would implement
it, and discuss any additional problems you anticipate.