Data Sharing Patterns
User Interface
Pointers:
Sources: if interested contact
Isabelle Demeure
(demeure@inf.enst.fr)
Article written for Euromicro'95 (not available on the net).
tech_report.ps: Technical report (PostScript file).
Phosphorus is a DSM system developed on top of the Parallel
Virtual Machine system (PVM).
PVM supports the message-passing paradigm. Phosphorus supports the shared
data paradigm. As shown in the figure below, the two systems can be used
together for the support of parallel, distributed computations involving both
paradigms.
The Phosphorus project was started with three goals in mind:
We made the following choices:
Like PVM, Phosphorus consists of a daemon (called phosd) and a
library of functions. A daemon resides on each
machine making use of shared data. This daemon acts as the shared memory
manager, which is in charge of keeping the shared data coherent.
The unit of sharing is the variable. We support the types supported by PVM
through the packing/unpacking functions. Shared arrays of these types may
also be declared.
The management of the shared variables is distributed among
a collection of servers running on the various hosts of the supporting
network;
each shared variable is ``owned'' by a server and the ownership changes
dynamically (which corresponds to the dynamic distributed scheme described
by Li and Hudak).
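The dynamic ownership scheme can be illustrated with a small single-process model. This is only a sketch in the spirit of Li and Hudak's "probable owner" hints, not the Phosphorus implementation: the names `OwnerTable`, `find_owner` and `transfer_ownership`, and the fixed host count, are assumptions made for the example.

```c
#include <assert.h>

#define NUM_HOSTS 4  /* illustrative size, not a Phosphorus constant */

/* prob_owner[h] is host h's current hint about who owns the variable.
 * The true owner is the one host whose hint points to itself. */
typedef struct {
    int prob_owner[NUM_HOSTS];
} OwnerTable;

/* Initially every host believes `owner` holds the variable. */
void owner_table_init(OwnerTable *t, int owner) {
    assert(owner >= 0 && owner < NUM_HOSTS);
    for (int h = 0; h < NUM_HOSTS; h++)
        t->prob_owner[h] = owner;
}

/* Follow the chain of hints from `start` until the true owner is reached.
 * Returns the owner and stores the number of forwarding hops in *hops. */
int find_owner(const OwnerTable *t, int start, int *hops) {
    int h = start;
    *hops = 0;
    while (t->prob_owner[h] != h) {
        h = t->prob_owner[h];
        (*hops)++;
    }
    return h;
}

/* Transfer ownership to `requester`. The request is forwarded along the
 * hint chain; every host on the path redirects its hint to the requester.
 * Returns the number of forwarding hops the request needed. */
int transfer_ownership(OwnerTable *t, int requester) {
    int hops = 0;
    int h = requester;
    while (t->prob_owner[h] != h) {
        int next = t->prob_owner[h];
        t->prob_owner[h] = requester;
        h = next;
        hops++;
    }
    t->prob_owner[h] = requester;  /* the old owner now points to the new one */
    return hops;
}
```

Hosts that were not on the forwarding path keep stale hints, so a later request from them may take more than one hop before it reaches the current owner.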
The challenge when building a DSM system is to provide the programmer with a
shared virtual address space without degrading the performance of the system
too much. Phosphorus was carefully designed to reduce the network
traffic needed to keep data coherent. Four data sharing patterns (or protocols) are provided to suit four different variable access behaviors,
one of them corresponding to a relaxed memory consistency protocol (following
what was done in the Munin system).
The interface consists of a simple set of primitives for declaring, reading,
writing and synchronizing accesses to shared variables.
The design of our system was inspired by the Munin system. In particular, we chose to implement data sharing protocols similar to those of Munin, namely:
Read Only (Multiple Readers/Write Once),
Migratory (Single Reader/Single Writer),
Conventional (Multiple Readers/Single Writer) and
Write Shared (Multiple Readers/Multiple Writers).
The following table shows four combinations and the corresponding Phosphorus protocol. Two of the algorithms migrate data to the requesting host, and the other two replicate data so that multiple readers can access data locally (note that the "central" protocol is not implemented in Phosphorus).
Once a read-only variable has been initialized, no further updates occur.
Thus, the protocol simply consists of replication on demand with no write access rights. The READ_ONLY protocol may also be classified as a non-migrating, replicated algorithm.
A runtime error is generated when a task attempts to write to a READ_ONLY variable that has already been initialized.
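The write-once rule can be pictured with a minimal local model. This is an illustrative sketch only: `ReadOnlyVar`, `ro_init`, `ro_write` and `ro_read` are hypothetical names, and treating a read before initialization as an error is an assumption, not documented Phosphorus behavior.

```c
#include <assert.h>
#include <string.h>

#define RO_MAX_BYTES 64  /* illustrative capacity */

typedef struct {
    unsigned char data[RO_MAX_BYTES];
    size_t size;
    int initialized;  /* set to 1 by the first (and only) successful write */
} ReadOnlyVar;

void ro_init(ReadOnlyVar *v, size_t size) {
    assert(size <= RO_MAX_BYTES);
    v->size = size;
    v->initialized = 0;
}

/* Returns 0 on success and a nonzero value on error, mirroring the
 * return convention of the Phosphorus interface. */
int ro_write(ReadOnlyVar *v, const void *buf) {
    if (v->initialized)
        return -1;            /* second write: rejected */
    memcpy(v->data, buf, v->size);
    v->initialized = 1;
    return 0;
}

/* Every reader gets its own copy of the initialized value. */
int ro_read(const ReadOnlyVar *v, void *buf) {
    if (!v->initialized)
        return -1;            /* assumption: nothing to replicate yet */
    memcpy(buf, v->data, v->size);
    return 0;
}
```

Because the value never changes after initialization, replicas handed out by `ro_read` never need to be invalidated or updated.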
The consistency protocol for migratory data forwards the data to the next task that requests access to it, gives this task read and write access (i.e. ownership), and invalidates the original copy.
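As a rough illustration, the migratory behavior can be modelled in a single process. The structure and function names below (`MigratoryVar`, `mig_read`, `mig_write`) are hypothetical, and message passing is replaced by plain array copies.

```c
#include <assert.h>

#define NUM_HOSTS 4  /* illustrative size */

typedef struct {
    int holder;             /* the single host holding the valid copy */
    int copy[NUM_HOSTS];    /* per-host local storage */
    int valid[NUM_HOSTS];   /* 1 only for the current holder */
    int migrations;         /* number of times the data has moved */
} MigratoryVar;

void mig_init(MigratoryVar *v, int holder, int value) {
    assert(holder >= 0 && holder < NUM_HOSTS);
    for (int h = 0; h < NUM_HOSTS; h++) {
        v->copy[h] = 0;
        v->valid[h] = 0;
    }
    v->holder = holder;
    v->copy[holder] = value;
    v->valid[holder] = 1;
    v->migrations = 0;
}

/* Any access from a host without the copy moves the data (and ownership)
 * to that host and invalidates the previous holder's copy. */
static void mig_acquire(MigratoryVar *v, int host) {
    if (v->holder == host)
        return;
    v->copy[host] = v->copy[v->holder];
    v->valid[v->holder] = 0;
    v->valid[host] = 1;
    v->holder = host;
    v->migrations++;
}

int mig_read(MigratoryVar *v, int host) {
    mig_acquire(v, host);
    return v->copy[host];
}

void mig_write(MigratoryVar *v, int host, int value) {
    mig_acquire(v, host);
    v->copy[host] = value;
}
```

A read followed by a write on the same host costs a single migration, which is the access pattern this protocol is meant to serve.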
At any time there can be several read-only copies of a conventional variable, but only one read-write copy of it. This protocol is also called ``multiple readers/single writer'' (MRSW).
A read operation on a host with no read access rights causes a read fault. In this case, the faulting host has to communicate with the owner host to acquire a read-only copy. The owner then changes its own access rights to read-only, if necessary, and sends a read-only copy to the requesting host.
A write operation on a host with no write access rights causes a write fault. In this case, ownership is transferred to the host where the write fault occurred, and an invalidate message is sent to every host keeping a read-only copy before the write operation can complete.
Note that this protocol enforces sequential consistency.
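The read-fault and write-fault handling described above can be sketched as a small state machine. This is a single-process model under stated assumptions: the names (`MrswVar`, `mrsw_read`, `mrsw_write`) are hypothetical, messages are replaced by array updates, and the variable is a single int.

```c
#include <assert.h>

#define NUM_HOSTS 4  /* illustrative size */

typedef enum { ACC_NONE, ACC_READ, ACC_WRITE } Access;

typedef struct {
    int owner;                 /* host holding the authoritative copy */
    Access access[NUM_HOSTS];  /* access rights of each host */
    int copy[NUM_HOSTS];       /* local copy on each host */
    int invalidations;         /* invalidate messages sent so far */
} MrswVar;

void mrsw_init(MrswVar *v, int owner, int value) {
    assert(owner >= 0 && owner < NUM_HOSTS);
    for (int h = 0; h < NUM_HOSTS; h++) {
        v->access[h] = ACC_NONE;
        v->copy[h] = 0;
    }
    v->owner = owner;
    v->access[owner] = ACC_WRITE;
    v->copy[owner] = value;
    v->invalidations = 0;
}

int mrsw_read(MrswVar *v, int host) {
    if (v->access[host] == ACC_NONE) {          /* read fault */
        if (v->access[v->owner] == ACC_WRITE)
            v->access[v->owner] = ACC_READ;     /* owner downgrades itself */
        v->copy[host] = v->copy[v->owner];      /* owner sends a copy */
        v->access[host] = ACC_READ;
    }
    return v->copy[host];
}

void mrsw_write(MrswVar *v, int host, int value) {
    if (v->access[host] != ACC_WRITE) {         /* write fault */
        int old_owner = v->owner;
        for (int h = 0; h < NUM_HOSTS; h++) {
            if (h != host && h != old_owner && v->access[h] == ACC_READ) {
                v->access[h] = ACC_NONE;        /* invalidate a reader */
                v->invalidations++;
            }
        }
        if (old_owner != host)
            v->access[old_owner] = ACC_NONE;    /* copy given up on transfer */
        v->owner = host;
        v->access[host] = ACC_WRITE;
    }
    v->copy[host] = value;
}
```

In this model a write completes only after every other copy has been discarded, so a later read on any host always returns the most recent write.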
Multiple copies with read-write access rights may reside on different hosts. This protocol, also called ``multiple readers/multiple writers'' (MRMW), is implemented with an eager-update release consistency protocol (see below).
Each task may locally modify a portion of the variable. The programmer must make sure that each task writes to an independent portion of the variable. Updates occur whenever a task acquires a lock or arrives at a barrier.
With the WRITE_SHARED protocol, the programmer is in charge of triggering the updates of the shared variables, either by calling phos_lock() on a synchronization variable or by forcing a rendezvous at a barrier. Also note that locks should not be taken on WRITE_SHARED variables themselves.
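The following sketch models why disjoint portions matter. It is not the Phosphorus code: `WriteSharedVar`, `ws_write`, `ws_read` and `ws_sync` are hypothetical names, offsets and lengths are assumed to be counted in elements, and `ws_sync` stands in for a synchronization point at which all pending updates are propagated.

```c
#include <assert.h>
#include <string.h>

#define NUM_TASKS 3  /* illustrative sizes */
#define VAR_LEN   6

typedef struct {
    int copy[NUM_TASKS][VAR_LEN];            /* each task's read-write copy */
    unsigned char dirty[NUM_TASKS][VAR_LEN]; /* written since the last sync */
} WriteSharedVar;

void ws_init(WriteSharedVar *v) {
    memset(v, 0, sizeof *v);
}

/* Local write: touches only the caller's copy. Returns 0 on success. */
int ws_write(WriteSharedVar *v, int task, int offset, int length,
             const int *buf) {
    if (offset < 0 || length < 0 || offset + length > VAR_LEN)
        return -1;
    for (int i = 0; i < length; i++) {
        v->copy[task][offset + i] = buf[i];
        v->dirty[task][offset + i] = 1;
    }
    return 0;
}

/* Local read: may return stale data for portions other tasks have
 * written but not yet propagated. */
int ws_read(const WriteSharedVar *v, int task, int offset, int length,
            int *buf) {
    if (offset < 0 || length < 0 || offset + length > VAR_LEN)
        return -1;
    memcpy(buf, &v->copy[task][offset], (size_t)length * sizeof(int));
    return 0;
}

/* Synchronization point: push every task's modified elements to all
 * other copies. If two tasks wrote the same element, one of the two
 * values silently wins, which is why portions must not overlap. */
void ws_sync(WriteSharedVar *v) {
    for (int t = 0; t < NUM_TASKS; t++) {
        for (int i = 0; i < VAR_LEN; i++) {
            if (!v->dirty[t][i])
                continue;
            for (int o = 0; o < NUM_TASKS; o++)
                v->copy[o][i] = v->copy[t][i];
            v->dirty[t][i] = 0;
        }
    }
}
```

Between two synchronization points no traffic is generated at all, which is where the relaxed protocol saves messages compared with the conventional one.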
The Release Consistency Model
One of the problems raised by DSM systems is that of maintaining consistency between the various copies of a variable shared by several processes.
This interface was inspired by the Munin system interface, which we found simple
and complete.
Every function returns 0 if successful.
Upon failure, a value different from 0 is returned with an error message.
int phos_init(void)
This routine allows processes to enter the sharing service. It must be called after enrolling in PVM. The proper way to start a Phosphorus application is the following:
mytid = pvm_mytid(); /* enroll in pvm */
info = phos_init(); /* enroll in Phosphorus */
This call creates a new server phosd on the local host, if it was not already created by another task.
int phos_end(void)
phos_end(); /* exit DSM service */
pvm_exit(); /* exit PVM before stopping */
exit(0);
int phos_declare(int desc, int type, int count, int protocol)
BYTE=0 - CPLX=1 - DCPLX=2 - DOUBLE=3 - FLOAT=4 - INT=5
count is the number of items of type type. The protocol argument determines the protocol associated with this variable (see the values below). A variable may be a shared variable (READ_ONLY, CONVENTIONAL, MIGRATORY or WRITE_SHARED) or a synchronization variable (SYNCH).
READ_ONLY=1 - MIGRATORY=2 - CONVENTIONAL=3 - WRITE_SHARED=4
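When logging or debugging calls to phos_declare(), it can help to translate the numeric codes back into names. The helper below is a hypothetical convenience, not part of the Phosphorus library: the EX_ identifiers and both functions are made up for the example, and only the numeric values come from the lists above. SYNCH is left out because its value is not given here.

```c
#include <assert.h>
#include <string.h>

/* Numeric codes as listed in the text; the EX_ names are hypothetical. */
enum { EX_BYTE = 0, EX_CPLX = 1, EX_DCPLX = 2,
       EX_DOUBLE = 3, EX_FLOAT = 4, EX_INT = 5 };
enum { EX_READ_ONLY = 1, EX_MIGRATORY = 2,
       EX_CONVENTIONAL = 3, EX_WRITE_SHARED = 4 };

/* Returns the name of a type code, or NULL if the code is not listed. */
const char *ex_type_name(int code) {
    static const char *const names[] = {
        "BYTE", "CPLX", "DCPLX", "DOUBLE", "FLOAT", "INT"
    };
    if (code < EX_BYTE || code > EX_INT)
        return NULL;
    return names[code];
}

/* Returns the name of a protocol code, or NULL if the code is not listed. */
const char *ex_protocol_name(int code) {
    switch (code) {
    case EX_READ_ONLY:    return "READ_ONLY";
    case EX_MIGRATORY:    return "MIGRATORY";
    case EX_CONVENTIONAL: return "CONVENTIONAL";
    case EX_WRITE_SHARED: return "WRITE_SHARED";
    default:              return NULL;
    }
}
```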
int phos_free(int desc)
int phos_read(int desc, void *buffer)
int phos_write(int desc, void *buffer)
int phos_ws_read(int desc, int offset, int length, void *buffer)
int phos_ws_write(int desc, int offset, int length, void *buffer)
int phos_lock(int desc)
A phos_lock() performed on a synchronization variable is used for inter-task synchronization. In addition, if the user needs to access a set of shared variables in mutual exclusion, every access can be bracketed by a phos_lock()/phos_unlock() pair on a synchronization variable. A critical section looks as follows:
phos_lock(SYNCH_VAR);
phos_read(VAR_1, &buffer_1);
phos_read(VAR_2, &buffer_2);
compute(buffer_1, buffer_2);
phos_write(VAR_1, &buffer_1);
phos_write(VAR_2, &buffer_2);
phos_unlock(SYNCH_VAR);
If the user needs only to protect one variable at a time, it is possible to lock a single shared variable. This guarantees that read and write operations are performed in sequence. For example:
phos_lock(SHARED_VAR);
phos_read(SHARED_VAR, &buffer);
compute(buffer);
phos_write(SHARED_VAR, &buffer);
phos_unlock(SHARED_VAR);
The user must make sure that a lock acquired with phos_lock() is always released with phos_unlock() by the same task that acquired it.
As noted for the WRITE_SHARED protocol, locks should not be taken on WRITE_SHARED variables themselves: their updates are triggered by calling phos_lock() on a synchronization variable or by forcing a rendezvous at a barrier.
int phos_unlock(int desc)