These notes are based on Li and Hudak's 1989 paper on memory coherence in shared virtual memory systems. The goal is to implement shared virtual memory on loosely coupled multiprocessors.
Definition (Memory Coherence)
Given a memory model and a set of instructions that may execute at any point in time, we say that the memory model is coherent if the instructions execute in an ordering that is consistent across all viewers of the model (i.e., processors) at any given time. In other words, the value returned by a read operation is always the same as the value written by the most recent write operation to that location.
An architecture with a single memory access path has no coherence problem (e.g., a single-threaded program on a single-processor computer).
If our model is not coherent, we have to worry about cache coherence, invalidation, and other issues.
Definition (Strict Consistency Model)
This is one example of a memory model. In a strict consistency model, instructions for a given CPU always execute in the order they were issued, according to some global clock. A load returns the value of the last (by global wall-clock time) store to that location.
This is a very strong consistency model, and it is not practical for distributed systems.
Global Clock
A global clock would require all machines/processors to be perfectly synchronized to some time source, which is not possible in practice due to network latency, clock drift, and other factors.
One way to implement a strict consistency model is to have a total ordering system for all memory operations, but this can lead to significant performance bottlenecks.
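As a rough illustration of that total-ordering approach, the sketch below (all names are my own, not from the paper) funnels every load and store through a single global lock. The lock gives one unambiguous total order of operations, but it also serializes all memory traffic, which is exactly the bottleneck described above.

```python
import threading

# Hypothetical sketch: one global lock acts as the total-ordering point.
# Every load and store on every processor must pass through it, so the
# operations form a single total order -- at the cost of serializing
# all memory traffic.
class StrictlyOrderedMemory:
    def __init__(self):
        self._mem = {}
        self._lock = threading.Lock()  # global serialization point
        self.log = []                  # the total order actually observed

    def store(self, addr, value, cpu):
        with self._lock:
            self._mem[addr] = value
            self.log.append(("store", cpu, addr, value))

    def load(self, addr, cpu):
        with self._lock:
            value = self._mem.get(addr)
            self.log.append(("load", cpu, addr, value))
            return value

mem = StrictlyOrderedMemory()
mem.store("x", 1, cpu=0)
assert mem.load("x", cpu=1) == 1  # the load sees the most recent store
```

In a real distributed system the "lock" would be a network round trip to a sequencer, so every memory operation would pay wide-area latency; that is why strict consistency is impractical at scale.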
Definition (Shared Virtual Memory)
Shared virtual memory is a shared memory abstraction that allows processes running on different processors to access the same virtual memory address space. The idea is to allow processes to run on different processors in parallel.
- Data can migrate between processors on demand
- Natural and efficient form of process migration between processors in a distributed system.
- Memory mapping managers implement the mapping between local memory and shared virtual memory. The chief responsibility is to maintain coherence.
The main difficulty is solving the memory coherence problem: how to maintain a consistent view of memory across all processors.
```mermaid
flowchart TD
    subgraph Node1 [Compute Node 1]
        CPU1[CPU 1] --- Mem1[Memory 1]
    end
    subgraph Node2 [Compute Node 2]
        CPU2[CPU 2] --- Mem2[Memory 2]
    end
    subgraph NodeN [Compute Node N]
        CPUN[CPU N] --- MemN[Memory N]
    end
    MM1[Mapping manager]
    MM2[Mapping manager]
    MM3[Mapping manager]
    SVM[Shared virtual memory]
    Mem1 <--> MM1
    Mem2 <--> MM2
    MemN <--> MM3
    MM1 -.-> SVM
    MM2 -.-> SVM
    MM3 -.-> SVM
```
There are two design choices that influence the implementation of shared virtual memory:
- Granularity of the memory units (page size)
- Larger page size means less overhead for maintaining coherence, but higher chance for contention.
- Smaller pages mean more overhead for maintaining coherence, but a lower chance of contention.
- Coherence Strategy.
Definition (Sequential Consistency)
A memory model is sequentially consistent when any load returns the value of the last store according to some possible total ordering of the instructions. This is a relaxed version of the strict consistency model: the ordering need not match wall-clock time, it only has to be a single order that all processors agree on.
In general, the programmer does not necessarily need strict control over the ordering of instructions, but they do need to be able to reason about the ordering of instructions.
Definition (Locality of Reference)
Locality of reference is the principle that programs tend to access a relatively small portion of their address space at any given time. This is a key principle that allows for efficient caching and memory management. There are two types of locality:
- Temporal locality: If a program accesses a memory location, it is likely to access the same location again in the near future.
- Spatial locality: If a program accesses a memory location, it is likely to access nearby locations in the near future.
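To make spatial locality concrete, the sketch below (a toy model I wrote for illustration; the sizes and names are not from the paper) simulates a tiny direct-mapped cache whose lines hold 8 consecutive words. A sequential scan reuses each fetched line, while a stride-8 scan touches a new line on every access.

```python
# Hypothetical sketch: a tiny direct-mapped cache simulator showing why
# spatial locality matters. Each cache line holds 8 consecutive words,
# so a sequential scan reuses a fetched line 8 times, while a stride-8
# scan misses on every access. All sizes here are illustrative.
LINE_WORDS = 8

def hit_rate(addresses, num_lines=16):
    cache = [None] * num_lines           # direct-mapped: slot -> line tag
    hits = 0
    for addr in addresses:
        line_no = addr // LINE_WORDS     # which cache line the word lives in
        slot = line_no % num_lines
        if cache[slot] == line_no:
            hits += 1
        else:
            cache[slot] = line_no        # fetch the whole line on a miss
    return hits / len(addresses)

sequential = list(range(128))            # good spatial locality
strided = list(range(0, 128 * 8, 8))     # a new line on every access

assert hit_rate(sequential) > hit_rate(strided)
```

The same principle is what makes page-granularity shared virtual memory workable: because programs cluster their accesses, a page moved to a processor tends to be used many times before it is needed elsewhere.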
Memory Coherence Strategies
Strategies are classified by how they deal with page synchronization and page ownership.
Page Synchronization
There are two strategies: invalidation and write-broadcast.
In invalidation, there is only one owner processor for each page, and that processor has either write or read access to the page. If a processor i has a write fault (wants to write to a page it does not own) on a page p, its fault handler then
- invalidates all copies of p on other processors (if any),
- changes the access of p on i to write access,
- moves a copy of p to i if necessary,
- returns control to i.
i can then write to p and proceed. If i has a read fault (wants to read a page it does not own), the handler
- changes the access of p to read on the processor that has write access to p,
- moves a copy of p to i if necessary and sets the access of p to read on i,
- returns control to i.
i can then read from p and proceed.
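The two handlers above can be sketched as operations on a per-page table. This is a deliberately simplified single-process model (the names and the flat table are my own; message passing and copy transfer between processors are elided), tracking only the owner, the access mode, and the set of processors holding read copies.

```python
# Hypothetical sketch of the invalidation strategy. Per page we track the
# owner processor, its access mode, and the "copy set" of processors
# holding read copies. All state lives in one table for illustration;
# in a real system it is spread across processors.
READ, WRITE = "read", "write"

class PageTable:
    def __init__(self, num_pages):
        self.owner = {p: 0 for p in range(num_pages)}      # page -> owner CPU
        self.access = {p: WRITE for p in range(num_pages)}
        self.copy_set = {p: set() for p in range(num_pages)}

    def write_fault(self, i, p):
        self.copy_set[p].clear()   # invalidate all read copies of p
        self.owner[p] = i          # move p to i (copy transfer elided)
        self.access[p] = WRITE     # i now has write access
        # control returns to i, which can write to p

    def read_fault(self, i, p):
        self.access[p] = READ      # downgrade the owner's access to read
        self.copy_set[p].add(i)    # i gets a read copy of p
        # control returns to i, which can read p

pt = PageTable(num_pages=4)
pt.read_fault(i=2, p=0)
assert 2 in pt.copy_set[0] and pt.access[0] == READ
pt.write_fault(i=1, p=0)
assert pt.owner[0] == 1 and pt.copy_set[0] == set()
```

Note how the write fault must empty the copy set before granting write access: that is what guarantees a single writer, and hence coherence, under this strategy.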
In write-broadcast, a processor treats a read fault just as in the invalidation strategy. However, if a processor i has a write fault, the handler
- writes the update to all copies of the page,
- returns control to i.
The main problem with write-broadcast is that it requires special hardware support: every write to a shared page must generate a fault on the writing processor so that all copies of the page can be updated.
In 1989, there was no such hardware support.
Page Ownership
The ownership of a page can be fixed or dynamic. In a fixed ownership strategy, each page is assigned to a specific processor. The owner processor is responsible for maintaining the coherence of the page.
- other processors cannot write to the page without going through the owner processor, which can lead to bottlenecks.
- constrains desired modes of parallel execution
In a dynamic ownership strategy, the ownership of a page can change over time; the owner processor is responsible for maintaining the coherence of the page while it owns it. There are two further types of dynamic ownership strategies: centralized and distributed. Distributed managers can be further classified as fixed or dynamic.
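The fixed distributed manager idea can be sketched in a few lines. Here (all names and the modulo mapping are illustrative, not from the paper) the manager responsible for a page is determined by a static mapping known to every processor, so any processor can locate a page's manager without consulting a central server; that manager then tracks the page's current owner.

```python
# Hypothetical sketch of a fixed distributed manager: a static page->manager
# mapping (a simple modulo hash here) lets any processor find the manager
# for a page locally. Each manager keeps an owner table only for the
# pages it manages.
NUM_PROCESSORS = 4

def manager_of(page):
    # static assignment, known to every processor
    return page % NUM_PROCESSORS

owner_tables = {m: {} for m in range(NUM_PROCESSORS)}

def record_owner(page, owner):
    owner_tables[manager_of(page)][page] = owner

def find_owner(page):
    return owner_tables[manager_of(page)].get(page)

record_owner(page=9, owner=2)
assert manager_of(9) == 1   # page 9 is managed by processor 1
assert find_owner(9) == 2   # ...which records that processor 2 owns it
```

Spreading the manager role this way removes the single-server bottleneck of a centralized manager, at the cost that ownership queries for a page always go through that page's (fixed) manager even when ownership has moved elsewhere.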
```mermaid
graph TD
    A[Memory Coherence Strategies] --> B[Page Synchronization]
    A --> C[Page Ownership]
    B --> D[Invalidation]
    B --> E[Write-Broadcast]
    C --> F[Fixed Ownership]
    C --> G[Dynamic Ownership]
    G --> H[Centralized Manager]
    G --> I[Distributed Manager]
    I --> J[Fixed Manager]
    I --> K[Dynamic Manager]
```
| Page Sync Method | Fixed | Dynamic: Centralized Manager | Dynamic: Distributed Manager (Fixed) | Dynamic: Distributed Manager (Dynamic) |
|---|---|---|---|---|
| Invalidation | Not allowed | Okay | Good | Good |
| Write-broadcast | Very expensive | Very expensive | Very expensive | Very expensive |
Definition (Memory Consistency)
Memory consistency focuses on the apparent ordering of reads and writes across multiple, different memory addresses as observed by different processors. This is different from memory coherence, which concerns reads and writes to a single location; consistency takes the view of multiple processors across many locations, so its scope is much larger.