Next: Chapter 11: Distributed Scheduling
Up: No Title
Previous: Chapter 3: Process Deadlocks
Goals
1. Network transparency
2. High availability
Design Issues
1. Naming and name resolution
2. Cache on disk or in main memory
3. Writing policy
4. Cache consistency
5. Availability
6. Scalability
7. Semantics of file operations
Case Study: Coda File System
- designed for large-scale distributed computing environments using UNIX workstations
- resiliency to server and network failures
- server replication
- disconnected operation of clients when no server can be contacted
Coda Features
1. Few trusted servers, many untrusted clients
2. Clients cache entire files on local disks
3. Cache coherence by callback: servers notify workstations of changes to cached files
4. Clients dynamically map files to servers and cache this information
5. Token-based authentication and end-to-end encryption
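The whole-file caching and callback scheme in features 2 and 3 can be sketched as a toy client/server pair. This is a minimal in-memory sketch: the class and method names are illustrative, not Coda's actual interfaces, and real callbacks are RPCs rather than direct method calls.

```python
class Server:
    """Holds files and a callback promise per client caching each file."""

    def __init__(self):
        self.files = {}        # fid -> contents
        self.callbacks = {}    # fid -> set of clients holding a promise

    def fetch(self, fid, client):
        # record a callback promise for this client, then ship the whole file
        self.callbacks.setdefault(fid, set()).add(client)
        return self.files[fid]

    def store(self, fid, data, writer):
        self.files[fid] = data
        # break every outstanding callback promise except the writer's
        for c in self.callbacks.pop(fid, set()):
            if c is not writer:
                c.invalidate(fid)
        self.callbacks[fid] = {writer}


class Client:
    """Caches entire files locally; propagates changes at close time."""

    def __init__(self, server):
        self.server, self.cache = server, {}

    def open(self, fid):
        # cache hit needs no server contact (the callback guarantees validity)
        if fid not in self.cache:
            self.cache[fid] = self.server.fetch(fid, self)
        return self.cache[fid]

    def close(self, fid, data):
        # changes propagate at open/close granularity, as in AFS/Coda
        self.cache[fid] = data
        self.server.store(fid, data, self)

    def invalidate(self, fid):
        # callback break from the server: discard the stale cached copy
        self.cache.pop(fid, None)
```

A write by one client breaks the callbacks of all other clients caching that file, so their next open re-fetches the current version.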
Replication Strategies
1. Pessimistic: restrict updates to at most one partition
2. Optimistic: updates allowed in every partition; detect and resolve conflicts after they occur
Optimistic strategy chosen because:
1. higher availability
2. enables support for portable workstations
3. write sharing between users is relatively infrequent
One-copy UNIX Semantics
Every modification to every byte of a file has to be immediately visible
to every client
- conflicts with stated goals of scalability and availability
- relax the constraint
Lessons from Andrew File System
1. Propagating changes at the granularity of file opens and closes is adequate for virtually all applications
2. Slightly weaker consistency guarantees are acceptable: if a callback from a server to a client (or a modification of the file by another client) is lost, the client may continue to use a cached copy of the file for some time after that file has been changed elsewhere
3. A client maintains information about the subset of servers that are currently accessible. If this set is empty, the client enters a disconnected mode of operation.
Server Replication
Volume: unit of replication
- set of files and directories forming a subtree
- unique file identifier (FID) for each file and directory
- Volume Storage Group (VSG): the set of servers with replicas of a volume (only a subset, the Accessible VSG or AVSG, may be reachable)
- Volume Replication Database: stores the degree of replication and the identity of the replication sites of each volume (present at every server)
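These structures can be modeled in a few lines. This is a sketch under assumptions: the FID field names and the dictionary-based replication database are illustrative, not Coda's on-disk formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FID:
    """Unique file identifier; fields are illustrative."""
    volume_id: int    # the volume (unit of replication) holding the object
    vnode: int        # file or directory within that volume
    uniquifier: int   # distinguishes reuses of the same vnode number


# Volume Replication Database: volume id -> VSG (set of replication sites).
# A copy is present at every server.
vrdb = {7: {"server1", "server2", "server3"}}


def avsg(volume_id, reachable):
    """Accessible Volume Storage Group: the subset of the VSG the client
    can currently contact. An empty result means disconnected operation."""
    return vrdb[volume_id] & reachable
```

Because the FID embeds the volume id, a client can resolve any file to its replication sites with one database lookup, which is what lets clients map files to servers dynamically and cache the mapping.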
Strategy
Read-One, Write-All (ROWA)
- before reading, verify that the server being read (the preferred copy) has the latest copy
- if not: (i) change the preferred site, (ii) inform the AVSG about the existence of stale copies
- establish a callback with the preferred server
- on file close after modification, transfer the file in parallel to all servers in the AVSG
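The read path above might look like the following. This is a hedged sketch: the server methods get_version, fetch, and notify_stale are assumed names standing in for Coda's real RPC interface, and versions are simplified to single integers.

```python
def rowa_read(fid, preferred, avsg):
    """Read fid under Read-One, Write-All, verifying that the preferred
    copy is the latest among the currently accessible servers (AVSG)."""
    versions = {s: s.get_version(fid) for s in avsg}
    latest = max(versions.values())
    if versions[preferred] < latest:
        # (i) change the preferred site to one holding the latest copy
        preferred = max(avsg, key=lambda s: versions[s])
        # (ii) inform the AVSG about the stale copies that exist
        for s in avsg:
            if versions[s] < latest:
                s.notify_stale(fid)
    # read from the preferred server; a callback would be established here
    return preferred.fetch(fid), preferred
```

Only one server is read in the common case; the version check is what keeps "read one" safe when some replicas have missed updates.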
Cache Coherence
- try to contact missing members of the VSG at regular intervals
- on AVSG enlargement: the next reference to any object goes to the enlarged AVSG
- shrinking of the AVSG:
  - detected by probing members at regular intervals
  - if the preferred server is gone, drop the callback
Cache Coherence (contd.)
- to detect updates missed by the preferred server, the cache manager requests the volume version vector (CVV) for every volume from which it has cached data
- a mismatch in the volume CVV indicates that some AVSG members have missed updates: drop the callback
Replica Management
- each modification at a server is tagged with a storeid
- a server could maintain the update history as a chronological sequence of storeids
- the length of the update history at every replica is maintained by the server in the CVV
- a submissive replica has a prefix of the updates at a dominant replica
- inconsistency: neither replica dominates the other, nor are they equal
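The equal/dominant/submissive/inconsistent classification amounts to a pairwise comparison of version vectors. A sketch, assuming each CVV is a list with one update count per replication site (the function name is illustrative):

```python
def compare_cvv(a, b):
    """Classify replica a relative to replica b by their version vectors."""
    a_ge = all(x >= y for x, y in zip(a, b))   # a saw everything b saw
    b_ge = all(y >= x for x, y in zip(a, b))   # b saw everything a saw
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "dominant"      # b's update history is a prefix of a's
    if b_ge:
        return "submissive"    # a's update history is a prefix of b's
    return "inconsistent"      # concurrent updates on both sides: repair
```

Only the inconsistent case requires manual or application-level repair; a submissive replica can be brought up to date mechanically by a force operation.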
State Transformation
- Update: extends the update history; a two-phase operation
- Force: copies updates from a dominant replica to a submissive replica
- Repair: used to return inconsistent replicas to normal use
- Migrate: saves copies of objects involved in unsuccessful updates resulting from disconnected operation, for future repair
Ravi Prakash
2000-04-22