
This page builds on material from the SOSP 2011 paper. In particular, readers should be comfortable with Sections 3.5.1, "Finding Log Segment Replicas", and 3.5.2, "Detecting Incomplete Logs".

...

Here's the best solution we know of so far (2012-01-18):

Suppose master M is writing object 100 to segment X on backups B1, B2, and B3 when backup B1 crashes. Then:

  1. Master M detects that B1 has crashed and confirms with the coordinator that B1 is unavailable. (It still makes sure object 100 has reached B2 and B3.)
  2. M opens segment Y on a new set of backups, with Y's log digest including segment X.
  3. M closes segment X on B2 and B3.
  4. M re-replicates X to a new backup, B4.
  5. M asks the coordinator to update its minimum acceptable open segment ID to Y. The coordinator stores this value reliably in external storage.
  6. M acknowledges the client's write for object 100 and continues accepting writes.

During master recovery, the coordinator uses an open segment only if its segment ID is greater than or equal to the master's minimum acceptable open segment ID.
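
The recovery-time rule can be illustrated with a minimal Python sketch. It is not RAMCloud code: the `Replica` and `Coordinator` classes, their method names, and the concrete segment IDs (7 for X, 8 for Y) are all hypothetical, and the in-memory dictionary merely stands in for the coordinator's external storage. The sketch assumes that closed replicas are never filtered by this rule and that Y's segment ID is larger than X's.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One segment replica found on a backup during recovery (hypothetical type)."""
    segment_id: int
    backup: str
    is_open: bool


@dataclass
class Coordinator:
    # master ID -> minimum acceptable open segment ID.
    # Assumption: a plain dict stands in for reliable external storage.
    min_open_segment_id: dict = field(default_factory=dict)

    def set_min_open_segment_id(self, master_id, segment_id):
        """Step 5: record the master's new minimum acceptable open segment ID."""
        self.min_open_segment_id[master_id] = segment_id

    def usable_replicas(self, master_id, replicas):
        """Keep closed replicas; keep open ones only at or above the minimum."""
        minimum = self.min_open_segment_id.get(master_id, 0)
        return [r for r in replicas
                if not r.is_open or r.segment_id >= minimum]


# Scenario from the steps above, with illustrative IDs: X = 7, Y = 8.
X, Y = 7, 8
coordinator = Coordinator()
coordinator.set_min_open_segment_id("M", Y)

# Replicas the coordinator might find if M later crashes and B1 has restarted.
found = [
    Replica(X, "B1", is_open=True),    # stale open replica left on crashed B1
    Replica(X, "B2", is_open=False),   # closed in step 3
    Replica(X, "B3", is_open=False),   # closed in step 3
    Replica(X, "B4", is_open=False),   # re-replicated in step 4
    Replica(Y, "B5", is_open=True),    # current head segment (backup name assumed)
]
usable = coordinator.usable_replicas("M", found)
for replica in usable:
    print(replica)
```

In this sketch the stale open replica of X on B1 is the only one rejected, so recovery cannot mistake it for the head of M's log, while the closed replicas of X and the open replica of Y remain usable.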

...