Recap
Assumptions
- Must be just as durable as disk-based storage
- Data is durable by the time a write request returns
- Assume the system is almost always in a state of crash recovery
- Recovery must be instantaneous and transparent
- Records are a few thousand bytes; partial updates are possible
Failures
- Bit errors
- Single machine
- Group of machines
- Power failure
- Loss of datacenter
- Network partitions
- Byzantine failures
- Thermal emergency
Possible Solution
- Approach #4: temporarily commit writes to another server's RAM for speed, and eventually to disk
- some form of logging in RAM plus batched writes to disk, with checkpointing
- need to "shard" each server's data so that many servers serve as backups for a single master, to speed recovery time
- likely, backup shards will need to be able to temporarily become masters for their data while the failed master is rebuilt
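The write path in approach #4 can be sketched as follows. This is a minimal, hypothetical illustration, not a real design: the `Backup` class, `batch_size`, and the in-memory `disk_log` list (standing in for a sequential on-disk log) are all assumptions introduced here.

```python
class Backup:
    """Backup shard holding recently written records in RAM (sketch)."""
    def __init__(self):
        self.buffer = []

    def store(self, record):
        self.buffer.append(record)

    def truncate(self, n):
        # Once the master has flushed n records to disk, the backup
        # no longer needs its RAM copies of them.
        del self.buffer[:n]


class Master:
    """Sketch of approach #4: a write is acknowledged once it is in this
    master's RAM and in every backup's RAM; disk writes happen later,
    in batches."""
    def __init__(self, backups, batch_size=4):
        self.backups = backups
        self.ram_log = []    # records not yet written to disk
        self.disk_log = []   # stands in for the sequential on-disk log
        self.batch_size = batch_size

    def write(self, key, value):
        record = (key, value)
        self.ram_log.append(record)
        for b in self.backups:
            b.store(record)  # now durable in more than one machine's RAM
        # The write request can return to the client at this point.
        if len(self.ram_log) >= self.batch_size:
            self.flush()

    def flush(self):
        # Batched, sequential append to disk; then backups reclaim RAM.
        batch, self.ram_log = self.ram_log, []
        self.disk_log.extend(batch)
        for b in self.backups:
            b.truncate(len(batch))
```

The key property is that durability at ack time comes from replication in RAM, while the disk sees only large sequential batches.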
Issues
- Disk write bandwidth
- Best approach to achieve good disk write bandwidth: use at least 3 disks — one for write logging, one for archiving the most recent log data, and one for compaction. This scheme eliminates seeks entirely, which should give us about 100 MB/s. Unfortunately, we'll need RAID as well, so the total is more than 3 disks just to achieve the write bandwidth of a single disk.
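A back-of-envelope calculation shows why eliminating seeks matters so much. The numbers below (100 MB/s sequential bandwidth, ~10 ms average seek, 4 KB records) are assumptions for illustration; only the 100 MB/s figure comes from the note above.

```python
# Seek-bound random writes: each write pays one seek.
seek_time = 0.010                    # ~10 ms average seek (assumed)
random_writes_per_sec = 1 / seek_time            # on the order of 100/s

# Seek-free sequential logging: bandwidth-bound instead.
seq_bandwidth = 100e6                # ~100 MB/s with no seeks
record_size = 4096                   # "records of a few thousand bytes"
seq_writes_per_sec = seq_bandwidth / record_size  # on the order of 10,000s/s
```

Under these assumptions, sequential logging is a few hundred times faster per record than seek-bound writes, which is the motivation for dedicating whole disks to logging, archiving, and compaction.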
Alternative
Recovery
- All soft state, which fixes a lot of consistency issues on recovery.
- After a new master is elected, compare the backups to check consistency. Version numbers make this possible: only the most recently written value can be inconsistent, so we only need to compare k values for k backup shards. This is much cheaper than running an agreement protocol among the backups.
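The version-number comparison above can be sketched as a simple scan over the k backup copies. This is a hypothetical illustration: the representation of a backup shard as a `{key: (version, value)}` dict is an assumption made here, not part of the design notes.

```python
def recover_value(key, backup_copies):
    """Pick the authoritative copy of `key` after a master failure.

    backup_copies: one {key: (version, value)} dict per backup shard.
    Only the most recently written value can differ across backups,
    so comparing the k copies by version number suffices; no agreement
    protocol is needed."""
    best = None
    for copy in backup_copies:
        rec = copy.get(key)  # (version, value), or None if missing
        if rec is not None and (best is None or rec[0] > best[0]):
            best = rec
    return best
```

The cost is k lookups and k-1 comparisons per recovered key, versus multiple message rounds for a consensus-style agreement among the backups.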