Make RPCs more robust.
- Hosts should be able to recover from the other end of a session failing.
- RPCs need to time out eventually (this should be reasonably aggressive).
Build a user-space 10GigE driver for a strategically chosen NIC.
Update the log cleaner.
Enable the failure detector.
Handle coordinator failures.
Handle backup failures.
Handle multiple master failures and other secondary failures.
Cold boot.
- Backups need a superblock.
Make FastTransport fast.
Mechanism for splitting and moving tablets.
Threading?
Batteries for backup buffers?
- Use SSDs instead?
- Accept some data loss?
Network partitions?
Administration and diagnosis tools.
- Table enumeration?

Option 3: Stripped-Down Key-Value Store

(This target would meet a weaker definition of "usable": it would probably be usable only here at Stanford.)

Make RPCs more robust.
- Hosts should be able to recover from the other end of a session failing.
- RPCs need to time out eventually (this should be reasonably aggressive).
Update the log cleaner.
Handle coordinator failures.
Handle backup failures.
Handle multiple master failures and other secondary failures.
Cold boot.
- Backups need a superblock.
Threading.

Tasks deferred until later:

User-space 10GigE driver (just use InfiniBand)
Enable the failure detector (failure detection comes from clients)
Make FastTransport fast (just use InfiniBand)
Mechanism for splitting and moving tablets
Non-volatile log buffers (allow data loss during datacenter-wide power failures)
Network partitions
Administration and diagnosis tools (implement only things that we desperately need, as they are discovered)
- Table enumeration?
