Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 4 Next »

These are Diego's meeting notes for 2011-03-30.

Goals of a Least Usable System

The ideal least usable system would be the smallest system that would:

  • Provide significant value for some set of users
  • Get us feedback from users
  • Be as close as possible to the path towards the final system

Option 1: Cache

  • memcached protocol compatibility
    • Implement their RPCs
      • Need operations like increment, prepend
      • They use 250-byte keys
    • Implement their TCP protocol, which should be very similar to that of TcpTransport.
      • Unfortunately, it appears to takes deep knowledge of the RPCs to figure out the number of bytes in the request.
    • Implement their simple UDP protocol. This should use the existing Driver interface and would be like a very stripped down version of FastTransport.
    • See http://code.sixapart.com/svn/memcached/trunk/server/doc/protocol.txt
  • Disable the coordinator.
    • Memcached libraries handle all necessary cluster management, so the coordinator is not necessary.
    • Masters should not enlist with the coordinator. They should instead start with exactly one table.
  • Build a userspace 10GigE Driver for a strategically chosen NIC.
  • Make RPCs more robust.
    • Hosts should be able to recover from the other end of a session failing.
  • Update the log cleaner (garbage collector).
    • We'll need an eviction policy for when masters fill up (memcached uses LRU).
      • One cheap and reasonable strategy might be to free the oldest segment in its entirety.

Option 2: Key-Value Store

  • Make RPCs more robust.
    • Hosts should be able to recover from the other end of a session failing.
    • RPCs need to time out eventually (this should be reasonably aggressive).
  • Build a userspace 10GigE Driver for a strategically chosen NIC.
  • Update the log cleaner.
  • Enable the failure detector.
  • Handle coordinator failures.
  • Handle backup failures.
  • Handle multiple master failures and other secondary failures.
  • Cold boot.
    • Backups need a superblock.
  • Make FastTransport fast.
  • Mechanism for splitting and moving tablets.
  • Threading?
  • Batteries for backup buffers?
    • Use SSDs instead?
    • Accept some data loss?
  • Network partitions?
  • Administration and diagnosis tools
    • Table enumeration?
  • No labels