...

  • - Size of a shard can never grow beyond the capacity of the largest machine
    • Might not always be able to store new entries even when there is excess capacity elsewhere in the system
    • Could do something hackish: have a saturated host forward requests for entries it doesn't hold to another host serving a different part of the same shard
  • - Nearly impossible to choose the right address-range chunk size up front
  • - Nightmare if we later decide we need a new address-range chunk size
    • Requires "rehashing" the whole thing
  • - Adding and removing hosts is painful
    • Some of these issues are addressable with linear hashing techniques (see the sketch below).
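
To make the linear hashing mention concrete, here is a minimal Python sketch (class and method names are ours, purely illustrative, and the split policy is an assumption): buckets split one at a time as load grows, so capacity is added incrementally and no single operation ever rehashes the whole table.

```python
# Minimal linear hashing sketch. Illustrative only: the split policy and
# load-factor threshold are assumptions, not something these notes specify.

class LinearHashTable:
    def __init__(self, initial_buckets=4):
        self.n = initial_buckets   # bucket count at level 0
        self.level = 0             # completed rounds of doubling
        self.split = 0             # next bucket to split this round
        self.count = 0             # total stored entries
        self.buckets = [[] for _ in range(initial_buckets)]

    def _bucket_index(self, key):
        b = hash(key) % (self.n * (1 << self.level))
        if b < self.split:
            # This bucket was already split this round; use the finer modulus.
            b = hash(key) % (self.n * (1 << (self.level + 1)))
        return b

    def _split_one(self):
        # Split exactly one bucket: only its entries get rehashed.
        s = self.split
        old_entries = self.buckets[s]
        self.buckets[s] = []
        self.buckets.append([])    # the new "image" bucket for s
        self.split += 1
        if self.split == self.n * (1 << self.level):
            self.level += 1        # round complete: the table has doubled
            self.split = 0
        for key, value in old_entries:
            self.buckets[self._bucket_index(key)].append((key, value))

    def insert(self, key, value):
        self.buckets[self._bucket_index(key)].append((key, value))
        self.count += 1
        if self.count / len(self.buckets) > 2.0:  # load-factor trigger (assumed)
            self._split_one()

    def get(self, key):
        for k, v in self.buckets[self._bucket_index(key)]:
            if k == key:
                return v
        raise KeyError(key)
```

The same idea applies at the host level: adding a host means splitting one shard's range, not rehashing every shard.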

DHT

Allows shards covering variable- or fixed-width slices of the address space, and/or shards of variable or fixed storage size

  • + Simple
  • + Natural replication
    • Shard the address space (needed for distribution anyway) and have multiple servers cover each shard
  • - Latency
    • Address-to-shard mapping takes O(log #shards) time in general if shards are not fixed-width in terms of # of addresses
    • Can be mitigated, trading index space for lookup time, using a radix tree or trie (see the sketch after this list)
    • How many levels is too deep?
    • Even in the face of cache misses? (what does a cache miss cost?)
      • At 3.0 GHz we get 3,000 cycles per µs
      • A cache miss seems to be about 60 cycles = 20 ns?
  • + Load sharing
  • - More difficult to co-locate related data on a single machine
    • Probably the case that we actually want to intentionally distribute related data (more network overhead, but lower latency because lookups can happen on independent machines in parallel)
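
To put numbers behind the latency bullet, here is a minimal sketch of both lookup paths (shard boundaries, host names, and RADIX_BITS are illustrative assumptions, not from these notes): a bisect over sorted shard starts gives the O(log #shards) case, and a radix table over the top address bits gives the index-space-for-time mitigation. The per-shard replica lists also illustrate "multiple servers cover each shard" from the replication bullet.

```python
import bisect

# Illustrative shard map for a 64-bit address space. All boundaries and
# host names are made up; boundaries are aligned to the top 8 bits so the
# radix table below is exact.
SHARD_STARTS = [0x0000_0000_0000_0000, 0x2000_0000_0000_0000,
                0x9000_0000_0000_0000, 0xC000_0000_0000_0000]
SHARD_REPLICAS = [["host-a", "host-b"], ["host-c", "host-d"],
                  ["host-e", "host-f"], ["host-g", "host-h"]]

def shard_for_address(addr):
    """O(log #shards): binary search over sorted shard start addresses."""
    return bisect.bisect_right(SHARD_STARTS, addr) - 1

# Mitigation: a table indexed by the top RADIX_BITS of the address.
# One array read replaces the binary search, at the cost of 2**RADIX_BITS
# index entries (the index-space tradeoff mentioned above).
RADIX_BITS = 8
RADIX_TABLE = [shard_for_address(i << (64 - RADIX_BITS))
               for i in range(1 << RADIX_BITS)]

def shard_for_address_radix(addr):
    """O(1), assuming shard boundaries align to RADIX_BITS-bit prefixes;
    unaligned boundaries would need a short fallback search per entry."""
    return RADIX_TABLE[addr >> (64 - RADIX_BITS)]

addr = 0x9F00_0000_0000_1234
assert shard_for_address(addr) == shard_for_address_radix(addr) == 2
print(SHARD_REPLICAS[shard_for_address(addr)])   # ['host-e', 'host-f']
```

With an 8-bit radix the table costs 256 entries; deeper radixes buy fewer tree levels at exponentially more index memory, which is exactly the "how many levels is too deep" question, since each extra level is roughly one more potential cache miss (~20 ns at 3.0 GHz).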

...