...

  • Machines will stay in service long enough that newer additions will be faster, have more memory, or otherwise differ in important characteristics
  • Need to be sure heterogeneity is taken into account when distributing data
    • e.g.: using DHTs and consistent hashing, we can vary the key space a node is responsible for in proportion to its capabilities (see the sketch after this list)
  • Is purposeful heterogeneity useful?
    • Lots of big, dumb, efficient nodes with lots of RAM coupled with some smaller, very fast, expensive nodes for offloading serious hotspots?
      • Or will high throughput/low latencies save us?
      • An alternative is to shrink an overloaded node's responsibility so it concentrates on the hottest data
      • Perhaps there's a performance, energy, or cost win to being specifically heterogeneous?
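
A minimal sketch of the capacity-weighted consistent hashing mentioned above, assuming a node's weight is a rough capacity score such as GB of RAM (WeightedRing and its methods are illustrative names, not RAMCloud code): each node gets a number of virtual points on the hash ring proportional to its weight, so heavier nodes own proportionally more of the key space.

    #include <cstdint>
    #include <functional>
    #include <map>
    #include <string>

    // Hash ring where each node's slice of the key space is proportional
    // to a capacity weight: a node added with weight 2w claims roughly
    // twice the key space of a weight-w node.
    class WeightedRing {
        std::map<size_t, std::string> ring;  // hash point -> node id
        std::hash<std::string> hashOf;

    public:
        // 'weight' is a hypothetical capacity score (e.g. GB of RAM).
        void addNode(const std::string& node, unsigned weight) {
            for (unsigned i = 0; i < weight; ++i)
                ring[hashOf(node + "#" + std::to_string(i))] = node;
        }

        void removeNode(const std::string& node, unsigned weight) {
            for (unsigned i = 0; i < weight; ++i)
                ring.erase(hashOf(node + "#" + std::to_string(i)));
        }

        // Clockwise walk: the first virtual point at or after the key's
        // hash owns the key. Assumes at least one node has been added.
        const std::string& ownerOf(const std::string& key) const {
            auto it = ring.lower_bound(hashOf(key));
            if (it == ring.end())
                it = ring.begin();  // wrap around the ring
            return it->second;
        }
    };

Removing or re-weighting a node only disturbs the keys adjacent to its own virtual points, which is the usual consistent-hashing win.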

Sharding

...

  • Likely want to shard RAMCloud into

...

  • chunks
    • Useful for heterogeneity: keep chunk size <= the smallest node's memory capacity, and make bigger servers responsible for more chunks
  • Variable vs. static chunk sizes
    • Variable sizing complicates mapping addresses to chunks (see the sketch below), but:
      • permits squeezing an address range to break apart hot data
      • hot spots may grow increasingly unlikely with scale, but should we consider the low end?
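
A hypothetical sketch of the mapping trade-off, assuming addresses are plain 64-bit offsets (staticChunkOf and VariableChunks are invented names): static sizing maps an address to its chunk with one division, while variable sizing keeps boundaries in a sorted map, paying an O(log n) lookup in exchange for being able to split a hot chunk's range in place.

    #include <cstdint>
    #include <map>

    // Static sizing: address -> chunk is one division, O(1).
    inline uint64_t staticChunkOf(uint64_t addr, uint64_t chunkSize) {
        return addr / chunkSize;
    }

    // Variable sizing: boundaries live in a sorted map (start address ->
    // chunk id), so lookups cost O(log n), but a hot chunk can be split.
    struct VariableChunks {
        std::map<uint64_t, uint64_t> startToChunk;
        uint64_t nextId = 0;

        VariableChunks() { startToChunk[0] = nextId++; }  // one chunk covers everything

        uint64_t chunkOf(uint64_t addr) const {
            auto it = startToChunk.upper_bound(addr);
            --it;  // safe: a chunk always starts at address 0
            return it->second;
        }

        // Insert a new boundary at 'addr': everything from there to the
        // next existing boundary becomes its own chunk.
        void splitAt(uint64_t addr) {
            startToChunk[addr] = nextId++;
        }
    };

splitAt is the "squeezing" from the bullet above: carving a hot address range into its own chunk means it can then be moved to a faster or less loaded node.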

Virtualisation Interplay

  • Is it reasonable to expect to run within a virtualised environment?
    • could imply much greater dynamism than we might be anticipating
      • high churn from nodes joining/leaving the DHT, with lots of resultant swap in/out to maintain availability
    • could also imply larger number of nodes than we expect, e.g.
      • let a hypervisor worry about multiprocessors
  • VMs may incur significant latency penalties (though these can be mitigated with PCI device pass-through, core pinning, etc.)
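
As one concrete example of the mitigations named above, a Linux-specific sketch of core pinning (pinToCore is an invented helper; pthread_setaffinity_np is the real glibc call): binding a thread to a single core avoids cross-core migrations, a common source of latency jitter whether or not a hypervisor is involved.

    #include <pthread.h>  // pthread_setaffinity_np (compile with -pthread)
    #include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET
    #include <cstdio>

    // Pin the calling thread to a single core; returns true on success.
    bool pinToCore(int core) {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
    }

    int main() {
        if (pinToCore(0))
            std::printf("pinned to core 0\n");
        return 0;
    }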