...
- Machines will be in service long enough that new additions will be faster, have more memory, or otherwise differ in important characteristics
- Need to be sure heterogeneity is taken into account in distribution
- e.g.: using DHTs and consistent hashing we can vary the key space a node is responsible for in proportion to its capabilities (see the sketch after this list)
- Is purposeful heterogeneity useful?
- Lots of big, dumb, efficient nodes with lots of RAM coupled with some smaller, very fast, expensive nodes for offloading serious hotspots?
- Or will high throughput/low latencies save us?
- Alternative is to shrink the responsibility of an overloaded node so it concentrates on the hottest data
- Perhaps there's a performance, energy, or cost win to being specifically heterogeneous?
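A minimal sketch of capability-weighted consistent hashing, assuming a simple hash ring: each node gets a number of virtual points proportional to its capacity, so a bigger machine owns a proportionally larger slice of the key space. The `Ring` type, the weights, and the node names are illustrative assumptions, not RAMCloud's actual interface.

```go
package main

import (
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"sort"
)

// Ring is a consistent-hash ring with capability-weighted virtual nodes.
type Ring struct {
	points []uint64          // sorted hash points on the ring
	owner  map[uint64]string // hash point -> node name
}

func hash(s string) uint64 {
	h := sha1.Sum([]byte(s))
	return binary.BigEndian.Uint64(h[:8])
}

// AddNode places `weight` virtual points on the ring; weight should be
// proportional to the node's capacity (RAM, CPU, ...), so more capable
// nodes become responsible for a proportionally larger key range.
func (r *Ring) AddNode(name string, weight int) {
	if r.owner == nil {
		r.owner = make(map[uint64]string)
	}
	for i := 0; i < weight; i++ {
		p := hash(fmt.Sprintf("%s#%d", name, i))
		r.points = append(r.points, p)
		r.owner[p] = name
	}
	sort.Slice(r.points, func(a, b int) bool { return r.points[a] < r.points[b] })
}

// Lookup returns the node responsible for key: the owner of the first
// ring point at or clockwise past the key's hash.
func (r *Ring) Lookup(key string) string {
	h := hash(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.points[i]]
}

func main() {
	var r Ring
	r.AddNode("big-node", 160) // e.g. 4x the capacity of small-node
	r.AddNode("small-node", 40)
	fmt.Println(r.Lookup("object-42"))
}
```

Shrinking the responsibility of an overloaded node, as suggested above, would then amount to removing some of its virtual points from the ring.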
Sharding
...
- Likely want to shard RAMCloud into
...
- chunks
- Useful for heterogeneity: keep chunk size <= the smallest node's memory capacity, and make bigger servers responsible for more chunks
- Variable vs. static chunk sizes
- Variable complicates mapping addresses to chunks, but:
- permits squeezing an address range to break apart hot data (see the sketch after this list)
- hot spots may grow increasingly unlikely with scale, but should we consider the low end?
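One way the variable-size mapping could work, sketched below under assumed names (`ChunkTable`, `Split`, and the owner labels are hypothetical): a table of chunk start addresses sorted so a binary search resolves any address, plus a split operation that squeezes a new boundary into a hot range so the hot portion can be handed to another server.

```go
package main

import (
	"fmt"
	"sort"
)

// Chunk covers the address range [Start, next chunk's Start).
type Chunk struct {
	Start uint64 // first address covered by this chunk
	Owner string // server responsible for it
}

// ChunkTable maps addresses to variable-size chunks via a sorted table.
type ChunkTable struct {
	chunks []Chunk // sorted by Start; assumes a chunk starting at 0 exists
}

// Lookup finds the chunk covering addr: the last chunk whose Start <= addr.
func (t *ChunkTable) Lookup(addr uint64) Chunk {
	i := sort.Search(len(t.chunks), func(i int) bool { return t.chunks[i].Start > addr })
	return t.chunks[i-1]
}

// Split introduces a new chunk boundary at addr, assigning the upper part
// of the enclosing chunk to newOwner -- squeezing an address range to
// break apart hot data.
func (t *ChunkTable) Split(addr uint64, newOwner string) {
	i := sort.Search(len(t.chunks), func(i int) bool { return t.chunks[i].Start > addr })
	t.chunks = append(t.chunks, Chunk{}) // grow by one
	copy(t.chunks[i+1:], t.chunks[i:])   // shift the tail right
	t.chunks[i] = Chunk{Start: addr, Owner: newOwner}
}

func main() {
	t := ChunkTable{chunks: []Chunk{{0, "A"}, {1 << 20, "B"}}}
	t.Split(1<<19, "fast-node")          // carve a hot range out of A's chunk
	fmt.Println(t.Lookup((1 << 19) + 5)) // now owned by "fast-node"
}
```

The cost relative to static chunks is exactly the complication noted above: every address lookup goes through this table rather than simple address arithmetic.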
Virtualisation Interplay
- Is it reasonable to expect to run within a virtualised environment?
- could imply much greater dynamism than we might be anticipating
- high churn in joining/leaving DHT, lots of resultant swap in/out to maintain availability
- could also imply larger number of nodes than we expect, e.g.
- let a hypervisor worry about multiprocessors
- VMs may have significant latency penalties (though these can be mitigated with PCI device pass-through, core pinning, etc.)