...

  • Static Scalability
    • New installations may be created in many sizes: 1 machine, 10k machines, etc.
  • Dynamic Scalability
    • Existing installations must permit expansion - both incremental and explosive
      • Need to scale up as quickly as the user requires - may be orders of magnitude in a few days
        • (Orran Krieger's Forum presentation - EC2 customer example)
      • Scaling down may be as important as scaling up
        • server consolidation may be important
          • regular: reduce active nodes during off-peak times (assuming we can maintain the in-memory dataset)
          • irregular: data center resources may be re-provisioned (to cut costs, handle reduced popularity, RAMCloud 2.0 is just too efficient, etc)

Addressing

  • 10,000 nodes × 128 GB each ≈ 1.25 PB of storage
    • 1 PB = 2^50 bytes
    • Assuming average object size is 128 bytes:
      • 2^50 / 2^7 = 2^43 objects
      • => need at least 43 bits of address space (rounds up to 64-bit)
      • May want a much larger key space, though:
        • if keys are random: want a lower probability of collision, better distribution, etc
        • if keys are structured: need bits for user, table, access rights, future/undefined fields, etc
      • So we probably want 128-bit addressing, at a minimum (see the sketch below).
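
A quick sanity check of the arithmetic above, as a sketch in Python; the 2^50-byte total and 128-byte average object size are the assumptions stated in this section, not fixed design parameters:

    import math

    # Assumptions from the notes above, not fixed design parameters.
    total_bytes = 2**50        # ~10,000 nodes x 128 GB each
    avg_object_size = 2**7     # assumed 128-byte average object

    total_objects = total_bytes // avg_object_size    # 2^43 objects
    min_bits = math.ceil(math.log2(total_objects))
    print(f"objects: 2^{int(math.log2(total_objects))}, "
          f"minimum address bits: {min_bits}")        # -> 2^43, 43 bits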

Heterogeneity

  • Machines will stay in service long enough that new additions will be faster, have more memory, or otherwise differ in important characteristics
  • Need to be sure heterogeneity is taken into account in distribution
    • e.g.: with DHTs and consistent hashing we can vary the key space a node is responsible for in proportion to its capabilities (see the sketch after this list)
  • Is purposeful heterogeneity useful?
    • Lots of big, dumb, efficient nodes with lots of RAM coupled with some smaller, very fast, expensive nodes for offloading serious hotspots?
      • Or will high throughput/low latencies save us?
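
A minimal sketch of capacity-proportional consistent hashing, assuming each node's relative capacity (say, GB of RAM) sets how many virtual points it places on the hash ring; the node names and weights are illustrative, not taken from these notes:

    import bisect
    import hashlib

    def h(s: str) -> int:
        """64-bit hash point for a string."""
        return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

    class WeightedRing:
        """Consistent-hash ring; a node's share of the key space is
        roughly proportional to its capacity, via virtual nodes."""
        def __init__(self, capacities):
            # capacities: node name -> relative capacity (e.g. GB of RAM)
            self.points = sorted(
                (h(f"{node}#{i}"), node)
                for node, cap in capacities.items()
                for i in range(cap))
            self.hashes = [p for p, _ in self.points]

        def lookup(self, key: str) -> str:
            i = bisect.bisect(self.hashes, h(key)) % len(self.hashes)
            return self.points[i][1]

    # 'big' has 4x the RAM of 'fast', so it should own ~4x the keys.
    ring = WeightedRing({"big": 128, "fast": 32})
    counts = {"big": 0, "fast": 0}
    for k in range(10_000):
        counts[ring.lookup(f"obj{k}")] += 1
    print(counts)   # roughly a 4:1 split

Adjusting a node's virtual-node count as machines come and go would also give a knob for the incremental growth and consolidation discussed under Dynamic Scalability above.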

Sharding

  • Likely want to shard RAMCloud into chunks (see the sketch below)
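
One way to read "chunks", as an illustrative sketch only (the chunk count, key width, and placement rule below are assumptions, not decisions from these notes): split the key space into fixed-size chunks and make chunks, not whole nodes, the unit of placement and migration when scaling up or down.

    CHUNK_BITS = 20   # assumed: 2^20 chunks over a 64-bit key space

    def chunk_of(key: int) -> int:
        """Map a 64-bit key to its chunk via its high-order bits."""
        return key >> (64 - CHUNK_BITS)

    def node_of(chunk: int, num_nodes: int) -> int:
        """Placeholder chunk -> node placement; a real system would keep
        an explicit, rebalanceable chunk table instead of a modulus."""
        return chunk % num_nodes

    key = 0xDEADBEEFCAFEF00D
    print(chunk_of(key), node_of(chunk_of(key), 10_000))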

Virtualisation Interplay

  • Is it reasonable to expect to run within a virtualised environment?
    • could imply much greater dynamism than we might be anticipating
      • high churn of nodes joining/leaving the DHT, with lots of resultant swapping in/out to maintain availability
    • could also imply a larger number of nodes than we expect, e.g.
      • let a hypervisor worry about multiprocessors
  • VMs may have significant latency penalties (though these can be mitigated with PCI device pass-through, core pinning, etc)