...

  • Structured, Unstructured
  • Random, Hashes, Sequential
  • User-specified, generated
  • Need at least 2^48 capacity for objects
    • Hence, unstructured addresses probably need to be at least 2^64
    • 64 * 2^30 bytes/machine * 2^14 machines = 2^50 bytes; 2^50 bytes / (2^7 bytes/obj) = 2^43 objects
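
The capacity estimate above can be checked with a few lines of arithmetic. This is only a sketch of the back-of-envelope calculation; the constant names are made up for illustration, and the values are the ones quoted in the notes.

```python
# Back-of-envelope capacity estimate using the figures from the notes.
# All names here are illustrative, not part of any real system.
BYTES_PER_MACHINE = 64 * 2**30   # 64 * 2^30 bytes of storage per machine
MACHINES = 2**14                 # 2^14 machines
OBJECT_BYTES = 2**7              # 2^7 bytes per object

total_bytes = BYTES_PER_MACHINE * MACHINES
object_count = total_bytes // OBJECT_BYTES

print(total_bytes == 2**50)    # True
print(object_count == 2**43)   # True
```

With 2^43 objects at this object size, a 2^48 object capacity leaves a factor of 32 of headroom for smaller objects or a larger cluster.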

Sequential and Structured

...

  • Smaller ids (64-bit?)
    • Not if we want these to look like capabilities
  • Simple to make generation fast
  • Not meaningful to client (both a plus and minus)
  • Indexing must be done by clients and stored in the cloud
    • Akin to FriendFeed's setup
  • Content-based
    • Can't share objects without references
    • Less general
    • Potential vulns if hashes have weaknesses (good 128-bit hashes?)
    • Built-in de-duplication
      • Which also opens a storage channel under multi-tenancy
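
To make the content-based option concrete, here is a minimal sketch of a store whose ids are derived from object contents. The class `ContentStore`, the function `content_id`, and the choice of SHA-256 truncated to 128 bits are all assumptions for illustration; the notes do not name a hash function.

```python
import hashlib


def content_id(data: bytes) -> bytes:
    """Derive a 128-bit id from the object's contents.

    SHA-256 truncated to 16 bytes is an illustrative choice only.
    """
    return hashlib.sha256(data).digest()[:16]


class ContentStore:
    """Toy in-memory store addressed by content hash (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes):
        """Store data and return (id, already_present)."""
        oid = content_id(data)
        already_present = oid in self._objects
        if not already_present:
            self._objects[oid] = data   # identical content is stored once
        return oid, already_present

    def get(self, oid: bytes) -> bytes:
        return self._objects[oid]


store = ContentStore()
id_a, dup_a = store.put(b"hello")
id_b, dup_b = store.put(b"hello")    # same bytes, same id, no second copy
print(id_a == id_b, dup_a, dup_b)    # True False True
```

The `already_present` flag shows where the storage channel comes from: if one tenant can observe whether a write was de-duplicated, it learns that some other tenant already stored the same bytes.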

...

  • How much metadata space is needed for all tables/applications?
    • Object Level
      • Up to 2^43 objects, size (4 bytes?), permissions or appid, tableid if not in address (4 bytes)
      • (2^43) * 8 = 64 TB fully loaded (6.25% of capacity), not including the index size
  • How does metadata replication occur and what is the frequency?
    • On writes for object metadata
    • Shard Mappings
      • Lazily
      • Not sufficient when a client discovers a host is down
        • At a minimum, mappings on the new replicas must be updated very quickly
      • May additionally want leases, heartbeats, or something similar (as in MapReduce) to ensure enough copies of each shard are maintained after a failure, even if the data is cold
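
The object-level metadata estimate above can be reproduced as follows. This is a sketch under the assumptions stated in the notes: 2^43 objects, 2^50 bytes of total capacity, and 8 bytes of metadata per object (the per-object figure implied by the 64 TB total). The constant names are illustrative.

```python
# Object-level metadata sizing, excluding the index.
OBJECT_COUNT = 2**43
METADATA_BYTES_PER_OBJECT = 8      # assumed per-object metadata size
TOTAL_CAPACITY_BYTES = 2**50       # 64 * 2^30 bytes/machine * 2^14 machines

metadata_bytes = OBJECT_COUNT * METADATA_BYTES_PER_OBJECT
metadata_tb = metadata_bytes / 2**40
overhead = metadata_bytes / TOTAL_CAPACITY_BYTES

print(metadata_tb)           # 64.0
print(f"{overhead:.2%}")     # 6.25%
```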

Approaches

addr mod servers
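
As a rough illustration of this approach, the sketch below maps an address to a server by taking it modulo the server count, then counts how many addresses change servers when one server is added. The function name `server_for` and the sample address range are assumptions made for the example.

```python
def server_for(addr: int, num_servers: int) -> int:
    """Pick a server index for an address by simple modulo placement."""
    return addr % num_servers


addrs = range(1000)                      # sample addresses 0..999
before = [server_for(a, 4) for a in addrs]
after = [server_for(a, 5) for a in addrs]

# Count addresses whose server changes when growing from 4 to 5 servers.
moved = sum(b != a for b, a in zip(before, after))
print(moved / len(addrs))                # 0.8
```

Lookup is a single arithmetic operation with no mapping table, but in this sample most addresses move when the server count changes.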

...