...
application (16 bit) | table (16 bit) | address (64 bit) |
- Sequential and Structured
- Temporal id locality
- Allocation could be tricky to make fast
- How many tables does a typical (or large) RDBMS have?
- How many applications do we expect to support in a single RAMCloud instance?
- How much metadata space is needed for all tables/applications?
- How does metadata replication occur and what is the frequency?
- Random
- Smaller ids (64-bit?)
- Simple to make fast
- Not if we want these to look like capabilities
- Not meaningful to client (both a plus and minus)
- Indexing must be done by clients and stored in the cloud
- Akin to FriendFeed's setup
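A minimal sketch of the two id schemes above. It assumes the structured layout is packed most-significant field first (application, then table, then address, 96 bits total) and that random ids are 64 bits; the names pack_id, unpack_id, and random_id are hypothetical, not part of any existing API.

```python
import secrets

# Field widths taken from the structured layout in the notes.
APP_BITS, TABLE_BITS, ADDR_BITS = 16, 16, 64


def pack_id(app, table, address):
    """Pack (application, table, address) into one 96-bit integer."""
    fields = ((app, APP_BITS), (table, TABLE_BITS), (address, ADDR_BITS))
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    # Assumption: application occupies the most significant bits.
    return (app << (TABLE_BITS + ADDR_BITS)) | (table << ADDR_BITS) | address


def unpack_id(packed):
    """Recover (application, table, address) from a packed id."""
    address = packed & ((1 << ADDR_BITS) - 1)
    table = (packed >> ADDR_BITS) & ((1 << TABLE_BITS) - 1)
    app = packed >> (TABLE_BITS + ADDR_BITS)
    return app, table, address


def random_id():
    """Draw an unstructured 64-bit id; it carries no meaning for the client."""
    return secrets.randbits(64)
```

With the structured form, a server can read the application and table straight out of the id, which is what makes address-based access control possible; with the random form, any such grouping has to live in client-maintained indexes.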
Approaches
Mapping
Tradeoff: Capacity, Throughput (via parallel requests) vs Latency (lookup time)
...
Ideal: 0 network messages and O(1) address to host mapping time with high probability
Implies all clients are aware of mapping.
Complication: access-control requires highly-consistent mapping replication if control is on addresses (e.g. the application/table is part of the structured address).
Objects may need to relocate due to failures, load, or capacity.
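One way to picture the ideal case is a mapping that every client holds locally. The sketch below is only an illustration under assumptions: ids are hashed into a fixed number of shards, each shard is assigned to one host, and relocation rewrites a single shard entry. The class ClientMap and its methods are hypothetical, and the sketch ignores how updates would be replicated consistently to all clients.

```python
import hashlib


class ClientMap:
    """Client-local id -> host mapping: no network messages per lookup."""

    def __init__(self, hosts, num_shards):
        self.num_shards = num_shards
        # Assumption: shards start out spread round-robin over hosts.
        self.shard_to_host = [hosts[i % len(hosts)] for i in range(num_shards)]

    def shard_of(self, object_id):
        # Hash the id (up to 96 bits, i.e. 12 bytes) to pick a shard.
        digest = hashlib.sha256(object_id.to_bytes(12, "big")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_shards

    def host_of(self, object_id):
        # One hash plus one list index: constant time.
        return self.shard_to_host[self.shard_of(object_id)]

    def relocate(self, shard, new_host):
        # Moving a shard (failure, load, capacity) changes one entry.
        self.shard_to_host[shard] = new_host


m = ClientMap(["host-a", "host-b"], num_shards=8)
before = m.host_of(42)
m.relocate(m.shard_of(42), "host-c")
after = m.host_of(42)
```

The cost is that every relocate call has to reach every client before the old host stops serving the shard, which is the consistency complication noted above.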
DHT
- + Simple
- + Natural replication
- - Latency
- Address to shard mapping takes O(log(# shards)) time in general
- Can be mitigated, at the cost of index space, using radix trees or tries
- How many levels is too deep? Even 2-3 in the face of cache misses?
- + Load sharing
- - More difficult to co-locate related data on a single machine
- We probably want to distribute related data intentionally (more network overhead, but lower latency because lookups proceed in parallel on independent machines)
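The lookup-time versus index-space tradeoff in the latency bullets can be sketched as follows. This assumes 64-bit keys and shards that own contiguous key ranges; RangeIndex, build_prefix_table, and flat_lookup are hypothetical names. The flat table is a one-level radix index and is only correct when every shard boundary is aligned to the chosen prefix width.

```python
import bisect

ID_BITS = 64


class RangeIndex:
    """Sorted shard start keys; lookup is a binary search, O(log(# shards))."""

    def __init__(self, starts, hosts):
        # starts[i] is the first key owned by hosts[i]; starts[0] must be 0.
        assert starts[0] == 0 and starts == sorted(starts)
        self.starts = starts
        self.hosts = hosts

    def lookup(self, key):
        return self.hosts[bisect.bisect_right(self.starts, key) - 1]


def build_prefix_table(index, prefix_bits):
    """Expand the range index into a flat table with 2**prefix_bits entries.

    Assumption: shard boundaries fall on multiples of
    2**(ID_BITS - prefix_bits).
    """
    shift = ID_BITS - prefix_bits
    return [index.lookup(p << shift) for p in range(1 << prefix_bits)]


def flat_lookup(table, key, prefix_bits):
    # One shift and one array access: O(1), paid for with table space.
    return table[key >> (ID_BITS - prefix_bits)]


index = RangeIndex([0, 1 << 62, 1 << 63], ["h0", "h1", "h2"])
table = build_prefix_table(index, prefix_bits=2)
```

Here the binary search touches about log2(3) entries while the flat table holds 4; with many shards and finer boundaries the table grows as 2**prefix_bits, and a multi-level radix tree sits between the two extremes, at the price of one potential cache miss per level.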
...