...
- Application chooses values for keys; if keys need to be unique, it's up to the application to figure out how to do this.
- These longer keys must be passed around in requests.
- The master needs a hash table.
- There is some feature overlap with indexes. Now you need an extremely fast "primary index" on the master to look up objects.
- If this is what the application wants (e.g., I'm creating a table that will always be indexed by a URL), then this is more efficient than the system-allocated options, where a secondary index would be needed for virtually every lookup.
- BigTable and memcached do it
- Painful for allocating sequential keys
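A minimal sketch of the string-key option above (hypothetical names, not a real API): the application composes unique keys itself, here by prefixing the URL, and the master's hash table effectively serves as a primary index consulted on every request.

```python
# Sketch: user-allocated variable-length string keys, memcache-style.
# The application is responsible for uniqueness (e.g. URL namespacing).

class Master:
    def __init__(self):
        # Every read/write goes through this hash table: in effect a
        # very fast "primary index" living on the master.
        self.index = {}

    def write(self, key: str, value: bytes):
        self.index[key] = value

    def read(self, key: str):
        return self.index.get(key)

master = Master()
# The application constructs the (long) unique key itself, and this
# full string is what gets passed around in requests.
key = "page:" + "http://example.com/a"
master.write(key, b"contents")
assert master.read(key) == b"contents"
```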
User-allocated 64-bit keys
- I think there's a consensus that this is not a large enough space to be used as a reasonable hash value, so the only real use case is ICQ.
what does "ICQ" mean? -JO
ICQ is an ancient chat program that identified users by a large integer. See http://en.wikipedia.org/wiki/ICQ#UIN . -Diego
- Painful for allocating sequential keys
- Nice for migrating legacy apps
- Decent memcache emulation
- Two levels of hash collisions when building a hash table (the application's hash from its real key into 64 bits, plus the master's own bucket hash)
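The two collision levels can be sketched as follows (illustrative code under my reading of the note): the application hashes its real key into the 64-bit key space and must deal with collisions there itself, while the master independently hashes the 64-bit key into a bucket and resolves its own collisions by chaining.

```python
import hashlib

def app_hash(real_key: str) -> int:
    # Level 1: the application squeezes its real (string) key into
    # 64 bits. Distinct strings can collide here, and the 64-bit space
    # is arguably too small for this to be safe -- which is the
    # objection raised in the note above.
    digest = hashlib.sha256(real_key.encode()).digest()
    return int.from_bytes(digest[:8], "big")

class BucketTable:
    # Level 2: the master hashes the 64-bit key into a bucket and
    # handles its own collisions by chaining.
    def __init__(self, nbuckets: int = 1024):
        self.buckets = [[] for _ in range(nbuckets)]

    def put(self, key64: int, value):
        chain = self.buckets[key64 % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key64:
                chain[i] = (key64, value)
                return
        chain.append((key64, value))

    def get(self, key64: int):
        for k, v in self.buckets[key64 % len(self.buckets)]:
            if k == key64:
                return v
        return None

table = BucketTable()
k = app_hash("http://example.com/a")
table.put(k, b"contents")
assert table.get(k) == b"contents"
```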
User-allocated 64-bit keys with system key generation assistance
- Winner
- Nice for legacy migration
- Fast memcache emulation
- Two levels of hash collisions when building a hash table (the application's hash from its real key into 64 bits, plus the master's own bucket hash)
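One way the "system key generation assistance" could look (a sketch under my own assumptions, not the actual design): the table keeps a server-side sequence for applications that just want a fresh key, while legacy memcache-style applications can keep supplying their own 64-bit keys.

```python
import itertools

class Table:
    """Hypothetical API: user-allocated 64-bit keys, with system help
    for the painful common case of sequential allocation."""

    def __init__(self):
        self.objects = {}
        self._next = itertools.count(1)  # server-side sequence

    def allocate_key(self) -> int:
        # System assistance: hand out the next unused sequential key so
        # applications need no coordination of their own.
        while True:
            k = next(self._next)
            if k not in self.objects:
                return k

    def write(self, key64: int, value):
        # Legacy path: the application chose key64 itself (e.g. a hash
        # of its memcache key).
        self.objects[key64] = value

t = Table()
t.write(0xDEADBEEF, b"migrated")  # app-chosen key (legacy migration)
k = t.allocate_key()              # system-assisted sequential key
t.write(k, b"new object")
assert k == 1
```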
System-allocated 64-bit keys with reuse
...
- This has "sweet" semantics.
- Key values increase strictly with time.
- "Append" operations are simple: just create a new object. The order of events can be determined by comparing key values.
- 64-bit keys mean we don't have to worry about running out of the key space and wrapping around.
- If keys are never deleted in a table then it's easy to enumerate the table's objects in chronological order. However, if keys are deleted the enumeration is not so easy: the obvious approach of stepping through each of the key values in order may spend a lot of time looking for objects that no longer exist.
- Berthas would be the data structure of choice for tables that never have deletes. With deletes, the alternatives are not so easy:
- Berthas with leaves collapsed to hash tables when they become sparse
- Older objects that have many nearby deletes now incur a performance penalty.
- Inner nodes of the Bertha still grow over time.
- This option feels complicated because it combines a couple of different modes of operation.
- Berthas with leaves reused as they become sparse
I don't understand how this option works -JO
This one would be a little tricky to get right, and it's easier to explain with a diagram. Say there's a leaf of 8 entries that's only using entries 1,2, and 6 for keys 1201, 1202, and 1206. I can reuse this leaf starting at 2700 for 2700, 2703, 2704, 2705, and 2707 if I burn the keys 2701, 2702, and 2706. -Diego
- Now a leaf can have multiple pointers coming into it
- Some fraction of keys will have to be skipped over and not used
- Splitting a key range to another server now requires unmerging the leaf.
- Inner nodes of the Bertha still grow over time.
- Hash tables
- Looks like memory
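Diego's leaf-reuse example above can be made concrete with a small sketch (illustrative only, not the real Bertha code): a leaf of 8 slots whose old range left slots 1, 2, and 6 occupied gets a second base address, and the keys that land on occupied slots are burned.

```python
LEAF_SIZE = 8

class Leaf:
    def __init__(self):
        self.slots = [None] * LEAF_SIZE  # slot i holds the object for base+i
        self.bases = []                  # every key range mapped onto this leaf

    def free_slots(self):
        return [i for i, s in enumerate(self.slots) if s is None]

def reuse_leaf(leaf, new_base):
    """Map a second key range [new_base, new_base+LEAF_SIZE) onto a
    sparse leaf. Keys whose slot is still occupied by the old range
    are 'burned': skipped over and never handed out."""
    leaf.bases.append(new_base)  # the leaf now has multiple pointers into it
    burned = [new_base + i for i, s in enumerate(leaf.slots) if s is not None]
    usable = [new_base + i for i in leaf.free_slots()]
    return usable, burned

# Diego's example: entries 1, 2 and 6 hold keys 1201, 1202 and 1206.
leaf = Leaf()
for old_key in (1201, 1202, 1206):
    leaf.slots[old_key - 1200] = ("obj", old_key)

usable, burned = reuse_leaf(leaf, 2700)
assert usable == [2700, 2703, 2704, 2705, 2707]
assert burned == [2701, 2702, 2706]
```

Note how the sketch also exhibits the two costs listed above: `bases` accumulates multiple pointers into one leaf, and the `burned` keys are the fraction of the key space that must be skipped.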
System-allocated (64-bit key, 64-bit generation number) tuples
...