Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 5 Next »

What should primary keys be?

The Proposed Server API page asks "What do we want keys to be?".
This page collects and evaluates the options.

User-allocated variable-length or long fixed-length keys

  • These longer keys must be passed around in requests.
  • The master needs a hash table.
  • There is some feature overlap with indexes. Now you need an extremely fast "primary index" on the master to look up objects.

User-allocated 64-bit keys

  • I think there's a consensus that this is not a large enough space to be used as a reasonable hash value, so the only real use case is ICQ.

System-allocated 64-bit keys with reuse

  • This is by far the easiest solution to implement. Berthas (multi-level indexes analogous to page tables) seem to be the data structure of choice.
  • Dangling pointers in the application are extremely dangerous. If your application gets back the same type of object as was originally identified by that key, it will not crash but rather it is likely to leak information.

System-allocated 64-bit keys without reuse

  • This has "sweet" semantics: assigned keys strictly increase with time.
  • We still don't have to worry about running out of the key space and wrapping around.
  • Berthas would be the data structure of choice for tables that never have deletes. With deletes, other options are not so easy:
    • Berthas with leaves collapsed to hash tables when they become sparse
      • Older objects that have many nearby deletes now incur a performance penalty.
      • Inner nodes of the Bertha still grow over time.
    • Berthas with leaves reused as they become sparse
      • Now a leaf can have multiple pointers coming into it
      • Some fraction of keys will have to be skipped over and not used
      • Splitting a key range to another server now requires unmerging the leaf.
      • Inner nodes of the Bertha still grow over time.
    • Hash tables

System-allocated (64-bit key, 64-bit generation number) tuples

System-allocated (48-bit key, 16-bit generation number) tuples

  • No labels