Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Corrected links that should have been relative instead of absolute.

...

One approach is to first take a write-lock on the table. Lookup the new username in the multi index, and if it exists, abort. If it does not exist, proceed to insert the employee into the table and the username into the multi index. Then release the write-lock. However, serializing write requests to the table limits my application's scalability.

Alternative approaches are prone to race conditions...

A server-side unique index allows my application to atomically insert the index entry only if the key does not already exist, overcoming race conditions without locks.

Claim: Given both multi and unique indexes, an application can

...

have all 4 types of

...

index mappings efficiently.

If every object must correspond to a key, the application should never write an object that doesn't correspond to a key. ..This is easy to enforce with assertions on the client.

Claim: It is useful to know at index creation whether an index will be unique or multi.

...A multi index is probably best implemented as a hash table or tree of key/list-of-OID pairs, while a unique index is probably best implemented as simply a hash table or tree of key/OID pairs.

Hypothesis: Range-queryable indexes are best implemented as trees, while other indexes are best implemented as hash tables.

..We seem to think hash tables are faster, since we use them for the primary indexes. If this turns out not to be the case for secondary indexes, all indexes will be range-queryable.

Server-side API

Unique Indexes:

Insert(key, oid) -> err on duplicate key
Remove(key, oid) -> err on nonexistent key/OID pair
Lookup(key) -> oid or err on nonexistent key

Multi Indexes:

Insert(key, oid)
Remove(key, oid) -> err on nonexistent key/OID pair
Lookup(key) -> list of OIDs

Range Queries (same for unique and multi)

RangeQuery(key start (optional), bool inclusive, key end (optional), bool inclusive, int limit, oid start_after_oid (optional)) -> list of key/OID pairs or list of OIDs only (sorted by keys, then OIDs), whether more data was available and not returned

Exploring Relaxed Consistency

If we're OK with indexes possibly containing "extra" key/OID pairs for short periods of time, we can get away with cheaper algorithms to deal with updating multiple unique indexes.

For example, suppose we want to insert an employee with a unique SSN and a unique username where the table and each index are on separate machines.

If all goes well:

  1. Client sends insert-object request to master
  2. Master sends insert-index-entry-if-unique request to SSN index.

...

Exploring Relaxed Consistency

...

  1. Assume this succeeds.
  2. Master sends insert-index-entry-if-unique request to username index. Assume this succeeds.
  3. Master writes object to table.
  4. Master returns OK to client.

If the second index entry can not be inserted:

  1. Client sends insert-object request to master
  2. Master sends insert-index-entry-if-unique request to SSN index. Assume this succeeds.
  3. Master sends insert-index-entry-if-unique request to username index. Assume this one is taken.
  4. Master sends delete-index-entry request to SSN index to roll back step 2.
  5. Master returns error to client.

Insert in the unique case throws an error if there's a concurrent transaction with the same key

Lookup and RangeQuery return candidate OIDs