These are mostly notes from the 2009-12-03 meeting.

Terminology

An index maps from keys to object identifiers (a.k.a. primary keys, but that's confusing in this context).

Note that this is distinct from the hash table that maps from OIDs to objects.

All 4 types of functions might appear in a RAMCloud application:

  1. Not injective, not surjective - Keys may correspond to more than one OID, and not every OID corresponds to a key.
  2. Not injective, surjective - Keys may correspond to more than one OID, but every OID corresponds to at least one key.
  3. Injective, not surjective - Every key corresponds to exactly one OID, but not every OID corresponds to a key.
  4. Injective, surjective (bijective; one-to-one) - Every key corresponds to exactly one OID, and every OID corresponds to exactly one key.

Multi indexes handle the not injective cases (1 and 2), while unique indexes handle the injective cases (3 and 4).
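As a rough illustration of the distinction, the two kinds of index could be modeled in memory as below. The class and method names (MultiIndex, UniqueIndex, insert, lookup) are hypothetical and are not taken from the RAMCloud API; this is only a sketch of the key-to-OID mappings described above.

```python
from collections import defaultdict


class MultiIndex:
    """Hypothetical multi index: a key may map to several OIDs (cases 1 and 2)."""

    def __init__(self):
        self._entries = defaultdict(set)

    def insert(self, key, oid):
        self._entries[key].add(oid)

    def lookup(self, key):
        # Return a copy so callers cannot mutate the index's internal state.
        return set(self._entries.get(key, ()))


class UniqueIndex:
    """Hypothetical unique index: a key maps to at most one OID (cases 3 and 4)."""

    def __init__(self):
        self._entries = {}

    def insert(self, key, oid):
        # Refuse the insert if the key is already present.
        if key in self._entries:
            return False
        self._entries[key] = oid
        return True

    def lookup(self, key):
        return self._entries.get(key)
```

Whether every OID appears in the index (the surjective cases, 2 and 4) is not enforced by either structure; it depends on what the application chooses to write.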

Requirements

Claim: Though multi indexes are more general, server-side unique indexes are also necessary.

Consider everyone's favorite example of an employees table in which no two employees have the same employee ID, SSN, or username. I want to efficiently maintain this invariant in my application while inserting a new employee (and let's suppose other employees might be updated concurrently).

One approach is to first take a write-lock on the table. Look up the new username in the multi index, and if it exists, abort. If it does not exist, proceed to insert the employee into the table and the username into the multi index. Then release the write-lock. However, serializing write requests to the table limits my application's scalability.
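The lock-based approach might look roughly like this. It is a single-process sketch: threading.Lock stands in for the table write-lock, and the names table_lock, employees, username_index, and insert_employee are assumptions made for illustration only.

```python
import threading
from collections import defaultdict

table_lock = threading.Lock()        # stands in for the table write-lock
employees = {}                       # OID -> employee record
username_index = defaultdict(set)    # multi index: username -> set of OIDs


def insert_employee(oid, record):
    """Insert a record only if its username is unused; returns True on success."""
    with table_lock:                 # every writer serializes here
        if username_index.get(record["username"]):
            return False             # username already exists: abort
        employees[oid] = record
        username_index[record["username"]].add(oid)
        return True
```

The invariant holds, but only because every writer to the table waits on the same lock, which is the scalability cost noted above.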

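Alternative approaches are prone to race conditions. For example, if two clients look up the same username without holding a lock, both may find it absent and both may then insert it.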

A server-side unique index allows my application to atomically insert the index entry only if the key does not already exist, overcoming race conditions without locks.
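A minimal sketch of the insert-if-absent operation, assuming a hypothetical UniqueIndexServer class. The mutex here is internal to the simulated server and is held only for a single index operation; it stands in for whatever mechanism a real server would use to make the operation atomic, and the application never takes a table-wide lock.

```python
import threading


class UniqueIndexServer:
    """Hypothetical server-side unique index with an atomic conditional insert."""

    def __init__(self):
        self._entries = {}
        self._mutex = threading.Lock()   # server-internal, per-operation only

    def insert_if_absent(self, key, oid):
        """Atomically add key -> oid unless key exists; True if inserted."""
        with self._mutex:
            if key in self._entries:
                return False
            self._entries[key] = oid
            return True

    def lookup(self, key):
        with self._mutex:
            return self._entries.get(key)


def race(index, key, n_clients):
    """Have n_clients concurrently try to claim the same key."""
    results = [None] * n_clients

    def client(i):
        results[i] = index.insert_if_absent(key, oid=i)

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

However the clients interleave, exactly one call to insert_if_absent for a given key returns True, so the losing clients know to abort their insert.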

Claim: Given both multi and unique indexes, an application can have all 4 types of index mappings efficiently.

If every object must correspond to a key, the application should never write an object that doesn't correspond to a key. This is easy to enforce with assertions on the client.
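One way such a client-side check could look is sketched below. The helper checked_write is hypothetical, and plain dicts stand in for the table and the unique index.

```python
def checked_write(table, index, oid, record, key_field):
    """Write a record only if it carries the indexed key (client-side check)."""
    key = record.get(key_field)
    # Enforce the surjective case: no object is written without a key.
    assert key is not None, "every object must correspond to a key"
    # setdefault returns the existing OID if the key is already indexed.
    if index.setdefault(key, oid) != oid:
        raise KeyError(f"{key_field} {key!r} is already indexed")
    table[oid] = record
```

With this wrapper as the only write path, every OID in the table has an index entry, which gives case 4 on a unique index. The same assertion in front of a multi index would give case 2.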

Claim: It is useful to know at index creation whether an index will be unique or multi.

...

Hypothesis: Range-queryable indexes are best implemented as trees, while other indexes are best implemented as hash tables.
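To make the trade-off concrete, here is a small sketch under stated assumptions. Python's standard library has no balanced tree, so a sorted list searched with bisect stands in for an ordered (tree-like) structure, and a dict stands in for the hash table. OrderedIndex and HashIndex are hypothetical names.

```python
import bisect


class OrderedIndex:
    """Keeps (key, oid) pairs sorted, so a range query is a contiguous slice."""

    def __init__(self):
        self._pairs = []

    def insert(self, key, oid):
        bisect.insort(self._pairs, (key, oid))

    def range(self, lo, hi):
        """Return OIDs whose key satisfies lo <= key < hi, in key order."""
        # (lo,) sorts before every (lo, oid), so this finds the first key >= lo.
        start = bisect.bisect_left(self._pairs, (lo,))
        end = bisect.bisect_left(self._pairs, (hi,))
        return [oid for _, oid in self._pairs[start:end]]


class HashIndex:
    """Point lookups only; a range query would have to scan every key."""

    def __init__(self):
        self._entries = {}

    def insert(self, key, oid):
        self._entries.setdefault(key, []).append(oid)

    def lookup(self, key):
        return list(self._entries.get(key, []))
```

The ordered structure locates both ends of a range by binary search, while the hash table offers constant-time expected point lookups but no ordering to exploit.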

...

Server-side API

...

Exploring Relaxed Consistency

...
