Index API
These are mostly notes from the 2009-12-03 mtg.
Terminology
An index maps from keys to object identifiers (a.k.a. primary keys, but that's confusing in this context).
Note that this is distinct from the hash table that maps from OIDs to objects.
All 4 types of functions might appear in a RAMCloud application:
- Not injective, not surjective - Keys may correspond to more than one OID, and not every OID corresponds to a key.
- Not injective, surjective - Keys may correspond to more than one OID, but every OID corresponds to at least one key.
- Injective, not surjective - Every key corresponds to exactly one OID, but not every OID corresponds to a key.
- Injective, surjective (bijective; one-to-one) - Every key corresponds to exactly one OID, and every OID corresponds to exactly one key.
Multi indexes handle the not injective cases (1 and 2), while unique indexes handle the injective cases (3 and 4).
Requirements
Claim: Though multi indexes are more general, server-side unique indexes are also necessary
Consider everyone's favorite example of an employees table in which no two employees have the same employee ID, SSN, or username. I want to efficiently maintain this invariant in my application while inserting a new employee (and let's suppose other employees might be updated concurrently)
One approach is to first take a write-lock on the table. Lookup the new username in the multi index, and if it exists, abort. If it does not exist, proceed to insert the employee into the table and the username into the multi index. Then release the write-lock. However, serializing write requests to the table limits my application's scalability.
Alternative approaches are prone to race conditions.
A server-side unique index allows my application to atomically insert the index entry only if the key does not already exist, overcoming race conditions without locks.
Claim: Given both multi and unique indexes, an application can have all 4 types of index mappings efficiently.
If every object must correspond to a key, the application should never write an object that doesn't correspond to a key. This is easy to enforce with assertions on the client.
Claim: It is useful to know at index creation whether an index will be unique or multi.
A multi index is probably best implemented as a hash table or tree of key/list-of-OID pairs, while a unique index is probably best implemented as simply a hash table or tree of key/OID pairs.
Hypothesis: Range-queryable indexes are best implemented as trees, while other indexes are best implemented as hash tables.
We seem to think hash tables are faster, since we use them for the primary indexes. If this turns out not to be the case for secondary indexes, all indexes will be range-queryable.
Server-side API
Unique Indexes:
Insert(key, oid) -> err on duplicate key
Remove(key, oid) -> err on nonexistent key/OID pair
Lookup(key) -> oid or err on nonexistent key
Multi Indexes:
Insert(key, oid)
Remove(key, oid) -> err on nonexistent key/OID pair
Lookup(key) -> list of OIDs
Range Queries (same for unique and multi)
RangeQuery(key start (optional), bool inclusive, key end (optional), bool inclusive, int limit, oid start_after_oid (optional)) -> list of key/OID pairs or list of OIDs only (sorted by keys, then OIDs), whether more data was available and not returned
Exploring Relaxed Consistency
If we're OK with indexes possibly containing "extra" key/OID pairs for short periods of time, we can get away with cheaper algorithms to deal with updating multiple unique indexes.
For example, suppose we want to insert an employee with a unique SSN and a unique username where the table and each index are on separate machines.
If all goes well:
- Client sends insert-object request to master
- Master sends insert-index-entry-if-unique request to SSN index. Assume this succeeds.
- Master sends insert-index-entry-if-unique request to username index. Assume this succeeds.
- Master writes object to table.
- Master returns OK to client.
If the second index entry can not be inserted:
- Client sends insert-object request to master
- Master sends insert-index-entry-if-unique request to SSN index. Assume this succeeds.
- Master sends insert-index-entry-if-unique request to username index. Assume this one is taken.
- Master sends delete-index-entry request to SSN index to roll back step 2.
- Master returns error to client.
Insert in the unique case throws an error if there's a concurrent transaction with the same key
Lookup and RangeQuery return candidate OIDs