Ashish's Notes

1. Indexing Survey

2. Memory Allocation Design

We need to allocate memory for B+ tree nodes in the master server. A B+ tree is represented by nodes linked with pointers. The general structure of a B+ tree node is as follows:

struct node
{
    int key;
    int primary_key;     // only for leaf nodes
    node* left_child;    // pointer to the left child node
    node* right_child;   // pointer to the right child node
};

We can use malloc() to allocate memory for a node. However, we need to make sure memory cleaning is efficient. A better way is to allocate node memory through the existing RamCloud memory management, which is also responsible for cleaning. This gives the added advantage of a simple recovery mechanism in case of index server failures. One caveat: the current RamCloud memory management handles independent memory allocations and does not handle pointers.

We need to make sure that the node pointers stay consistent during cleaning, i.e. when a child is moved to a new location, the parent points to the updated child location. There are two approaches to handle this:

1) Parent Pointer: Having a parent pointer will enable us to reference the parent and change the child pointers during cleaning. However, this will have a space overhead and will require

2) Indirection: Another approach is to create a level of indirection between the parent's child pointers and the child nodes, i.e. instead of reaching a child node in a single step through a pointer, we go through a hash table to find the child's current location. This leads to an easier cleaning mechanism: when a node is moved, we only need to update the hash table entry for that node. We can use the existing hash table in the master server to store the index node information. The goal is to create a pseudo-table, the 'Index Table', which will not be accessible to clients but will be handled by the server. The table will be handled by RamCloud memory management like a usual table, so RamCloud remains responsible for its cleaning and recovery.

The new structure of a B+ tree node will be:

struct node
{
    int key;
    int primary_key;       // only for leaf nodes
    int left_child_key;    // index table key
    int right_child_key;   // index table key
};

The trade-off is that general tree traversal operations will now be more expensive, since each child access has to go through the hash table. The computation will increase by at most 2x (1us -> 2us).
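The indirection scheme can be sketched roughly as follows. This is only an illustration under assumptions, not RamCloud code: IndexTable, NodeKey, kNoChild, insert(), lookup() and relocate() are hypothetical names, a std::vector stands in for the log, and a std::unordered_map stands in for the master's hash table. The point it shows is that moving a node touches one hash table entry and leaves the parent untouched.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// All names below are hypothetical; none of this is RamCloud's actual API.
using NodeKey = std::uint64_t;     // key of a node in the 'Index Table'
constexpr NodeKey kNoChild = 0;    // assumed sentinel meaning "no child"

struct Node {
    int key;
    int primary_key;               // only meaningful for leaf nodes
    NodeKey left_child_key;        // index table key
    NodeKey right_child_key;       // index table key
};

// Stand-in for the master's hash table plus log-structured memory:
// maps a node key to the node's current slot in an append-only log.
class IndexTable {
public:
    // Appends the node to the log and returns its newly assigned key.
    NodeKey insert(const Node& n) {
        NodeKey k = next_key_++;
        log_.push_back(n);
        location_[k] = log_.size() - 1;
        return k;
    }

    // One hash table lookup per child access (the extra cost of indirection).
    std::optional<Node> lookup(NodeKey k) const {
        auto it = location_.find(k);
        if (it == location_.end()) return std::nullopt;
        return log_[it->second];
    }

    // Simulates the cleaner moving a node: the node is copied to a new
    // slot and only its hash table entry is updated. Parents keep the
    // same child key, so they need no change.
    bool relocate(NodeKey k) {
        auto it = location_.find(k);
        if (it == location_.end()) return false;
        Node copy = log_[it->second];
        log_.push_back(copy);
        it->second = log_.size() - 1;
        return true;
    }

private:
    std::vector<Node> log_;
    std::unordered_map<NodeKey, std::size_t> location_;
    NodeKey next_key_ = 1;         // 0 is reserved for kNoChild
};
```

For example, after inserting a child and a parent whose left_child_key is the child's key, calling relocate() on the child leaves the parent as it was, and a lookup through parent.left_child_key still returns the child.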

Issues:
1) Maintaining the Index Table during splits
2) Storing the root node key (for global access)

3. Indexing Schema Management

Table Manager:
The new table manager needs to store the index information.
Index: TableId + IndexId + IndexKey -> Index ServerId
Logical representation of index is a tree.
Internally, it is represented as an ordered list of siblings, where each sibling contains a StartKeyHash.
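
One possible way to resolve a key to an index server from the ordered sibling list is sketched below. It rests on an assumption the notes do not state: each sibling owns the key hashes from its StartKeyHash up to (but not including) the next sibling's StartKeyHash. IndexSiblings, ServerId, addSibling() and lookup() are hypothetical names, and one such list is assumed to exist per (TableId, IndexId) pair.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Hypothetical types; not taken from the RamCloud code base.
using ServerId = std::uint64_t;

// Ordered list of siblings for one (TableId, IndexId) pair.
// Key: StartKeyHash of a sibling. Value: the index server holding it.
class IndexSiblings {
public:
    // Registers (or replaces) the sibling that starts at startKeyHash.
    void addSibling(std::uint64_t startKeyHash, ServerId server) {
        siblings_[startKeyHash] = server;
    }

    // Returns the server of the sibling with the greatest StartKeyHash
    // that is <= keyHash, assuming each sibling covers the range up to
    // the next sibling's StartKeyHash. Empty if no sibling covers it.
    std::optional<ServerId> lookup(std::uint64_t keyHash) const {
        auto it = siblings_.upper_bound(keyHash);  // first start > keyHash
        if (it == siblings_.begin()) return std::nullopt;
        return std::prev(it)->second;
    }

private:
    std::map<std::uint64_t, ServerId> siblings_;   // kept sorted by start
};
```

With siblings starting at hashes 0, 100 and 200, a key hash of 99 would resolve to the first sibling's server and a key hash of 100 to the second. Keeping the list in a sorted map makes the lookup logarithmic in the number of siblings.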

RPCs for schema mgmt from client to coordinator

RPCs for schema mgmt from coordinator to masters

RPCs for schema information from coordinator to client (for lookups)