Ideas for Packing Replicated Writes in RAMCloud

Goal: Use the CPU time that write requests spend waiting for replication Rpcs to return to do useful work (i.e., handling other requests).

Ideally, we would like to handle new requests only as long as replication requests have not yet finished, and then prioritize running existing requests to completion as soon as they become unblocked.

Idea 1: Polling

Each worker thread handling a write request will issue its replication Rpcs, and then yield repeatedly, checking after each yield whether the responses have come in.

If we assume FIFO ordering on the ready queue, then each new write request would need to wait for every existing write request to poll and then yield at least once before it gets a chance to run.

How costly this is depends on how many write requests we believe we can service simultaneously, a number that is bounded by the dispatch thread as well as by the rate of replication.
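The FIFO argument above can be checked with a small simulation. This is only a sketch: Python generators stand in for worker threads, a `deque` stands in for the ready queue, and the names `polling_worker`, `run_fifo`, and `polls_before_new_request` are made up for illustration rather than taken from RAMCloud.

```python
from collections import deque

def polling_worker(name, rpc, trace):
    """A write request that polls its replication Rpc and yields until it is done."""
    while not rpc["done"]:
        trace.append(name)   # one poll of the outstanding Rpc
        yield                # not done yet: go to the back of the ready queue

def run_fifo(ready, steps):
    """Run up to `steps` scheduling turns in strict FIFO order."""
    for _ in range(steps):
        if not ready:
            break
        worker = ready.popleft()
        try:
            next(worker)
            ready.append(worker)   # the worker yielded, so requeue it
        except StopIteration:
            pass                   # the worker finished

def polls_before_new_request(num_existing):
    """Count the polls that happen before a newly queued request first runs."""
    trace = []
    ready = deque(
        polling_worker(f"w{i}", {"done": False}, trace)
        for i in range(num_existing)
    )
    ready.append(polling_worker("new", {"done": False}, trace))
    run_fifo(ready, steps=num_existing + 1)
    return trace.index("new")
```

In this model, `polls_before_new_request(8)` returns 8: the wait before a new request first runs grows linearly with the number of outstanding write requests.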

Idea 2: Condition Variable Broadcast

Each worker thread will sleep on a common CV when waiting for a response to an Rpc, and when any response is received, there is a broadcast on that CV, and all worker threads wake up and check if it was for them. If it was not for them, they go back to waiting on the CV.

For every Rpc response received, every waiting worker thread other than the intended recipient will waste work by waking up only to find that the response was not for it.
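A rough way to see the cost is to count the wasted wakeups. The sketch below uses Python's `threading.Condition` as a stand-in for the shared CV. It assumes one outstanding Rpc per worker and responses arriving in worker order, and it makes the dispatcher wait until every pending worker is asleep again before each broadcast so that the count is repeatable. The function name and bookkeeping are illustrative only.

```python
import threading
import time

def run_broadcast(num_workers):
    """Deliver one response per worker, broadcasting on a shared CV each time.

    Returns the number of wakeups where the response was for someone else.
    """
    cond = threading.Condition()
    done = [False] * num_workers          # done[i]: worker i's response arrived
    state = {"waiting": 0, "wasted": 0}   # both protected by cond's lock

    def worker(i):
        with cond:
            while not done[i]:
                state["waiting"] += 1
                cond.wait()               # releases the lock while asleep
                if not done[i]:
                    state["wasted"] += 1  # woken for another worker's response

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()

    # Dispatch side: one response per worker, in order.
    for i in range(num_workers):
        pending = num_workers - i
        while True:
            with cond:
                if state["waiting"] == pending:   # all pending workers asleep
                    done[i] = True
                    state["waiting"] = 0          # the broadcast wakes them all
                    cond.notify_all()
                    break
            time.sleep(0.001)

    for t in threads:
        t.join()
    return state["wasted"]
```

Under these assumptions, n workers incur n(n-1)/2 wasted wakeups in total, so `run_broadcast(4)` returns 6.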

Idea 3: User-Controlled Thread Blocking and Unblocking

After each worker thread issues its replication Rpcs, it will invoke `getThreadId()` to get a handle to the current thread. It will then store the threadId into each of the n replication Rpcs.
Finally, it will call `block`, which will remove it from the run queue and allow other threads to run.
When the dispatch thread receives a response to an Rpc, it will look up the threadId in the Rpc and call `wakeup` on the thread, which will put it back onto a run queue.
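The flow can be sketched with a toy cooperative scheduler. Everything here is a stand-in: generators play the role of worker threads, a bare `yield` plays the role of `block`, the thread id passed to the worker plays the role of the `getThreadId()` result, and the `Scheduler` class, `on_response`, and `demo` are hypothetical names, not RAMCloud code.

```python
from collections import deque

class Scheduler:
    """Toy cooperative scheduler; a hypothetical stand-in for the real one."""

    def __init__(self):
        self.run_queue = deque()   # ids of threads that are ready to run
        self.threads = {}          # thread id -> generator
        self.resumes = {}          # thread id -> number of times it was run

    def spawn(self, body, *args):
        tid = len(self.threads)
        self.threads[tid] = body(self, tid, *args)
        self.resumes[tid] = 0
        self.run_queue.append(tid)
        return tid

    def wakeup(self, tid):
        """Put a blocked thread back on the run queue."""
        if tid not in self.run_queue:
            self.run_queue.append(tid)

    def run_until_idle(self):
        while self.run_queue:
            tid = self.run_queue.popleft()
            self.resumes[tid] += 1
            try:
                next(self.threads[tid])   # runs until the thread blocks
            except StopIteration:
                pass                      # the thread finished

def write_worker(sched, tid, rpcs):
    for rpc in rpcs:
        rpc["thread_id"] = tid            # store the handle in each Rpc
    while not all(rpc["done"] for rpc in rpcs):
        yield                             # block: leave the run queue

def on_response(sched, rpc):
    """Dispatch side: mark the Rpc done and wake the thread it names."""
    rpc["done"] = True
    sched.wakeup(rpc["thread_id"])

def demo(num_workers=3, replicas=2):
    sched = Scheduler()
    all_rpcs = []
    for _ in range(num_workers):
        rpcs = [{"done": False, "thread_id": None} for _ in range(replicas)]
        sched.spawn(write_worker, rpcs)
        all_rpcs.append(rpcs)
    sched.run_until_idle()                # each worker issues Rpcs, then blocks
    for rpcs in all_rpcs:
        for rpc in rpcs:
            on_response(sched, rpc)
            sched.run_until_idle()
    return sched.resumes
```

In this model each worker runs once to issue its Rpcs and once per response to its own Rpcs, and never for another worker's response: `demo()` returns `{0: 3, 1: 3, 2: 3}`.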

How is Idea 3 different from using a map to condition variables and then calling signal on the condition variables?

In the CV case, we have an extra layer of indirection, which could mean additional cache misses on data structure metadata. Also, CVs are centered around a state change, rather than around waking up particular threads.
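For comparison, the map-of-CVs alternative might look roughly like the sketch below, with Python's `threading.Condition` standing in for the CVs. The names `run_cv_map`, `cvs`, and `done` are assumptions made for illustration. The dispatcher goes from the response to a CV and only then to the waiting thread, and the worker waits on a predicate (the state change) rather than being woken by name.

```python
import threading

def run_cv_map(num_workers):
    """One CV per worker, found through a map; returns the ids that finished."""
    cvs = {i: threading.Condition() for i in range(num_workers)}
    done = [False] * num_workers      # done[i]: worker i's response arrived
    finished = []

    def worker(i):
        cv = cvs[i]
        with cv:
            # Wait for the state change "my response has arrived".
            cv.wait_for(lambda: done[i])
        finished.append(i)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()

    # Dispatch side: the extra hop is response -> map lookup -> CV -> thread.
    for i in range(num_workers):
        cv = cvs[i]
        with cv:
            done[i] = True
            cv.notify()

    for t in threads:
        t.join()
    return sorted(finished)
```

Python cannot show the cache-miss cost itself; the sketch only makes the additional lookup and the predicate-based wait visible.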