Problem:
A conditional write RPC in RAMCloud returns whether the conditional write succeeded or failed.
The client's flow of control may depend on that result.
If the master crashes after applying the conditional write but before sending the result to the client, the current RAMCloud RPC protocol treats the request as undelivered and retries the conditional write. By then the recovered master already holds the new version of the value (the one produced by the conditional write), so the retried request is rejected because the version number no longer matches. The correct response is success, since the conditional write RPC had already completed before the crash.
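The failure scenario above can be reproduced with a small simulation. This is only a sketch: `ToyMaster` and its `cond_write` method are hypothetical stand-ins written for illustration, not RAMCloud's API, and the crash and recovery are modelled simply by discarding the first reply while keeping the stored data.

```python
class ToyMaster:
    """Hypothetical in-memory stand-in for a master (not RAMCloud's API)."""

    def __init__(self):
        # key -> (value, version); a missing key counts as version 0
        self.store = {}

    def cond_write(self, key, value, expected_version):
        """Write only if the stored version equals expected_version."""
        _, version = self.store.get(key, (None, 0))
        if version != expected_version:
            return False  # rejected: version mismatch
        self.store[key] = (value, version + 1)
        return True


master = ToyMaster()

# First attempt: the write is applied, but assume the master crashes before
# the reply reaches the client, so the client never sees `first`.
first = master.cond_write("k", "v1", expected_version=0)

# After recovery the object is still at the new version. The client retries
# the same request and is rejected, although its write already took effect.
retry = master.cond_write("k", "v1", expected_version=0)
```

In this run `first` is `True` and `retry` is `False`: the client is told the write failed even though the stored value is its own.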
To handle this problem, I propose the following design.
In a client,
In a master server,
When a crash happens
Log cleaner
client_id | processed_list
---|---
1 |
2 |
3 |
Figure 1. Processed table: each master keeps one processed_list per client. A processed_list tracks the rpc_ids that have been processed but not yet ACKed, stored as sequence start and end values.
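One way to picture the processed table in Figure 1 is the sketch below. It makes several assumptions that the text does not state: each processed_list is a sorted list of inclusive `[start, end]` ranges, an ACK for `ack_id` covers every rpc_id up to and including `ack_id`, and the class and method names (`ProcessedTable`, `record`, `is_processed`, `ack`) are invented for illustration.

```python
class ProcessedTable:
    """Hypothetical per-master table: client_id -> list of [start, end] ranges."""

    def __init__(self):
        self.lists = {}

    def record(self, client_id, rpc_id):
        """Mark rpc_id as processed, merging adjacent ids into one range."""
        ranges = self.lists.setdefault(client_id, [])
        ranges.append([rpc_id, rpc_id])
        ranges.sort()
        merged = [ranges[0]]
        for start, end in ranges[1:]:
            if start <= merged[-1][1] + 1:
                # Overlaps or touches the previous range: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        self.lists[client_id] = merged

    def is_processed(self, client_id, rpc_id):
        """True if rpc_id was processed and has not been ACKed yet."""
        return any(s <= rpc_id <= e for s, e in self.lists.get(client_id, []))

    def ack(self, client_id, ack_id):
        """Forget every rpc_id <= ack_id (assumed ACK semantics)."""
        kept = []
        for start, end in self.lists.get(client_id, []):
            if end <= ack_id:
                continue  # whole range is ACKed
            kept.append([max(start, ack_id + 1), end])
        self.lists[client_id] = kept


table = ProcessedTable()
for rpc_id in (1, 2, 3, 5):
    table.record(1, rpc_id)
ranges_before_ack = [list(r) for r in table.lists[1]]
table.ack(1, 2)
ranges_after_ack = [list(r) for r in table.lists[1]]
```

Storing ranges rather than individual ids keeps the table small when a client issues consecutive rpc_ids, which is presumably the reason for tracking sequence start and end.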
... | cond_write <TableID, KeyHash> | ... | Tombstone <table_id, keyHash> <Client_id, rpc_id, ack_id> | ... |
Figure 2. Log structure with RPC id and ACK id.
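The fields in Figure 2 suggest that the set of processed but un-ACKed RPCs could be rebuilt by scanning the log. The sketch below is one possible reading, not the design itself: it assumes every relevant log entry carries the `<client_id, rpc_id, ack_id>` triple, that the largest `ack_id` seen for a client supersedes all rpc_ids up to and including it, and it uses the hypothetical names `LogEntry` and `rebuild_processed`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record mirroring the fields shown in Figure 2."""
    kind: str        # "cond_write" or "tombstone"
    table_id: int
    key_hash: int
    client_id: int
    rpc_id: int
    ack_id: int


def rebuild_processed(log):
    """Return client_id -> sorted rpc_ids that were processed but not ACKed."""
    processed = {}
    max_ack = {}
    for entry in log:
        processed.setdefault(entry.client_id, set()).add(entry.rpc_id)
        max_ack[entry.client_id] = max(max_ack.get(entry.client_id, 0),
                                       entry.ack_id)
    # Drop ids the client has already ACKed (assumed: ack_id is cumulative).
    return {
        client: sorted(r for r in ids if r > max_ack[client])
        for client, ids in processed.items()
    }


log = [
    LogEntry("cond_write", 1, 10, client_id=7, rpc_id=1, ack_id=0),
    LogEntry("cond_write", 1, 11, client_id=7, rpc_id=2, ack_id=1),
    LogEntry("tombstone", 1, 10, client_id=7, rpc_id=3, ack_id=1),
]
recovered = rebuild_processed(log)
```

With this log, client 7 has acknowledged rpc_id 1, so only rpc_ids 2 and 3 remain in the rebuilt state.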