...
- A client reserves sequence numbers for RPC ids. It reserves M+1 consecutive ids, where M is the number of objects involved in the current transaction. The lowest seq# is not assigned to any object or RPC and works as a placeholder. The other M sequence numbers are assigned one per object.
- RPC (1) PREPARE: A client sends prepare messages to all data master servers participating in the transaction. To keep the description simple, we send a separate RPC request for each object in the transaction.
- Request msg: <list of <tableId, keyHash, Seq#>, tableId, key, condition, newVal>
...
- list of <tableId, keyHash, Seq#>: used in case of client disconnection.
...
- TableId & Key: the object being operated on.
...
- Condition: an additional condition for COMMIT-VOTE beyond successful locking, expressed as RAMCloud RejectRules. This can be NULL.
...
- newVal: value to be written for “key” on the receipt of “COMMIT”.
- Handling:
...
- Grab a lock for “key” in the lock table. Buffer newVal for the key.
...
- If the lock was grabbed and the condition is satisfied, log a LockRecord (lock information; see figure~\ref{fig:lockRecord}) and an RpcRecord (for linearizability; see figure~\ref{fig:rpcRecord}).
...
- If the lock was grabbed but the condition is not satisfied, unlock immediately and log an RpcRecord with the result “ABORT-VOTE”.
...
- If we failed to grab the lock, log RpcRecord with the result of “ABORT-VOTE”.
...
- Sync log with backup.
- Response: either “COMMIT-VOTE” or “ABORT-VOTE”.
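The PREPARE handling above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not RAMCloud code: `Master`, `PrepareRequest`, the record tuples, the `seq` field (taken here to be the seq# assigned to this object), and the callable standing in for RejectRules are all hypothetical names and shapes. Syncing to backups is modeled as a simple counter.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

COMMIT_VOTE = "COMMIT-VOTE"
ABORT_VOTE = "ABORT-VOTE"


@dataclass
class PrepareRequest:
    participants: list   # list of (tableId, keyHash, seq#), kept for client disconnection
    table_id: int
    key: str
    seq: int             # assumption: seq# assigned to this object, used as the RPC id
    new_val: bytes
    # Assumption: a predicate on the current value stands in for RejectRules; may be None.
    condition: Optional[Callable[[Optional[bytes]], bool]] = None


@dataclass
class Master:
    objects: dict = field(default_factory=dict)   # (tableId, key) -> current value
    locks: dict = field(default_factory=dict)     # lock table: (tableId, key) -> holder's seq#
    buffered: dict = field(default_factory=dict)  # (tableId, key) -> newVal awaiting DECISION
    log: list = field(default_factory=list)       # stand-in for the master's log
    synced: int = 0                               # log entries replicated to backups

    def handle_prepare(self, req: PrepareRequest) -> str:
        obj = (req.table_id, req.key)
        if obj in self.locks:
            # Failed to grab the lock: only the vote is recorded.
            vote = ABORT_VOTE
        else:
            self.locks[obj] = req.seq
            self.buffered[obj] = req.new_val
            if req.condition is None or req.condition(self.objects.get(obj)):
                vote = COMMIT_VOTE
                # LockRecord contents here are an assumption.
                self.log.append(("LockRecord", obj, req.seq))
            else:
                # Lock grabbed but condition not satisfied: unlock immediately.
                del self.locks[obj]
                del self.buffered[obj]
                vote = ABORT_VOTE
        self.log.append(("RpcRecord", obj, req.seq, vote))
        self.synced = len(self.log)  # sync log with backup before responding
        return vote
```

In this model a second PREPARE for an already locked object gets “ABORT-VOTE” without disturbing the existing lock, and every path logs exactly one RpcRecord so that a retried RPC could be answered from the log.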
- RPC (3) DECISION: After collecting all votes from data masters, the client broadcasts its decision to all cohorts.
- Request: <tableId, keyHash, seq# for PREPARE, DECISION>
- Handling: if DECISION = COMMIT,
...
- If there is a buffered write, log Object (with new value), Tombstone for old Object, and Tombstone for LockRecord atomically.
...
- Unlock the object in the lock table.
...
- Sync log with backup…? Is it safe on master crash?
...
- (It is okay to delay the sync until we sync the next transaction’s LockRecord. We only need a guarantee that at most one LockRecord exists per object.)
- Response: Done.
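A minimal sketch of the COMMIT path of DECISION handling follows, again as an in-memory model. The class and tuple shapes are hypothetical, a single list append stands in for the atomic log write, and the check that the lock holder matches the PREPARE seq# is an assumption added so that a repeated DECISION does not apply the write twice. The notes only describe DECISION = COMMIT, so other decisions are left unimplemented here.

```python
from dataclasses import dataclass, field

COMMIT = "COMMIT"


@dataclass
class Master:
    objects: dict = field(default_factory=dict)   # (tableId, keyHash) -> value
    locks: dict = field(default_factory=dict)     # lock table: object -> seq# of its PREPARE
    buffered: dict = field(default_factory=dict)  # object -> newVal buffered by PREPARE
    log: list = field(default_factory=list)       # stand-in for the master's log

    def handle_decision(self, table_id, key_hash, prepare_seq, decision):
        obj = (table_id, key_hash)
        if decision != COMMIT:
            raise NotImplementedError("only the COMMIT path is described in the notes")
        holds_lock = self.locks.get(obj) == prepare_seq  # assumption: match on PREPARE seq#
        if holds_lock and obj in self.buffered:
            new_val = self.buffered.pop(obj)
            # One append models writing all three entries atomically.
            self.log.append([
                ("Object", obj, new_val),
                ("Tombstone", "Object", obj),
                ("Tombstone", "LockRecord", obj, prepare_seq),
            ])
            self.objects[obj] = new_val
        if holds_lock:
            del self.locks[obj]  # unlock the object in the lock table
        return "Done"
```

Note that the sketch does not sync the log before replying, which mirrors the open question in the notes about whether the sync can be deferred.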
- After collecting “Done” from all cohorts, the client acknowledges the lowest seq# reserved, so that ACK# can advance up to the highest seq# used in this transaction.
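The role of the placeholder seq# can be illustrated with a small client-side sketch. `RpcIdTracker` and `assign_seqs` are hypothetical names, and the sketch assumes that per-object ids are acknowledged as their RPCs complete while ACK# is the highest seq# below which every id has been acknowledged. Under those assumptions, the unacknowledged placeholder holds ACK# back until the whole transaction is done.

```python
from dataclasses import dataclass, field


@dataclass
class RpcIdTracker:
    next_seq: int = 1
    unacked: set = field(default_factory=set)

    def reserve(self, count):
        """Reserve `count` consecutive seq#s and return the lowest one."""
        first = self.next_seq
        self.next_seq += count
        self.unacked.update(range(first, first + count))
        return first

    def acknowledge(self, seq):
        self.unacked.discard(seq)

    def ack_number(self):
        # Highest seq# such that it and every lower seq# have been acknowledged.
        lowest_pending = min(self.unacked) if self.unacked else self.next_seq
        return lowest_pending - 1


def assign_seqs(tracker, objects):
    """Reserve M+1 ids: the lowest is the placeholder, the rest go one per object."""
    placeholder = tracker.reserve(len(objects) + 1)
    per_object = {obj: placeholder + 1 + i for i, obj in enumerate(objects)}
    return placeholder, per_object


tracker = RpcIdTracker()
placeholder, per_object = assign_seqs(tracker, [(1, "a"), (1, "b")])
for seq in per_object.values():
    tracker.acknowledge(seq)         # per-object RPCs finished
ack_before = tracker.ack_number()    # still held back by the placeholder
tracker.acknowledge(placeholder)     # all cohorts answered "Done"
ack_after = tracker.ack_number()     # now covers the whole transaction
```

Because the placeholder is the lowest id in the reserved range, acknowledging it last lets ACK# jump past all M per-object ids in one step.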
Discussions & Questions:
...