A transaction is a set of operations that should happen atomically. Exactly which operations it may include is still up for discussion; for now, think of reads, writes, deletes, and possibly index lookups.

Warning: This page uses the term coordinator in the 2PC sense: the role of coordinating the execution of a transaction. This differs from Coordinator, the role of coordinating the RAMCloud cluster.

Client-Side Transactions

Assume that initially object ID 1 contains A, 2 contains B, and 3 contains C.

  1. Client writes placeholder T with object IDs (1, 2, 3)
  2. Client adds masks to objects 1, 2, 3 which all point to T
  3. Client updates T with (write A' at 1 over version V1, write B' at 2 over version V2, write C' at 3 over version V3)
  4. Client writes A' into 1, B' into 2, C' into 3
  5. Client deletes T
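The five steps above can be sketched against a toy in-memory store. This is only an illustration under stated assumptions: the `Store` class, the `mask` field, and `run_transaction` are hypothetical names, not RAMCloud APIs. The sketch also assumes that step 4 clears the masks and that a client which finds an object already masked simply undoes its own masks and aborts; the steps above do not specify either behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Obj:
    value: Any
    version: int = 1
    mask: Optional[int] = None  # ID of the transaction object this object points to


class Store:
    """Toy in-memory stand-in for the object store (hypothetical API)."""

    def __init__(self) -> None:
        self.objects: Dict[int, Obj] = {}

    def write(self, oid: int, value: Any) -> int:
        """Write a value, bump the version, and keep any existing mask."""
        old = self.objects.get(oid)
        version = old.version + 1 if old else 1
        mask = old.mask if old else None
        self.objects[oid] = Obj(value, version, mask)
        return version

    def delete(self, oid: int) -> None:
        self.objects.pop(oid, None)


def run_transaction(store: Store, tid: int, writes: Dict[int, Any]) -> bool:
    ids = sorted(writes)
    # Step 1: write placeholder T listing the participating object IDs.
    store.write(tid, {"ids": ids, "ops": None})
    # Step 2: add masks pointing at T, remembering the versions we saw.
    versions: Dict[int, int] = {}
    for oid in ids:
        obj = store.objects[oid]
        if obj.mask is not None:
            # Assumption: on conflict, undo our own masks and abort.
            for masked in versions:
                store.objects[masked].mask = None
            store.delete(tid)
            return False
        obj.mask = tid
        versions[oid] = obj.version
    # Step 3: update T with the full write list (new value, expected version).
    ops = {oid: (writes[oid], versions[oid]) for oid in ids}
    store.write(tid, {"ids": ids, "ops": ops})
    # Step 4: write the new values; assumption: this also clears the masks.
    for oid in ids:
        store.write(oid, writes[oid])
        store.objects[oid].mask = None
    # Step 5: delete T.
    store.delete(tid)
    return True
```

With objects 1, 2, 3 holding A, B, C, calling `run_transaction(store, 100, {1: "A'", 2: "B'", 3: "C'"})` leaves the three objects holding the new values and no object at ID 100.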

Consequences

Optimization: Don't write placeholder T

If the transaction commits:

  1. Client reserves object ID for T
  2. Client adds masks to objects 1, 2, 3 which all point to T (T does not yet exist)
  3. Client creates T with (write A' at 1 over version V1, write B' at 2 over version V2, write C' at 3 over version V3)
  4. Client writes A' into 1, B' into 2, C' into 3
  5. Client deletes T

If some other client wants to abort T before step 3, the other client may create a tombstone at T's object ID. This blocks the create in step 3 and the coordinator will be forced to abort.
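The race between the coordinator's create in step 3 and another client's tombstone can be sketched with a create-if-absent primitive. This is a minimal illustration, assuming the store offers such a primitive; `IdSpace`, `TOMBSTONE`, `abort_before_step3`, and `coordinator_create_t` are hypothetical names introduced here.

```python
from typing import Any, Dict

# Sentinel standing in for a tombstone record (hypothetical representation).
TOMBSTONE = object()


class IdSpace:
    """Toy object-ID space with a create-if-absent operation (assumed primitive)."""

    def __init__(self) -> None:
        self.objects: Dict[int, Any] = {}

    def create(self, oid: int, value: Any) -> bool:
        """Create an object only if nothing, not even a tombstone, is at oid."""
        if oid in self.objects:
            return False
        self.objects[oid] = value
        return True


def abort_before_step3(ids: IdSpace, tid: int) -> bool:
    """Another client tries to abort T by placing a tombstone at T's ID.

    Returns True if the tombstone was placed, i.e. T did not exist yet.
    """
    return ids.create(tid, TOMBSTONE)


def coordinator_create_t(ids: IdSpace, tid: int, ops: Any) -> bool:
    """Step 3: the coordinator creates T.

    Returns False if a tombstone blocks the create, in which case the
    coordinator has to abort.
    """
    return ids.create(tid, ops)
```

Whichever of the two creates reaches T's object ID first wins: if the tombstone lands first, `coordinator_create_t` returns `False`; if T lands first, `abort_before_step3` returns `False` and the transaction proceeds.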

The client behaviors that follow:

The cleaning rules:

I think the main benefit is that there is one fewer write operation in the common case (this approach doesn't write the placeholder T). I think the main drawback is that cleaning up tombstones is somewhat troubling and/or annoying.

John says:

My main question is whether it's worth the complications of a tombstone manager to save the extra write, given that the transaction is already doing a lot of writes. Things feel a lot more obvious and safe with the first scheme.

Server-Side Transactions (2PC)

  1. Client sends MT ("minitransaction") to master
  2. Master writes transaction object with list of participants, acquiring a txid
  3. Master sends txid, MT to all participants
  4. Participants lock objects and log MT
  5. Participants send vote to master
  6. If the decision is yes, the master notes it in the transaction object
  7. Master relays decision to participants
  8. Master sends response to client
  9. If the decision is yes, the participants commit. Otherwise, they unlock/roll back.
  10. Participants send commit acknowledgement to master
  11. Master removes transaction object
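The message flow above can be sketched as a single-process simulation with no failures. This is an illustration only: the `Participant` and `Master` classes, the shape of a minitransaction (`{participant_name: {key: new_value}}`), and the rule that a participant votes no when a key is already locked are all assumptions made for the example. RPCs are modeled as direct method calls, so the response to the client (step 8) is simply the return value.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Participant:
    data: Dict[str, Any] = field(default_factory=dict)
    locks: Dict[str, int] = field(default_factory=dict)  # key -> txid holding the lock
    log: Dict[int, Dict[str, Any]] = field(default_factory=dict)  # txid -> logged writes

    def prepare(self, txid: int, writes: Dict[str, Any]) -> bool:
        """Steps 4-5: lock objects, log the MT, and vote."""
        if any(key in self.locks for key in writes):
            return False  # assumed rule: vote no if any object is already locked
        for key in writes:
            self.locks[key] = txid
        self.log[txid] = dict(writes)
        return True

    def decide(self, txid: int, commit: bool) -> bool:
        """Steps 9-10: commit or roll back, unlock, and acknowledge."""
        writes = self.log.pop(txid, {})
        if commit:
            self.data.update(writes)
        for key in writes:
            if self.locks.get(key) == txid:
                del self.locks[key]
        return True


class Master:
    def __init__(self, participants: Dict[str, Participant]) -> None:
        self.participants = participants
        self.tx_objects: Dict[int, Dict[str, Any]] = {}
        self.next_txid = 1

    def execute(self, mt: Dict[str, Dict[str, Any]]) -> bool:
        # Step 2: write the transaction object, acquiring a txid.
        txid = self.next_txid
        self.next_txid += 1
        self.tx_objects[txid] = {"participants": sorted(mt), "decision": None}
        # Steps 3-5: send txid and MT to all participants and collect votes.
        votes = [self.participants[name].prepare(txid, writes)
                 for name, writes in mt.items()]
        decision = all(votes)
        # Step 6: note a yes decision in the transaction object.
        if decision:
            self.tx_objects[txid]["decision"] = "commit"
        # Steps 7, 9, 10: relay the decision and collect acknowledgements.
        acks = [self.participants[name].decide(txid, decision) for name in mt]
        # Step 11: remove the transaction object once everyone has acknowledged.
        if all(acks):
            del self.tx_objects[txid]
        # Step 8: the response to the client.
        return decision
```

In the real protocol the response in step 8 can go out before the participants finish committing; the simulation is synchronous, so that overlap is not visible here.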

Failure Scenarios

Optimization: Client coordinates transaction

Ignacio pointed out that this is how Sinfonia does it, and it allows for better latency on the critical path. Sinfonia blocks on memory node (master) failures so that the coordinator does not have to keep any state. We should explore the trade-offs.