
A transaction is a set of operations that should happen atomically. Exactly which operations a transaction may include is up for discussion; for now, think reads, writes, deletes, and possibly index lookups.

Warning: This page uses the term coordinator in the 2PC sense: the role of coordinating the execution of a transaction. This differs from Coordinator, the role of coordinating the RAMCloud cluster.

...

  • 1:A, 2:B, 3:C appear an extra time in the log (when they are masked).
  • A', B', C' appear an extra time in the log (in T).
  • Client clocks must be synchronized and/or other clients must always wait some amount of time before aborting a transaction.
  • Blind (unconditional) modifications are no longer possible as the objects they operate on may be masked.
  • Having an object in the read-set of a transaction bumps its version number, so it invalidates everyone's cache.
  • Can only support creates using server-assigned object IDs if it is acceptable to burn object IDs when clients crash.
  • It'd be hard to do this safely if we need to consider access control. (We don't with namespaces.)

Optimization: Don't write placeholder T

If the transaction commits:

  1. Client reserves object ID for T
  2. Client adds masks to objects 1, 2, and 3, all of which point to T (T does not yet exist)
  3. Client creates T with (write A' at 1 over version V1, write B' at 2 over version V2, write C' at 3 over version V3)
  4. Client writes A' into 1, B' into 2, C' into 3
  5. Client deletes T

If some other client wants to abort T before step 3, the other client may create a tombstone at T's object ID. This blocks the create in step 3 and the coordinator will be forced to abort.
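The five steps and the tombstone race can be sketched as a toy, single-process model. Everything here is an assumption made for illustration: the Store class, its reserve_id and create methods, and the abort_before_step3 flag (which stands in for another client racing the coordinator) are hypothetical names, not RAMCloud APIs. Version handling is simplified, and the sketch assumes the objects are not already masked.

```python
TOMBSTONE = "TOMBSTONE"  # hypothetical marker stored at T's object ID


class Store:
    """Toy in-memory stand-in for a table (not a real RAMCloud API)."""

    def __init__(self):
        self.objects = {}    # object ID -> {"value", "version", "mask"}
        self.next_id = 1000  # source of reserved object IDs

    def reserve_id(self):
        self.next_id += 1
        return self.next_id

    def put(self, oid, value):
        self.objects[oid] = {"value": value, "version": 1, "mask": None}

    def create(self, oid, value):
        """Conditional create: fails if anything already exists at oid."""
        if oid in self.objects:
            return False
        self.put(oid, value)
        return True


def commit(store, writes, abort_before_step3=False):
    """Run the optimized protocol for writes = {object ID: new value}."""
    tid = store.reserve_id()                        # step 1: reserve T's ID
    versions = {}
    for oid in writes:                              # step 2: mask the objects
        store.objects[oid]["mask"] = tid
        versions[oid] = store.objects[oid]["version"]
    if abort_before_step3:
        store.create(tid, TOMBSTONE)                # another client aborts T
    record = {oid: (val, versions[oid]) for oid, val in writes.items()}
    if not store.create(tid, record):               # step 3: create T
        for oid in writes:                          # blocked: forced to abort
            store.objects[oid]["mask"] = None
        return False
    for oid, val in writes.items():                 # step 4: write back
        obj = store.objects[oid]
        obj["value"], obj["mask"] = val, None
        obj["version"] += 1
    del store.objects[tid]                          # step 5: delete T
    return True
```

Note that on abort the tombstone is left in place; when it may be removed is exactly the cleaning question discussed further down.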

The resulting client behaviors:

  • If some other client finds an object masked by a committed transaction object, it can do the write-back.
  • If some other client finds an object masked by a tombstone, it can remove the mask.
  • If some other client finds an object masked by a missing transaction object, it can either wait for a transaction object to appear, or it can create a tombstone and then remove the mask.
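As a rough illustration of the three cases above, the sketch below resolves a mask found by some other client. The dictionaries objects and txns, the function name resolve_mask, and the willing_to_wait flag are all hypothetical; a real implementation would use conditional operations on the servers rather than local dictionary updates.

```python
TOMBSTONE = "TOMBSTONE"  # hypothetical marker for an aborted transaction


def resolve_mask(objects, txns, oid, willing_to_wait=False):
    """Resolve the mask on object oid, as seen by a client other than the coordinator.

    objects: object ID -> {"value": ..., "mask": transaction ID or None}
    txns:    transaction ID -> TOMBSTONE, or {object ID: new value} if committed
    """
    tid = objects[oid]["mask"]
    if tid is None:
        return "unmasked"

    record = txns.get(tid)
    if record is None:
        # Missing transaction object: either wait for it to appear, or
        # create a tombstone so the coordinator's create will fail.
        if willing_to_wait:
            return "wait"
        txns[tid] = TOMBSTONE
        record = TOMBSTONE

    if record == TOMBSTONE:
        # Aborted: just remove the mask, leaving the old value in place.
        objects[oid]["mask"] = None
        return "aborted"

    # Committed transaction object: do the write-back on the coordinator's behalf.
    objects[oid]["value"] = record[oid]
    objects[oid]["mask"] = None
    return "written back"
```

In this toy model the check for a missing transaction object and the tombstone creation are not atomic; the sketch assumes a single thread.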

The cleaning rules:

  • It is safe for anyone to delete a committed transaction object if all participating objects have been unmasked.
  • It is safe for the coordinator to delete a tombstone if it discards knowledge of that transaction ID.
  • And here is the gotcha: when can anyone delete a tombstone?
    • With high probability, it is safe for anyone to delete a tombstone after a large amount of time has elapsed. This amount of time (possibly measured in weeks) would have to be long enough to convince us that the coordinator has either died or observed the tombstone.
    • Or, don't clean the tombstones, but periodically delete the table and start using a different one, invalidating all previous transaction IDs.
    • Or, invalidate the coordinator's token.
    • Or, drop the table fragment instead of the whole table.
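The first two cleaning rules and the time-based option can be written as simple predicates. This is only a sketch: the function names, the masks dictionary, and the four-week default grace period are assumptions chosen for illustration (the text above only says "possibly measured in weeks").

```python
from datetime import datetime, timedelta


def can_delete_committed(tid, participants, masks):
    """A committed transaction object may go once no participant is masked by it.

    masks: object ID -> transaction ID currently masking it, or None.
    """
    return all(masks.get(oid) != tid for oid in participants)


def can_delete_tombstone(created_at, now, is_coordinator=False,
                         coordinator_forgot_txn=False,
                         grace=timedelta(weeks=4)):  # hypothetical default
    """Decide whether deleting a tombstone is (probably) safe."""
    if is_coordinator:
        # Safe only if the coordinator discards its knowledge of the transaction ID.
        return coordinator_forgot_txn
    # Anyone else relies on elapsed time: safe only with high probability.
    return now - created_at >= grace
```

The time-based branch is where the "gotcha" lives: it trades certainty for a bound on how long tombstones accumulate.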

I think the main benefit is that there is one less write operation in the common case (this approach doesn't write the placeholder T). I think the main drawback is that cleaning tombstones is somewhat troubling and/or annoying.

John says:

My main question is whether it's worth the complications of tombstone management to save the extra write, given that the transaction is already doing a lot of writes. Things feel a lot more obvious and safe with the first scheme.

Server-Side Transactions (2PC)

...

  • If a participant crashes before sending a commit acknowledgement (step 10):
    It comes back up with no record of the transaction. When the master resends the decision to it, it simply agrees.
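A minimal sketch of that recovery case, assuming a hypothetical Participant class whose in-memory state is lost on a crash: the handler acknowledges a resent decision whether or not it still has a record of the transaction. The class, method names, and message shape are illustrative only and are not taken from the protocol steps elided above.

```python
class Participant:
    """Toy participant; 'pending' models volatile state lost in a crash."""

    def __init__(self):
        self.pending = {}  # transaction ID -> prepared state

    def crash_and_restart(self):
        self.pending = {}  # no record of any transaction survives

    def on_decision(self, txn_id, decision):
        # Drop any local state for the transaction if it exists, then agree.
        # With no record at all (after a crash), this is a plain acknowledgement.
        self.pending.pop(txn_id, None)
        return ("ack", txn_id, decision)
```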

Optimization: Client coordinates transaction

Ignacio pointed out that this is how Sinfonia does it, and that it allows for better latency on the critical path. Sinfonia blocks on memory node (master) failures so that the coordinator does not have to keep any state. We should explore the trade-offs.