PreparedOp and ParticipantList objects not cleaned up if not on a participant server

Description

If PreparedOp or ParticipantList objects end up on servers that are not, or are no longer, participants in the transaction, the objects may never be garbage collected because transaction recovery will not target these servers.

This might happen if:

  1. a tablet is migrated while a transaction is in progress, and the transaction's PreparedOp and ParticipantList log entries on the old master are not dropped after the migration completes.

  2. the prepare request is sent to the wrong server and the ParticipantList is added preemptively.

Possible solution: Have the InProgressTransaction garbage collection mechanism (which triggers after a transaction timeout) check that the transaction actually belongs on the server where its objects reside.

Complications:

  1. Doing the check might be expensive; the mechanism must scan the ParticipantList to confirm that at least one entry belongs to a tablet owned by the current master.

  2. The mechanism also needs to remove the old PreparedOp objects; there is currently no map from a transactionId to those objects.
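
A rough sketch of how the proposed check and cleanup could fit together is shown below. It is written in Python purely for illustration and is not the existing implementation: the names `Participant`, `Tablet`, `MasterState`, `belongs_here`, and `collect_orphans` are all hypothetical, tablets are assumed to be identified by a table id plus an inclusive key-hash range, and a transaction with no ParticipantList recorded on the server is assumed to be an orphan. The sketch addresses complication 2 by keeping an index from transaction id to PreparedOp references, and limits the cost in complication 1 by stopping the scan at the first participant that falls in an owned tablet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    # Hypothetical participant entry: one object touched by the transaction.
    table_id: int
    key_hash: int


@dataclass(frozen=True)
class Tablet:
    # Hypothetical tablet descriptor; the key-hash range is inclusive.
    table_id: int
    start_key_hash: int
    end_key_hash: int

    def contains(self, p):
        return (p.table_id == self.table_id
                and self.start_key_hash <= p.key_hash <= self.end_key_hash)


class MasterState:
    """Illustrative per-master bookkeeping; not the real data structures."""

    def __init__(self, owned_tablets):
        self.owned_tablets = list(owned_tablets)
        self.participant_lists = {}           # tx_id -> list of Participant
        self.prepared_ops = defaultdict(set)  # tx_id -> PreparedOp references

    def add_participant_list(self, tx_id, participants):
        self.participant_lists[tx_id] = list(participants)

    def add_prepared_op(self, tx_id, log_ref):
        # The missing transactionId -> PreparedOp index (complication 2).
        self.prepared_ops[tx_id].add(log_ref)

    def belongs_here(self, tx_id):
        # any() short-circuits, so the scan stops at the first participant
        # that lies in a tablet owned by this master (complication 1).
        return any(t.contains(p)
                   for p in self.participant_lists.get(tx_id, ())
                   for t in self.owned_tablets)

    def collect_orphans(self, timed_out_tx_ids):
        """Drop state for timed-out transactions this master does not serve."""
        freed = []
        for tx_id in timed_out_tx_ids:
            if self.belongs_here(tx_id):
                continue  # normal transaction recovery will handle it
            self.participant_lists.pop(tx_id, None)
            self.prepared_ops.pop(tx_id, None)
            freed.append(tx_id)
        return freed
```

In this sketch the ownership check runs only for transactions that have already timed out, so its cost is paid on the garbage collection path rather than on every prepare request.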

Environment

None

Status

Assignee

Collin Lee

Reporter

Collin Lee

Labels

None

Components

Priority

Medium