Mechanism to detect "TX timeout is too short"

Description

Sometimes a RAMCloud transaction needs a larger timeout:

  • for large transactions,

  • on a slower network,

  • or for some other reason.

A transaction timeout that is too short causes transactions to abort every time. Sometimes the race between commit and recovery makes the outcome non-deterministic. With the current log messages, it is not easy to tell whether an application is stuck because the timeout is too short.
Since the whole point of the transaction timeout is to clean up after dead clients, I think we could log a WARNING and adjust the timeout dynamically whenever we see commit messages from clients that have already timed out. I think this would be more robust than simply multiplying the timeout by the number of objects in a transaction, since it would also handle slow networks.
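
A rough sketch of the idea, to make the proposal concrete. Everything here is hypothetical: AdaptiveTxTimeout, noteTimedOut, noteCommit, the growth factor, and the cap are illustrative names and values, not existing RAMCloud code, and the real hook points would depend on where the server tracks timed-out clients. The sketch only shows the detection rule (a commit arriving from a client already declared timed out) and one possible adjustment policy (multiplicative growth up to a cap).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <unordered_set>

// Hypothetical helper, NOT part of RAMCloud: tracks clients whose
// transactions were timed out and grows the timeout when one of them
// later turns out to be alive (it sends a commit).
class AdaptiveTxTimeout {
  public:
    using Millis = std::chrono::milliseconds;

    // initial: starting timeout; cap: upper bound on the timeout;
    // factor: multiplicative growth per detection (assumed policy).
    AdaptiveTxTimeout(Millis initial, Millis cap, double factor = 2.0)
        : timeout(initial), maxTimeout(cap), growthFactor(factor)
    {
        assert(initial.count() > 0 && cap >= initial && factor > 1.0);
    }

    // Call when the server gives up on a client and cleans up its
    // transaction state.
    void noteTimedOut(uint64_t clientId) { timedOut.insert(clientId); }

    // Call when a commit message arrives. Returns true if the sender had
    // already been timed out, which suggests the timeout is too short.
    bool noteCommit(uint64_t clientId) {
        if (timedOut.erase(clientId) == 0)
            return false;               // Normal commit; nothing to do.
        Millis grown(static_cast<Millis::rep>(timeout.count() * growthFactor));
        timeout = std::min(grown, maxTimeout);
        std::cerr << "WARNING: commit from timed-out client " << clientId
                  << "; TX timeout may be too short, now "
                  << timeout.count() << " ms" << std::endl;
        return true;
    }

    Millis current() const { return timeout; }

  private:
    Millis timeout;
    Millis maxTimeout;
    double growthFactor;
    std::unordered_set<uint64_t> timedOut;  // Clients declared dead so far.
};
```

With an initial timeout of 100 ms and a cap of 350 ms, the first late commit would raise the timeout to 200 ms and the second to 350 ms. The cap is there because the timeout still has to clean up clients that really are dead; whether multiplicative growth is the right policy is an open question.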

Environment

None

Status

Assignee

Unassigned

Reporter

Seo Jin Park

Labels

None

Priority

Medium