  • Another potential issue is writes received by a zombie after recovery starts but before the zombie enters limbo state (this is also an issue with the "traditional lease" approach). For example, suppose recovery starts but the zombie takes a while to enter limbo state. In the meantime it might receive write requests, which it accepts. In the worst case, these write requests could arrive after the recovery masters have read the backup data for the segment, in which case the writes will be lost. One solution is for backups to refuse to accept data for a condemned master, returning a "Dude, you're a zombie" response in the same way as for PINGs. This would guarantee that, once recovery has progressed past the initial phase in which the coordinator contacts every server, no more writes will be processed by the condemned master. But what if the zombie and some or all of the backups are in a separate partition? If the coordinator is able to contact any of the backups during recovery, then that will prevent the zombie from writing new data. If all of the backups are in the same partition as the zombie, then the zombie will continue to accept write requests, unaware of the partition. However, in this case the coordinator will not be able to recover, since none of the replicas will be available; recovery will be deferred until at least one of the replicas becomes available, at which point writes will no longer be possible.
  • OK, here's another scenario relating to zombies and writes: in a partition, could a zombie select backups for a new segment that lie entirely within the disconnected partition? If so, the coordinator might be unaware of that additional segment (it might think that the log ended with the previous segment) and complete recovery "successfully" without it, which would result in lost writes. Do our techniques for finding the head of the log prevent this situation from occurring? I think our approach to managing the log head will prevent this situation:
    • A master cannot start writing segment N+1 until it has closed segment N.
    • If it is condemned before it closes segment N then it will detect its condemned state during the close.
    • If it closes segment N before condemnation is complete, then the coordinator will not treat segment N as the head of the log, and the coordinator will not recover until it can find segment N+1. Once the coordinator finds segment N+1, the zombie will no longer be able to write to that segment.
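The two rules discussed above (backups refuse writes from a condemned master, and only an open segment can serve as the log head) can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the actual RAMCloud implementation: the names `Backup`, `ZombieError`, `condemn`, `write`, and `find_log_head` are all hypothetical, each backup is modelled as an in-memory map, and a network partition is modelled simply by which backups are passed to `find_log_head`.

```python
from dataclasses import dataclass, field


class ZombieError(Exception):
    """Hypothetical "Dude, you're a zombie" response from a backup."""


@dataclass
class Backup:
    # master_id -> {segment_id: is_closed}; a toy stand-in for replica storage.
    segments: dict = field(default_factory=dict)
    # Masters the coordinator has condemned on this backup.
    condemned: set = field(default_factory=set)

    def condemn(self, master_id):
        """Assumed to be called by the coordinator when recovery starts."""
        self.condemned.add(master_id)

    def write(self, master_id, segment_id, close=False):
        """Open, append to, or close a segment replica for a master."""
        if master_id in self.condemned:
            # Covers both ordinary writes and the close of segment N.
            raise ZombieError(f"master {master_id} is condemned")
        replicas = self.segments.setdefault(master_id, {})
        # Once a replica is closed it stays closed.
        replicas[segment_id] = replicas.get(segment_id, False) or close


def find_log_head(reachable_backups, master_id):
    """Return the head segment id, or None if recovery must be deferred.

    Only an open (not yet closed) replica is accepted as the head, so a
    closed segment N never ends the log; segment N+1 must be found first.
    """
    open_ids = [
        seg_id
        for backup in reachable_backups
        for seg_id, closed in backup.segments.get(master_id, {}).items()
        if not closed
    ]
    return max(open_ids) if open_ids else None
```

For example, if master 1 closes segment 5 on backup `a`, then opens segment 6 on a backup `b` that only the zombie can reach, `find_log_head([a], 1)` returns `None` and recovery is deferred. Once `b` becomes reachable and is condemned, `find_log_head([a, b], 1)` returns 6 and any further `b.write(1, 6)` raises `ZombieError`.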