Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 5 Next »

This is currently just a dump of ideas / discussions - probably very vague, and thin on content. To be shortly formatted well enough to be (hopefully) readable and understandable by others.

Cluster:

The cluster has a centralized coordinator service, and multiple masters and backups (clients for the coordinator). The coordinator consists of small group of nodes. At any given point in time, exactly one of them is the leader, and the others are followers. Only the leader services the requests to the coordinator.

General Coordinator Design:

Thread Safety:

We want the coordinator to be thread safe.

Modularity:

Pull as much logic out of Coordinator as possible, and have better* information hiding and modularity. Egs.:

  • TabletMap:
    • Examples: Atomic operations like addTablet, removeTabletsForTable, etc.
    • Done. Ryan's commit: fb0a623d3957a172314fbe0b53566a5699e1c0e6
  • Will:
    • Similar to TabletMap.
  • ServerList:
    • Examples:
      • Atomic operation to mark server crashed and update serverList version.
      • Any updates to serverList directly broadcasted to cluster without coordinator having to do it.

Fault Tolerance:

To make coordinator fault tolerant, the leader should persist information about requests being serviced in a highly reliable, consistent storage service - like LogCabin. If this leader fails at some point, one of the follower takes over as the leader, and can replay the log.

Encapsulating Information:

We encapsulate the above information in terms of state. The leader pushes the current state of the request being serviced to the log at distinct commit points. The state captures the changes to the local state of the coordinator, as well as the communication done / to be done with the external world (masters / backups in the cluster).

RPC thread and Replay thread:

 

Order of execution on recovery:

 

When replay for an action can't be completed:

 

Get recovery to do your work for you: Changing the definition of "done":

 

 

 


* For some definition of this word.

  • No labels