Coordinator Refactoring
- Split the rpc handling functions (and thus the corresponding helper functions) in CoordinatorService into multiple logical groupings:
- CoordinatorTabletManager - manages tables, tabletMap
- createTable
- dropTable
- splitTablet
- getTableId
- getTabletMap
- reassignTabletOwnership
- CoordinatorServerManager - manages serverList
- enlistServer
- getServerList
- hintServerDown
- sendServerList
- setWill
- No separate module (rpc handlers only)
- recoveryMasterFinished
- quiesce
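The grouping above could be sketched roughly as follows. This is a minimal illustration, not the actual RAMCloud code: the `handle...` method names, the simplified signatures, and the in-memory containers are all assumptions made for the example, and only a few of the listed functions are shown.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Owns tables and tabletMap (only a name -> id map is modeled here).
class CoordinatorTabletManager {
  public:
    uint64_t createTable(const std::string& name) {
        auto it = tables.find(name);
        if (it != tables.end())
            return it->second;            // table already exists
        uint64_t id = nextTableId++;
        tables[name] = id;
        return id;
    }
    void dropTable(const std::string& name) { tables.erase(name); }
    bool getTableId(const std::string& name, uint64_t* id) const {
        auto it = tables.find(name);
        if (it == tables.end())
            return false;
        *id = it->second;
        return true;
    }
  private:
    std::map<std::string, uint64_t> tables;
    uint64_t nextTableId = 0;
};

// Owns serverList (modeled as a vector of service locators).
class CoordinatorServerManager {
  public:
    uint64_t enlistServer(const std::string& locator) {
        serverList.push_back(locator);
        return serverList.size() - 1;     // index doubles as server id here
    }
    const std::vector<std::string>& getServerList() const {
        return serverList;
    }
  private:
    std::vector<std::string> serverList;
};

// Keeps every rpc handler, but forwards the work to the owning manager.
class CoordinatorService {
  public:
    uint64_t handleCreateTable(const std::string& name) {
        return tabletManager.createTable(name);
    }
    uint64_t handleEnlistServer(const std::string& locator) {
        return serverManager.enlistServer(locator);
    }
    // recoveryMasterFinished and quiesce would stay here: no manager involved.
    CoordinatorTabletManager tabletManager;
    CoordinatorServerManager serverManager;
};
```

In this sketch CoordinatorService never touches `tables` or `serverList` directly; each container is reachable only through the manager that owns it.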
- Think about how the functionality corresponding to each function gets split amongst different modules.
- All the rpc handlers - CoordinatorService
- Each handler:
- Receives / processes the rpc. - work1
- Then calls into the corresponding function residing in a separate module. - work2
- We will probably have another module, say CoordinatorServiceRecovery, that does the recovery work - workR
- Decide division of work between work1 and work2 (and similarly, between workR and work2). Options:
- work1 and workR function as dispatchers, real work done in work2.
- work1 - processes the rpc and passes the arguments to the appropriate function in work2.
- workR - iterates over the log, passes each entry to the appropriate function for work2.
- work2 - if the request is coming from workR, then it first processes the state to get the arguments. In all cases, it does all the real work.
- Split according to the recovery path.
- work1 - everything up to (and including) the first time a log entry is written to LogCabin. This is the work that will only ever be done by the current leader (not by the followers or the recovering leader).
- workR - iterates over the log, passes each entry to the appropriate function for work2.
- work2 - if the request is coming from workR, then it first processes the state to get the arguments. It then does everything after work1.
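As a rough illustration of the first option (work1 and workR as dispatchers, real work in work2), consider the sketch below. Everything concrete in it is an assumption made for the example: the `Log` alias stands in for LogCabin, the `"createTable <name>"` entry format is invented, and `CreateTableRpc`, `handleCreateTable`, `recover`, `replay`, and `applyCreateTable` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in for the replicated log; a real one would live in LogCabin.
using Log = std::vector<std::string>;

// work2: knows the log entry format, decides when to log, does the real work.
class CoordinatorTabletManager {
  public:
    explicit CoordinatorTabletManager(Log& log) : log(log) {}

    // Called by work1 with arguments already extracted from the rpc.
    uint64_t createTable(const std::string& name) {
        log.push_back("createTable " + name);   // work2 decides when to log
        return applyCreateTable(name);
    }

    // Called by workR: decode the entry to get the arguments, then do the
    // same work. Nothing is appended to the log during replay.
    void replay(const std::string& entry) {
        std::string name = entry.substr(entry.find(' ') + 1);
        applyCreateTable(name);
    }

    bool getTableId(const std::string& name, uint64_t* id) const {
        auto it = tables.find(name);
        if (it == tables.end())
            return false;
        *id = it->second;
        return true;
    }

  private:
    uint64_t applyCreateTable(const std::string& name) {
        auto it = tables.find(name);
        if (it != tables.end())
            return it->second;
        uint64_t id = nextTableId++;
        tables[name] = id;
        return id;
    }
    Log& log;
    std::map<std::string, uint64_t> tables;
    uint64_t nextTableId = 0;
};

// work1: knows only the rpc format.
struct CreateTableRpc { std::string name; };

uint64_t handleCreateTable(const CreateTableRpc& rpc,
                           CoordinatorTabletManager& manager) {
    return manager.createTable(rpc.name);
}

// workR: knows how to walk the log and read the opcode, nothing more.
void recover(const Log& log, CoordinatorTabletManager& manager) {
    for (const std::string& entry : log) {
        std::string opcode = entry.substr(0, entry.find(' '));
        if (opcode == "createTable")
            manager.replay(entry);
    }
}
```

Here a recovering coordinator that replays the leader's log ends up with the same table ids. Neither dispatcher needs to know what a log entry contains beyond its opcode.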
- Look at the above options from a decision-hiding perspective:
- Option a (from above):
- Knowledge about format of rpcs - work1
- Decisions wrt how the log is read during recovery - workR (also knows how to read the opcode, but nothing more).
- Knowledge about format of log entries - work2
- Decision about when to write to log - work2
- Decisions wrt implementation of the function (in all the cases) - work2
- Problem: Putting all the decisions in one module is probably not the answer to good decision-hiding.
- Tombstone for the previous problem: On the other hand, if decisions are related, then they should be in one place.
- Option b (from above):
- Knowledge about format of rpcs - work1
- Decisions wrt how the log is read during recovery - workR (also knows how to read the opcode, but nothing more).
- Knowledge about format of log entries - work1 & work2
- Decision about when to write to log - work1 & work2
- Decisions wrt implementation of the function (in all the cases) - work1 & work2.
- Problem: The work seems to be split by control flow, not by major decision.
- Decision from point 5: We're going with option a from point 4.
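For contrast, the rejected option b might look something like the sketch below, which shows why knowledge of the log entry format ends up in both work1 and work2. As before, this is only an illustration under assumed names: `Log` stands in for LogCabin, the `"createTable <name>"` entry format is invented, and `handleCreateTable`, `finishCreateTable`, and `replayCreateTable` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Log = std::vector<std::string>;            // stand-in for LogCabin
using Tables = std::map<std::string, uint64_t>;  // table name -> table id

// work2: everything after the first log write.
uint64_t finishCreateTable(Tables& tables, const std::string& name) {
    auto it = tables.find(name);
    if (it != tables.end())
        return it->second;
    uint64_t id = tables.size();   // fine here: this sketch never drops tables
    tables[name] = id;
    return id;
}

// work1: runs only on the current leader. It formats and writes the log
// entry itself, so it must know the entry format and when to log.
uint64_t handleCreateTable(Log& log, Tables& tables, const std::string& name) {
    log.push_back("createTable " + name);
    return finishCreateTable(tables, name);
}

// Recovery side of work2: the same entry format has to be decoded again,
// so the format is now known in two places.
void replayCreateTable(Tables& tables, const std::string& entry) {
    std::string name = entry.substr(entry.find(' ') + 1);
    finishCreateTable(tables, name);
}
```

A change to the entry format would have to touch both `handleCreateTable` and `replayCreateTable`, which is the flow-wise split that the notes above identify as the problem with this option.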