Date: Thu, 28 Mar 2024 13:44:01 +0000 (UTC)
Coordinator Refactoring
1. Split the rpc handling functions (and thus, corresponding helper functions) in Coordinator Service into multiple logical groupings.
   - CoordinatorTabletManager - manages tables, tabletMap
     - createTable
     - dropTable
     - splitTablet
     - getTableId
     - getTabletMap
     - reassignTabletOwnership
   - CoordinatorServerManager - manages serverList
     - enlistServer
     - getServerList
     - hintServerDown
     - sendServerList
     - setWill
   - nothing extra (only call handlers)
     - recoveryMasterFinished
     - quiesce
2. Think about how the functionality corresponding to each function gets split amongst different modules.
   - All the rpc handlers - CoordinatorService
   - Each handler:
     - Receives / processes the rpc. - work1
     - Then calls into the corresponding function residing in a separate module. - work2
3. We will probably have another module, say CoordinatorServiceRecovery, that does work workR.
4. Decide division of work between work1 and work2 (and similarly, between workR and work2). Options:
   a. work1 and workR function as dispatchers, real work done in work2.
      - work1 - processes the rpc and passes the arguments to the appropriate function in work2.
      - workR - iterates over the log, passes each entry to the appropriate function in work2.
      - work2 - if the request is coming from workR, then it first processes the state to get the arguments. In all cases, it does all the real work.
   b. Split according to the recovery path.
      - work1 - everything up to (and including) the first time a log entry is written to LogCabin. This is the work that will only ever be done by the current leader (not the followers or the recovering leader).
      - workR - iterates over the log, passes each entry to the appropriate function in work2.
      - work2 - if the request is coming from workR, then it first processes the state to get the arguments. It then does everything after work1.
5. Look at the above options from a decision-hiding perspective:
   - Option a (from above):
     - Knowledge about format of rpcs - work1
     - Decisions wrt how the log is read during recovery - workR (also knows how to read the opcode, but nothing more).
     - Knowledge about format of log entries - work2
     - Decision about when to write to log - work2
     - Decisions wrt implementation of the function (in all the cases) - work2
     - Problem: Putting all the decisions in one module is probably not the answer to good decision-hiding.
       - Tombstone for prev problem: On the other hand, if decisions are related, then they should be in one place.
   - Option b (from above):
     - Knowledge about format of rpcs - work1
     - Decisions wrt how the log is read during recovery - workR (also knows how to read the opcode, but nothing more).
     - Knowledge about format of log entries - work1 & work2
     - Decision about when to write to log - work1 & work2
     - Decisions wrt implementation of the function (in all the cases) - work1 & work2.
     - Problem: The work seems to be split flow-wise, not by major decisions.
6. Decision from point 5: We're going with option a from point 4.
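
To make the chosen split concrete, here is a minimal sketch of option a in Python. It is an illustration under assumptions, not the actual coordinator code: the method names (handle_rpc, create_table, replay_create_table), the dict-shaped rpc, the "name=..." payload encoding, and the in-memory list standing in for LogCabin are all hypothetical. It only tries to show where each decision from point 5 would live: the rpc format in work1, the log walk and opcode lookup in workR, and the log-entry format, the decision to log, and the real work in work2.

```python
class TabletManager:
    """work2: owns the log-entry format, the decision to log, and the real work."""

    def __init__(self, log):
        self.log = log            # list of (opcode, payload); hypothetical stand-in for LogCabin
        self.tables = {}          # table name -> table id
        self.next_table_id = 0

    def create_table(self, name):
        """Entry point for work1; arguments are already decoded from the rpc."""
        if name in self.tables:
            return self.tables[name]
        # Only work2 decides when to log and how the payload is encoded.
        self.log.append(("createTable", "name=" + name))
        return self._do_create_table(name)

    def replay_create_table(self, payload):
        """Entry point for workR; first recover the arguments from the logged state."""
        name = payload.split("=", 1)[1]
        return self._do_create_table(name)

    def _do_create_table(self, name):
        # The real work, shared by the normal path and the recovery path.
        table_id = self.next_table_id
        self.next_table_id += 1
        self.tables[name] = table_id
        return table_id


class CoordinatorService:
    """work1: knows the rpc format and dispatches to work2."""

    def __init__(self, tablet_manager):
        self.tablet_manager = tablet_manager

    def handle_rpc(self, rpc):
        if rpc["opcode"] == "createTable":
            return self.tablet_manager.create_table(rpc["name"])
        raise ValueError("unknown rpc opcode: " + rpc["opcode"])


class CoordinatorServiceRecovery:
    """workR: walks the log and reads only the opcode of each entry."""

    def __init__(self, tablet_manager):
        self.replayers = {"createTable": tablet_manager.replay_create_table}

    def replay(self, log):
        for opcode, payload in log:
            # The payload stays opaque here; work2 is the only module that parses it.
            self.replayers[opcode](payload)


# Normal operation on the current leader.
log = []
leader = TabletManager(log)
service = CoordinatorService(leader)
service.handle_rpc({"opcode": "createTable", "name": "users"})

# A recovering leader rebuilds its state from the same log without re-logging.
recovered = TabletManager(log)
CoordinatorServiceRecovery(recovered).replay(list(log))
```

In this sketch, changing the payload encoding touches only TabletManager, which is the upside the tombstone note points at (related decisions in one place). The concern raised for option a also shows up: every decision except the rpc format and the log walk ends up in that one class.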