...

WIP distillation of thoughts on DCFT.

Python MapReduce Notes

Interestingly, my implementation also had three levels: Master/Scheduler (Job), TaskWrapper (Task), and Task (TaskAttempt). I used a "nested rules" approach, but in reality it could likely have been collapsed into one top-level rule set. There is a great deal of similarity between a MapTask and a ReduceTask in terms of their rules. How do you modularize Task classes? Can you subclass a Task class? The table below does not include the RPC rules, nor the rules for the membership service (though that should probably be a separate module).

| Module | Rule Count | Comments |
|---|---|---|
| Master/Scheduler | 1 | Basically just the rule to prevent deadlock by preempting tasks. |
| MapTaskSet | 2 | For straggler reissue. |
| MapTask/Wrapper | 5 | |
| ReduceTaskSet | 0 | Just pool management. |
| ReduceTask | 4 | |
| Total | 12 | Server failure "event" rule is not included (i.e. the handler that sets the isAlive bit is not counted as a rule). "Event" rules for setting the status of an RPC are also not included. The RPC status is considered a state field. |
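On the question of whether a Task class can be subclassed: one possible shape, sketched below, is a base class that owns the rules common to both task kinds, with each subclass appending its own. Everything here is hypothetical (the class layout, the rule names, the state strings, and the guard/action representation); it is not the implementation these notes describe, only an illustration of how the shared rules could be factored out.

```python
class Task:
    """Base class holding the rules shared by map and reduce tasks (hypothetical)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.state = "PENDING"
        self.worker_alive = True

    def rules(self):
        # Each rule is (name, guard, action): the guard reads state,
        # the action updates it.
        return [
            ("reissue_on_worker_failure",
             lambda: self.state == "RUNNING" and not self.worker_alive,
             self._reissue),
        ]

    def _reissue(self):
        # Put the task back in the pool so it can be scheduled again.
        self.state = "PENDING"
        self.worker_alive = True

    def step(self):
        """Fire every rule whose guard holds; return the names that fired."""
        fired = []
        for name, guard, action in self.rules():
            if guard():
                action()
                fired.append(name)
        return fired


class MapTask(Task):
    """Adds a map-only rule on top of the shared ones (hypothetical rule)."""

    def __init__(self, task_id):
        super().__init__(task_id)
        self.output_lost = False

    def rules(self):
        return super().rules() + [
            ("rerun_on_lost_output",
             lambda: self.state == "DONE" and self.output_lost,
             self._rerun),
        ]

    def _rerun(self):
        self.state = "PENDING"
        self.output_lost = False


class ReduceTask(Task):
    """Inherits only the shared rules in this sketch."""
```

With this layout the per-class rule count is just `len(task.rules())`, and a flat top-level rule set is the concatenation of the lists.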

 

Hadoop MapReduce Walkthrough

...

Hadoop MapReduce State Machine Redundancy

| StateMachine | Total Transitions | Distinct Transitions | # Duplicate / # Distinct |
|---|---|---|---|
| JobImpl | 82 | 27 | 50/7 |
| TaskImpl | 24 | 16 | 6/3 |
| TaskAttemptImpl | 57 | 15 | 41/8 |
| Total | 163 | 58 | 97/18 |
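A tally like the one above could be produced mechanically from a list of state-machine arcs. The sketch below is one way to do it, under assumptions the table itself does not state: each arc is a (from state, event, hook name) triple, arcs with no hook count toward the total but not toward the distinct transitions, and "# Duplicate / # Distinct" means arcs that reuse a hook over the number of hooks that are reused. The arc list, state names, and the one-off hook and event are made up for illustration and are not the real JobImpl table.

```python
from collections import Counter


def tally_transitions(arcs):
    """Summarize hook reuse in an iterable of (from_state, event, hook) arcs.

    hook is a transition name, or None for an arc with no hook
    (an assumption about how such arcs should be counted).
    """
    arcs = list(arcs)
    hooks = Counter(hook for _, _, hook in arcs if hook is not None)
    reused = {hook: n for hook, n in hooks.items() if n > 1}
    return {
        "total": len(arcs),                      # every registered arc
        "distinct": len(hooks),                  # distinct named hooks
        "duplicate_arcs": sum(reused.values()),  # arcs sharing a reused hook
        "duplicated_hooks": len(reused),         # hooks used more than once
    }


# Illustrative arcs only; S0..S2 are placeholder state names.
example_arcs = [
    ("S0", "JOB_DIAGNOSTIC_UPDATE", "DIAGNOSTIC_UPDATE_TRANSITION"),
    ("S1", "JOB_DIAGNOSTIC_UPDATE", "DIAGNOSTIC_UPDATE_TRANSITION"),
    ("S0", "JOB_COUNTER_UPDATE", "COUNTER_UPDATE_TRANSITION"),
    ("S1", "JOB_COUNTER_UPDATE", "COUNTER_UPDATE_TRANSITION"),
    ("S2", "JOB_COUNTER_UPDATE", "COUNTER_UPDATE_TRANSITION"),
    ("S0", "HYPOTHETICAL_EVENT", "ONE_OFF_TRANSITION"),
    ("S1", "HYPOTHETICAL_EVENT", None),
]

summary = tally_transitions(example_arcs)
print(summary)
```

For the example arcs this gives 7 total, 3 distinct, and 5/2 for the duplicate column.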

 

JobImpl
| Count | Transition | Trigger (Event/State) | Comment |
|---|---|---|---|
| 12 | DIAGNOSTIC_UPDATE_TRANSITION | JobEventType.JOB_DIAGNOSTIC_UPDATE | Same for most named states (JobImpl) |
| 14 | COUNTER_UPDATE_TRANSITION | JobEventType.JOB_COUNTER_UPDATE | |
| 13 | INTERNAL_ERROR_TRANSITION | JobEventType.INTERNAL_ERROR | InternalErrorTransition extends InternalTerminationTransition by setting the error into the history string. |
| 5 | INTERNAL_REBOOT_TRANSITION | JobEventType.JOB_AM_REBOOT | InternalRebootTransition extends InternalTerminationTransition by setting the error in the history string. |
| 2 | TASK_ATTEMPT_COMPLETED_EVENT_TRANSITION | JobEventType.JOB_TASK_ATTEMPT_COMPLETED | |
| 2 | KilledDuringAbortTransition() | JobEventType.JOB_KILL | From FAIL_WAIT and FAIL_ABORT |
| 2 | JobAbortCompletedTransition() | JobEventType.JOB_ABORT_COMPLETED | From FAIL_ABORT and KILL_ABORT |

...