...

WIP distillation of thoughts on DCFT.

Python MapReduce Notes

Interesting that my implementation also had three levels: Master/Scheduler (Job), TaskWrapper (Task), and Task (TaskAttempt). I used a "nested rules" approach, but in reality it could likely have been collapsed into a single top-level rule set. I should mention that the rules for a MapTask and a ReduceTask are very similar. How do you modularize Task classes? Can you subclass a Task class? The table below does not include the RPC rules. It also does not account for the rules for the membership service (though those likely belong in a different module).

| Module           | Rule Count | Comments |
|------------------|------------|----------|
| Master/Scheduler | 1          | Basically just the rule to prevent deadlock by preempting tasks. |
| MapTaskSet       | 2          | For straggler reissue. |
| MapTask/Wrapper  | 5          |  |
| ReduceTaskSet    | 0          | Just pool management. |
| ReduceTask       | 4          |  |
| Total            | 12         | Server failure "event" rule is not included (i.e. the handler that sets the isAlive bit is not counted as a rule). "Event" rules for setting the status of an RPC are also not included. The RPC status is considered a state field. |
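On the question of whether a Task class can be subclassed: one possible way to factor out the rules that MapTask and ReduceTask share is a base class that owns the common rules, with each subclass appending its own. The following is only a sketch under assumptions, not the implementation described above. All names (`Task.rules`, `rule_reissue_on_server_failure`, `rule_complete_on_rpc_success`, `rule_mark_ready`, `maps_needed`) are hypothetical, and the rule counts here are illustrative and do not match the table.

```python
class Task:
    """Hypothetical base class holding rules common to map and reduce tasks."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.status = "pending"    # state field
        self.server_alive = True   # set by a failure "event" handler, not by a rule
        self.rpc_status = None     # RPC status is treated as a state field

    def rules(self):
        # Shared rules; subclasses extend this list.
        return [
            self.rule_reissue_on_server_failure,
            self.rule_complete_on_rpc_success,
        ]

    def rule_reissue_on_server_failure(self):
        # If the server running this task died, put the task back in the pool.
        if self.status == "running" and not self.server_alive:
            self.status = "pending"
            return True
        return False

    def rule_complete_on_rpc_success(self):
        # If the RPC reported success, mark the task as done.
        if self.status == "running" and self.rpc_status == "ok":
            self.status = "done"
            return True
        return False

    def step(self):
        """Evaluate each rule once, in order; return the names of rules that fired."""
        return [rule.__name__ for rule in self.rules() if rule()]


class MapTask(Task):
    """In this sketch a map task needs only the shared rules."""


class ReduceTask(Task):
    """Adds one reduce-specific rule on top of the shared ones."""

    def __init__(self, task_id, maps_needed):
        super().__init__(task_id)
        self.maps_needed = maps_needed
        self.maps_done = 0
        self.ready = False

    def rules(self):
        return super().rules() + [self.rule_mark_ready]

    def rule_mark_ready(self):
        # A reduce task becomes schedulable once all map outputs exist.
        if not self.ready and self.maps_done >= self.maps_needed:
            self.ready = True
            return True
        return False
```

With this layout a shared rule is written once, and the per-module rule count is just `len(task.rules())`. The same idea would also work with a flat, top-level rule set that dispatches on task type instead of using inheritance.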

 

Hadoop MapReduce Walkthrough

...