Cluster Custodian

If you notice a down machine that does not respond to rcreboot, ping the IRC user listed below for the current week.

  • Jonathan Ellithorpe – start: 11/07/2013
  • Arjun Gopalan
  • Ashish Gupta
  • Collin Lee
  • Behnam Montazeri
  • Henry Qin
  • Ankita Kejriwal
  • Diego Ongaro
  • Ryan Stutsman

The current custodian is responsible for restarting, debugging, and reimaging machines, and for generally keeping the cluster working. The outgoing custodian is also responsible for notifying the next week's custodian that it is their turn.
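
To work out whose week it is, the rotation can be sketched as below. This is a minimal sketch, not an official tool. It assumes that the list above is cycled in order, that each turn lasts exactly seven days, and that 11/07/2013 means November 7, 2013. The function name custodian_for is hypothetical.

```python
from datetime import date

# Rotation order copied from the list above.
CUSTODIANS = [
    "Jonathan Ellithorpe",
    "Arjun Gopalan",
    "Ashish Gupta",
    "Collin Lee",
    "Behnam Montazeri",
    "Henry Qin",
    "Ankita Kejriwal",
    "Diego Ongaro",
    "Ryan Stutsman",
]

# Assumption: "11/07/2013" is month/day/year, i.e. November 7, 2013.
ROTATION_START = date(2013, 11, 7)


def custodian_for(day):
    """Return the custodian on duty on `day`.

    Assumes one-week turns taken in list order, wrapping around
    to the first name after the last one.
    """
    if day < ROTATION_START:
        raise ValueError("date precedes the start of the rotation")
    weeks_elapsed = (day - ROTATION_START).days // 7
    return CUSTODIANS[weeks_elapsed % len(CUSTODIANS)]


if __name__ == "__main__":
    print(custodian_for(date.today()))
```

If custodians swap weeks or the list changes, the result will be wrong, so treat the list on this page as authoritative.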

Crashes

This page logs instances of dead machines in reverse chronological order; among other things, we are using it to track down the mysterious machine crashes that began in August 2011.

...