Cluster Custodian

If you notice a down machine that does not respond to rcreboot, ping the IRC user listed below for the current week.

  • 7/22/12: stutsman
  • 7/29/12: ankitak
  • 8/5/12: ongardie
  • 8/12/12: daeschli
  • 8/19/12: mendel
  • 8/26/12: ouster
  • 9/2/12: syang0
  • 9/9/12: satoshi

The current custodian is responsible for restarting, debugging, and reimaging machines and generally keeping the cluster working.
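
For anyone who wants to script the lookup (for example, from an IRC bot), the rotation above can be sketched as a small table. This is only an illustration, not an existing cluster tool: the function name `custodian_for` is hypothetical, and it assumes each listed date is the first day of a seven-day shift in 2012.

```python
from datetime import date, timedelta

# Shift start dates and IRC nicks copied from the schedule above.
# Assumption: "12" means 2012 and each shift lasts exactly seven days.
SCHEDULE = [
    (date(2012, 7, 22), "stutsman"),
    (date(2012, 7, 29), "ankitak"),
    (date(2012, 8, 5), "ongardie"),
    (date(2012, 8, 12), "daeschli"),
    (date(2012, 8, 19), "mendel"),
    (date(2012, 8, 26), "ouster"),
    (date(2012, 9, 2), "syang0"),
    (date(2012, 9, 9), "satoshi"),
]

SHIFT_LENGTH = timedelta(days=7)


def custodian_for(day):
    """Return the IRC nick on duty for `day`, or None if no shift covers it."""
    for start, nick in SCHEDULE:
        if start <= day < start + SHIFT_LENGTH:
            return nick
    return None


if __name__ == "__main__":
    # Look up who to ping on a given day.
    print(custodian_for(date(2012, 8, 7)))
```

Dates outside the listed weeks return None rather than guessing at who is next in the rotation.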

Crashes

This page logs instances of dead machines. We are using it to track down the mysterious machine crashes that began in August 2011.

  • June 15: rc37 (failed to reboot), rc38 (failed to reboot), rc79

...