The script scripts/recovery.py can be used to run recoveries for testing. Diego wrote the script and knows it best, but here are some simple instructions:
You should probably set up ssh master mode for each of the cluster nodes. Here is a shell script that you can run on rcmaster to do it:
#!/bin/bash
#
# This script sets up ssh master mode for all of the machines
# in the RAMCloud cluster.
# (bash is required for the {01..36} brace expansion and the == test.)
if [ "$(hostname)" == "rcmaster.scs.stanford.edu" ]; then
    for host in rc{01..36}; do
        # Start a master only if one is not already running for this host.
        if [ -z "$(pgrep -u "$USER" -fx "ssh -fMN $host true")" ]; then
            ssh -fMN $host true 2>/dev/null &
        fi
    done
fi
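To confirm that the masters actually came up, a small check along the following lines may help. This is only a sketch: the function name check_masters is made up for illustration, and it assumes that a ControlPath is configured for the cluster hosts in ~/.ssh/config, since "ssh -O check" needs it to find the control socket.

```shell
# Hypothetical helper (not part of the RAMCloud scripts): report which
# hosts currently have a live ssh control master.
# Assumption: ControlPath is set for these hosts in ~/.ssh/config.
check_masters() {
    for host in "$@"; do
        # "ssh -O check" exits with status 0 when a master is running.
        if ssh -O check "$host" >/dev/null 2>&1; then
            echo "$host: master running"
        else
            echo "$host: no master"
        fi
    done
}
```

In bash, running "check_masters rc{01..36}" on rcmaster would cover the same hosts as the setup script.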
...
- The log files for all of the servers involved in the recovery are placed in the directory recovery/latest. If you run more recoveries, recoverylogs/latest always refers to the most recent recovery, but log files from old recoveries are retained in other subdirectories of recovery.
- After running a recovery, you can run scripts/metrics.py, which will examine the logs in recoverylogs/latest and produce summary information describing the recovery.