Running Recoveries with recovery.py
The script scripts/recovery.py can be used to run recoveries for testing.
You should probably set up ssh master mode for each of the cluster nodes. Here is a shell script that you can run on rcmaster to do it:
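    #!/bin/bash
    #
    # This script sets up ssh master mode for all of the machines
    # in the RAMCloud cluster.
    # It runs under bash because the brace expansion rc{01..80} is not
    # available in plain POSIX sh.
    if [ "$(hostname)" = "rcmaster.scs.stanford.edu" ]; then
        for host in rc{01..80}; do
            # Start a master connection only if one is not already running.
            if [ -z "$(pgrep -u "$USER" -fx "ssh -fMN $host true")" ]; then
                ssh -fMN "$host" true 2>/dev/null &
            fi
        done
    fi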
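To confirm that the master connections are actually up, one option is ssh's `-O check` control command. The sketch below assumes a ControlPath is configured in your ssh config for the cluster hosts. The function name `check_masters` and the `SSH_CMD` override are hypothetical and not part of RAMCloud.

```shell
#!/bin/sh
# Hypothetical helper, not part of RAMCloud: report which hosts have a
# live ssh master connection.
# SSH_CMD is an assumed override hook so the ssh binary can be swapped out.
check_masters() {
    for host in "$@"; do
        # "ssh -O check" asks the master process for this host whether it
        # is alive; it exits 0 only if a master is running.
        if ${SSH_CMD:-ssh} -O check "$host" >/dev/null 2>&1; then
            echo "$host: master running"
        else
            echo "$host: no master"
        fi
    done
}

# Example: check_masters rc01 rc02 rc03
```

Any host reported as "no master" can be handled by rerunning the setup script, which skips hosts that already have a master process.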
- From a RAMCloud directory in which you have compiled the system, invoke scripts/recovery.py.
- This will run a recovery with a default configuration (currently as many masters and backups as the cluster can support). To try recoveries with different configurations, change the arguments passed to the recover method, which are specified at the very end of scripts/recovery.py.
- The log files for all of the servers involved in the recovery are placed in the directory recovery/latest. If you run more recoveries, logs/latest always refers to the most recent recovery, but log files from old recoveries are retained in other subdirectories of recovery.
- After running a recovery, you can run scripts/recoverymetrics.py, which will examine the logs in logs/latest and produce summary information describing the recovery.
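Because logs from earlier recoveries are retained alongside `latest`, it can help to list which older runs are still on disk. The following is a minimal sketch, assuming each run has its own subdirectory and that `latest` is an entry in the same parent directory. The function name `list_old_runs` is hypothetical, and the log directory is passed as an argument rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper, not part of RAMCloud: print the names of the
# per-run log subdirectories under the given directory, skipping the
# "latest" entry that refers to the most recent run.
list_old_runs() {
    for entry in "$1"/*; do
        [ -d "$entry" ] || continue          # skip plain files and an unmatched glob
        name=$(basename "$entry")
        [ "$name" = "latest" ] && continue   # "latest" only points at the newest run
        printf '%s\n' "$name"
    done
}

# Example: list_old_runs logs
```

The output includes the run that `latest` currently refers to, since that run also has its own subdirectory.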