The script scripts/recovery.py runs recoveries for testing. Diego wrote the script and knows it best, but here are some simple instructions:

  • You should probably set up ssh master mode for each of the cluster nodes. Here is a shell script that you can run on rcmaster to do it:

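    #!/bin/bash
    #
    # This script sets up ssh master mode for all of the machines
    # in the RAMCloud cluster. It needs bash rather than plain sh
    # because it relies on brace expansion ({01..36}).
    
    if [ "$(hostname)" = "rcmaster.scs.stanford.edu" ]; then
        for host in rc{01..36}; do
            # Start a master connection only if one is not already running.
            if [ -z "$(pgrep -u "$USER" -fx "ssh -fMN $host true")" ]; then
                ssh -fMN "$host" true 2>/dev/null &
            fi
        done
    fi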
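
To confirm that the master connections actually came up, something like the following sketch may help. It is not part of the RAMCloud scripts: the function name missing_masters and the SSH_CMD variable are hypothetical, and it assumes a ControlPath is configured for these hosts in your ssh configuration, since "ssh -O check" can only find a master through its control socket.

```shell
#!/bin/bash
# Hypothetical helper: print every host that has no live ssh master connection.
# SSH_CMD can be overridden, which allows a dry run without a cluster.
SSH_CMD="${SSH_CMD:-ssh}"

missing_masters() {
    local host
    for host in "$@"; do
        # "ssh -O check" asks an existing master process whether it is alive
        # and exits non-zero when no usable control socket is found.
        if ! "$SSH_CMD" -O check "$host" >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}
```

On rcmaster this could be called as "missing_masters rc{01..36}". Empty output would mean that every host answered the check.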
    

...

  • The log files for all of the servers involved in the recovery are placed in the directory recovery/latest. If you run more recoveries, recoverylogs/latest always refers to the most recent recovery, but log files from old recoveries are retained in other subdirectories of recovery.
  • After running a recovery, you can run scripts/metrics.py, which will examine the logs in recoverylogs/latest and produce summary information describing the recovery.
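
The log layout described above can be inspected with a small shell sketch such as the one below. It is only an illustration: the function name list_recovery_runs is hypothetical, the log directory is passed as an argument rather than hard-coded, and the code assumes that "latest" is a symbolic link to one of the sibling subdirectories, which the text above suggests but does not state.

```shell
#!/bin/bash
# Hypothetical helper: show which run "latest" refers to, then list older runs.
# Assumes "latest" is a symbolic link inside the given log directory.
list_recovery_runs() {
    local log_root="$1"
    local latest_target dir name
    if [ ! -L "$log_root/latest" ]; then
        echo "no 'latest' symlink under $log_root" >&2
        return 1
    fi
    latest_target="$(basename "$(readlink "$log_root/latest")")"
    echo "latest: $latest_target"
    for dir in "$log_root"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$dir" ] || continue
        name="$(basename "$dir")"
        # Skip the symlink itself and the run it points to.
        if [ "$name" != "latest" ] && [ "$name" != "$latest_target" ]; then
            echo "older: $name"
        fi
    done
}
```

Call it with whichever log directory your checkout uses, for example "list_recovery_runs recovery".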