How To Run Clusterperf
Clusterperf is a program that we use for running performance benchmarks using a RAMCloud cluster (as opposed to Perf, which is a program that runs simpler micro-benchmarks on a single machine, without a full RAMCloud cluster). Here are notes on how to use Clusterperf.
- The file scripts/clusterperf.py is the top-level driver that runs the performance benchmarks. It uses scripts/cluster.py to set up a RAMCloud cluster, then invokes the C++ program ClusterPerf on client machines to run benchmarks.
- You must first have compiled ClusterPerf ("make all" will do that) and set up passwordless SSH between your machines.
- Clusterperf assumes that you have used rcres to reserve a collection of hosts, and it will use these hosts to run the benchmark (you should typically reserve 10-20 hosts, depending on the benchmarks you run). If you are not using rcres to reserve hosts, or you wish to use only a subset of your reservation, then you must create a file scripts/localconfig.py that describes the machines available in your cluster. Below is a sample file:
# This file customizes the cluster configuration for John Ousterhout.
# It is automatically included by config.py.
hosts = []
for i in range(41,54):
    hosts.append(('rc%02d' % i, '192.168.1.%d' % (100 + i), i))
old_master_host = ('rc40', '192.168.1.140', 20)
- Each entry in the hosts array describes one machine available for running either a RAMCloud server, a client, or both. The entry contains 3 values: a host name suitable for use with ssh to run commands on that host, the host's IP address (needed to create some service locators), and an ID that is used to generate Ethernet addresses for some transports. The third element is only used when running with raw Ethernet transports; you probably don't need to set this element. The variable old_master_host is not needed by clusterperf, but it is used by other scripts such as scripts/recovery.py; it specifies a machine (not in hosts) to use during recovery tests as the master server that will be crashed and recovered.
- To run basic cluster tests, invoke the following command:
scripts/clusterperf.py basic
This will run the test named basic, which measures read and write latency and throughput using a single client and a single server.
- Clusterperf contains about a dozen performance tests. If you invoke scripts/clusterperf.py with no arguments, it will run all of the tests. To find out more about the tests, look in the file src/ClusterPerf.cc. There is one method in this file with the same name as each test, and the comments at the top of that method give a little bit of information about how the test works. The complete list of tests is defined by the table tests, which is declared just before the main function.
- To get a complete listing of command-line options, invoke scripts/clusterperf.py --help.
- Clusterperf uses the file scripts/cluster.py to start up the RAMCloud cluster; if you have trouble running clusterperf, you may need to look into this script to see how things work. In addition, the file scripts/config.py contains many other configuration options. In our environment at Stanford we don't typically have to customize the information in this file, but if your environment is different from ours, then you may need to change things in this file as well.
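Before starting a run, it can help to check that the hosts list in scripts/localconfig.py has the shape described above. The following is a minimal sketch rather than part of RAMCloud: it rebuilds the list from the sample file, checks that each entry holds a host name, a valid IP address, and an integer ID, and builds an ssh command that fails instead of prompting when passwordless SSH is not set up. The helper names check_hosts and ssh_probe_command are hypothetical; BatchMode is a standard OpenSSH option.

```python
import ipaddress

# Same construction as the sample localconfig.py: hosts rc41 through rc53.
hosts = []
for i in range(41, 54):
    hosts.append(('rc%02d' % i, '192.168.1.%d' % (100 + i), i))


def check_hosts(hosts):
    """Return a list of problems found in a hosts list (empty if none).

    Hypothetical helper for illustration; not part of the RAMCloud scripts.
    """
    problems = []
    seen_names = set()
    for index, entry in enumerate(hosts):
        # Each entry must be (ssh host name, IP address, numeric ID).
        if len(entry) != 3:
            problems.append('entry %d: expected 3 values, got %d'
                            % (index, len(entry)))
            continue
        name, ip, host_id = entry
        if name in seen_names:
            problems.append('entry %d: duplicate host name %r' % (index, name))
        seen_names.add(name)
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            problems.append('entry %d: invalid IP address %r' % (index, ip))
        if not isinstance(host_id, int):
            problems.append('entry %d: ID %r is not an integer'
                            % (index, host_id))
    return problems


def ssh_probe_command(entry):
    """Build an ssh command that exits with an error rather than asking
    for a password, so it can be used to test passwordless SSH."""
    return ['ssh', '-o', 'BatchMode=yes', entry[0], 'true']


if __name__ == '__main__':
    for problem in check_hosts(hosts):
        print(problem)
    # Print one probe command per host; run them by hand or via subprocess.
    for entry in hosts:
        print(' '.join(ssh_probe_command(entry)))
```

The sketch only prints the probe commands instead of running them, so it is safe to try on a machine that cannot reach the cluster.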
Defining new clusterperf benchmarks
If you'd like to create a new benchmark in the clusterperf suite, you must modify both scripts/clusterperf.py and src/ClusterPerf.cc. There are comments at the beginning of each of these files describing what you must do to add a new test. In addition, take a look at how existing tests work to get ideas for how to create new tests.