Setting Up a RAMCloud Cluster

This page describes how to compile RAMCloud and set up a production cluster.

Downloading and Compiling RAMCloud

Create a read-only clone of the RAMCloud repository and compile RAMCloud, as described in General Information for Developers. At this point it probably makes sense to work with the most recent git commit, in order to get the latest features and bug fixes.

Configuring a Cluster

In order to run a production RAMCloud cluster, you will need to set up three kinds of servers:

  1. Storage servers. These implement most of the RAMCloud functionality. A RAMCloud server typically contains a master, which manages local DRAM to store portions of the RAMCloud key-value store, and a backup, which manages redundant information stored on disk or flash to recover from crashes. You must start up a storage server on each server whose memory will be incorporated into the RAMCloud cluster. A given server can only manage a single backup device; if you want to use multiple backup devices on a single node, you'll start one server with a master and backup, plus additional servers that only contain the backup component.
  2. Coordinator. At any given time there is one machine serving as cluster coordinator. This machine manages overall cluster configuration information (such as which storage servers handle which tablets), and it coordinates recovery when storage servers crash. You will typically start 2-3 instances of the coordinator. At a given time, only one of them is active, but if it crashes one of the others will immediately take over for it. You can run a cluster with only a single coordinator, but if this coordinator crashes the cluster will be unavailable until you restart this coordinator or start a new one someplace else.
  3. External storage (ZooKeeper or LogCabin). The coordinator stores top-level cluster configuration information (such as the set of servers currently participating in the cluster) in an external storage system. RAMCloud can support different external storage systems; right now ZooKeeper and LogCabin are supported. Thus, you will need to start a ZooKeeper or LogCabin cluster in order to run RAMCloud. You can run a RAMCloud cluster without an external storage system, but if the coordinator crashes then all of the data stored in the cluster will be lost. Using LogCabin currently requires building with 'make LOGCABIN=yes', and you'll first need the LogCabin submodule checked out ('git submodule update --init --recursive') and compiled ('make logcabin' or 'cd logcabin; scons').

Starting ZooKeeper or LogCabin

Read the ZooKeeper documentation or LogCabin documentation for information on how to do this. If you have an existing ZooKeeper/LogCabin cluster, RAMCloud should be able to share that cluster with other uses: RAMCloud stores all of its information underneath the /ramcloud znode.

Starting the Coordinator

In normal use you should probably start 2-3 instances of the coordinator on different machines. They will decide among themselves which one is initially active; the others will wait in standby mode until the active coordinator crashes. The coordinator does not typically use a lot of resources, so you can run it on the same nodes that run ZooKeeper/LogCabin, and you should be able to run a storage server on that node as well. If you have compiled RAMCloud in the standard fashion, the RAMCloud binaries will be in the subdirectory obj.master. Start the coordinator with a command line like this:

obj.master/coordinator -C infrc:host=`hostname -s`,port=11100 -x zk:rcmaster:2181

For production use, you should probably write a shell script that runs this command and then immediately restarts the coordinator if it terminates for any reason. The -C argument gives the service locator for the coordinator. This indicates how other machines will communicate with the coordinator. In this example the coordinator will use Infiniband reliable connections for communication, and TCP port 11100 will be used to set up those connections. For details on service locators, see Service Locators. The -x argument tells the coordinator where it should store its configuration information; the "zk:" prefix indicates that ZooKeeper will be used for external storage (or "lc:" for LogCabin), and the remainder of the argument is a comma-separated list of ZooKeeper/LogCabin server addresses. If you omit this argument, the coordinator will not store its configuration information externally, so a coordinator crash will cause all of the information in the cluster to be lost.
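
A restart wrapper of the kind suggested above might look roughly like the following sketch. The function name supervise and the MAX_RUNS and RESTART_DELAY variables are illustrative assumptions, not part of RAMCloud. MAX_RUNS defaults to 0 (restart forever) and exists mainly so the loop can be bounded when trying the script out.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RAMCloud): run a command and
# restart it every time it exits, whatever the exit status.
#   MAX_RUNS      stop after this many runs; 0 or unset = never stop
#   RESTART_DELAY seconds to wait before each restart (default 1)
supervise() {
    runs=0
    while :; do
        "$@"
        rc=$?
        runs=$((runs + 1))
        echo "$(date): '$1' exited with status $rc (run $runs)" >&2
        if [ "${MAX_RUNS:-0}" -gt 0 ] && [ "$runs" -ge "$MAX_RUNS" ]; then
            return "$rc"
        fi
        sleep "${RESTART_DELAY:-1}"
    done
}

# Example invocation, using the command line shown earlier:
#   supervise obj.master/coordinator -C infrc:host=`hostname -s`,port=11100 -x zk:rcmaster:2181
```

The default one-second delay is a design choice to avoid a tight loop if the coordinator fails at startup; set RESTART_DELAY=0 for an immediate restart.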

Run the coordinator with the --help option to get a complete list of command-line options. Here are a few of the most useful options:

--clusterName
It is possible to have several RAMCloud clusters running at the same time, sharing the same ZooKeeper/LogCabin servers. This option indicates which cluster the current coordinator is associated with (it also determines where the cluster's configuration information is stored in ZooKeeper/LogCabin). The default is "main".
--logFile
Log messages will be written to this file. If this argument is omitted, log messages will be written to standard output.
--reset
Causes the coordinator to discard any existing configuration information in external storage, starting a new cluster from scratch. Any existing data for the current cluster will be lost. Use with extreme caution; also requires special handling on storage servers to drop all existing backup data.
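
As an illustration of how these options combine with the -C and -x arguments, the sketch below assembles a coordinator command line for a named cluster with several ZooKeeper servers. The function name coordinator_cmd, the host names, and the log path are hypothetical, and it assumes that --clusterName and --logFile take their value as the following word, as --totalMasterMemory does for the storage server. The function only prints the command so that it can be inspected before being run.

```shell
#!/bin/sh
# Hypothetical helper: print a coordinator command line.
# Usage: coordinator_cmd HOST CLUSTER LOGFILE ZK_ADDR [ZK_ADDR ...]
coordinator_cmd() {
    host=$1
    cluster=$2
    log_file=$3
    shift 3
    # Join the remaining arguments with commas to form the
    # comma-separated server list that follows the "zk:" prefix.
    zk_list=$(IFS=,; echo "$*")
    echo "obj.master/coordinator -C infrc:host=$host,port=11100" \
         "-x zk:$zk_list --clusterName $cluster --logFile $log_file"
}

# Example:
#   coordinator_cmd rc01 test /var/log/coordinator.log zk1:2181 zk2:2181
```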

Starting Storage Servers

Each storage server should be started with a command like the following:

obj.master/server -L infrc:host=`hostname -s`,port=1101 -x zk:rcmaster:2181 --totalMasterMemory 16000 -f /dev/sda2 --segmentFrames 10000 -r 2

As with the coordinator, in a production setting you should probably run each server with a shell script that restarts the server if it should terminate or crash. The meaning of the command-line switches is as follows:

-L infrc:host=`hostname -s`,port=1101
Service locator for this server: indicates how other machines should communicate with the server. See Service Locators for details.
-x zk:rcmaster:2181
Has the same meaning as the corresponding coordinator argument: specifies the external storage server that is used for cluster configuration information. The storage server uses this information to locate the coordinator for the cluster.
--totalMasterMemory 16000
Total amount of DRAM this storage server should use for RAMCloud data, in MBytes (16GB in this example). The storage server will consume additional memory beyond this for various metadata purposes, and you should ensure that server nodes never have to page, so the total amount of memory on the node should probably be 1-2GB larger than this value.
-f /dev/sda2
Specifies a file or raw device for the backup to use for storing backup data.

WARNING: the file or device will be treated as a block device, and any unrelated existing data on it will be overwritten.


--segmentFrames 10000
The amount of space available on the backup device, specified in units of 8MB segments (in this example, the total storage available is 80GB). If a server uses M bytes of DRAM and the replication factor is R, then it should typically have about 2MR bytes of space in backup storage; any value lower than this may cause the cluster to eventually fail to service write requests.
-r 2
Replication factor for RAMCloud data. In this example, 2 backup copies will be kept on secondary storage for each object in memory. A replication factor of 2 is probably reasonably safe, given that there is also a copy in DRAM; 3 is conservative. All of the storage servers in the cluster should use the same replication factor.
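
The 2MR sizing rule for --segmentFrames can be turned into a quick calculation. The sketch below is only an illustration: the function name min_segment_frames is hypothetical, and it assumes that the MBytes of --totalMasterMemory and the 8MB segment size use the same unit.

```shell
#!/bin/sh
# Hypothetical helper: smallest --segmentFrames value that satisfies the
# 2*M*R rule of thumb.
# Usage: min_segment_frames MASTER_MB REPLICAS [SEGMENT_MB]
min_segment_frames() {
    master_mb=$1
    replicas=$2
    segment_mb=${3:-8}                           # segment size in MB (8 by default)
    backup_mb=$((2 * master_mb * replicas))      # recommended backup space, in MB
    # Round up to a whole number of segments.
    echo $(( (backup_mb + segment_mb - 1) / segment_mb ))
}

# Example: min_segment_frames 16000 2   prints 8000
```

For the example command line (16000 MB of master memory, replication factor 2) the rule gives 64000 MB, or 8000 segments, so the 10000 frames configured there leave some headroom.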

Run the server with the --help option to get a complete list of command-line options; here are a few other options that are occasionally useful:

--backupOnly
If this option is specified, the server will act as a backup only. This option is used to incorporate additional backup devices on a single node. In this case, options such as -r and --totalMasterMemory are irrelevant and can be omitted.
--clusterName
If multiple RAMCloud clusters exist at the same time, this indicates which cluster this server should be part of. Defaults to "main".
--logFile
Log messages will be written to this file. If this argument is omitted, log messages will be written to standard output.

Allow replication to a local backup (i.e., data and its backup can be on the same server).
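
To illustrate the multiple-backup-device case, the sketch below prints one server command per device on a node: the first server runs a master and a backup, and each additional server is started with --backupOnly. The function name server_cmds is hypothetical, and giving each server on the node its own port (counting up from the base port) is an assumption made here so that the service locators are distinct. The remaining values are copied from the example command line above.

```shell
#!/bin/sh
# Hypothetical helper: print one storage-server command per backup device.
# Usage: server_cmds HOST BASE_PORT SEGMENT_FRAMES DEVICE [DEVICE ...]
server_cmds() {
    host=$1
    port=$2
    frames=$3
    shift 3
    first=yes
    for dev in "$@"; do
        common="obj.master/server -L infrc:host=$host,port=$port -x zk:rcmaster:2181"
        if [ "$first" = yes ]; then
            # First server on the node: master plus backup.
            echo "$common --totalMasterMemory 16000 -f $dev --segmentFrames $frames -r 2"
            first=no
        else
            # Additional servers: backup component only.
            echo "$common --backupOnly -f $dev --segmentFrames $frames"
        fi
        port=$((port + 1))   # assumed: one port per server on the node
    done
}

# Example:
#   server_cmds rc01 1101 10000 /dev/sda2 /dev/sdb1
```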