...

In normal use you should probably start 2-3 instances of the coordinator on different machines. They will decide among themselves which one is initially active; the others will wait in standby mode until the active coordinator crashes. The coordinator does not typically use a lot of resources, so you can run it on the same nodes that run ZooKeeper, and you should be able to run a storage server on that node as well. If you have compiled RAMCloud in the standard fashion, the RAMCloud binaries will be in the subdirectory obj.master. Start the coordinator with a command line like this:

...

For production use, you should probably write a shell script that runs this command and then immediately restarts the coordinator if it terminates for any reason. The -C argument gives the service locator for the coordinator, which indicates how other machines will communicate with the coordinator. In this example the coordinator will use Infiniband reliable connections for communication, and TCP port 11100 will be used to set up those connections. For details on service locators, see Service Locators. The -x argument tells the coordinator where it should store its configuration information; the "zk:" prefix indicates that ZooKeeper will be used for external storage, and the remainder of the argument is a comma-separated list of ZooKeeper server addresses. If you omit this argument, the coordinator will not store its configuration information externally, so a coordinator crash will cause all of the information in the cluster to be lost.
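The restart script suggested above could be sketched roughly as follows. This is a minimal illustration, not part of RAMCloud: the function name restart_forever and the variables MAX_RUNS and RESTART_DELAY are hypothetical, and the one-second default pause is an assumption added to avoid a tight crash loop (set RESTART_DELAY=0 for an immediate restart).

```shell
#!/bin/sh
# restart_forever CMD [ARGS...]
# Runs CMD and starts it again every time it exits, for any reason.
# MAX_RUNS (hypothetical): stop after this many runs; 0 or unset = no limit.
# RESTART_DELAY (hypothetical): seconds to wait before restarting; default 1.
restart_forever() {
    if [ "$#" -eq 0 ]; then
        echo "usage: restart_forever command [args...]" >&2
        return 2
    fi
    runs=0
    while :; do
        "$@"
        status=$?
        runs=$((runs + 1))
        # Record each exit on stderr so crashes are visible to the operator.
        echo "$(date '+%Y-%m-%d %H:%M:%S') '$1' exited with status $status (run $runs)" >&2
        if [ "${MAX_RUNS:-0}" -gt 0 ] && [ "$runs" -ge "$MAX_RUNS" ]; then
            return "$status"
        fi
        sleep "${RESTART_DELAY:-1}"
    done
}

# Example (not executed here): pass the full command line as arguments, e.g.
#   restart_forever obj.master/server -L ... -x ... (remaining options)
```

The command to supervise is passed as arguments, so the same wrapper can be used for the coordinator and for the storage servers described below.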

Run the coordinator with the --help option to get a complete list of command-line options. Here are a few of the most useful options:

...

--clusterName
It is possible to have several RAMCloud clusters running at the same time, sharing the same ZooKeeper servers. This option indicates which cluster the current coordinator is associated with (it also determines where the cluster's configuration information is stored in ZooKeeper). The default is "main".
--logFile
Log messages will be written to this file. If this argument is omitted, log messages will be written to standard output.
--reset
Causes the coordinator to discard any existing configuration information in external storage, starting a new cluster from scratch. Any existing data for the current cluster will be lost. Use only with extreme caution; also requires special handling on storage servers to drop all existing backup data.

Starting Storage Servers

Each storage server should be started with a command like the following:

obj.master/server -L infrc:host=`hostname -s`,port=1101 -x zk:rcmaster:2181 --totalMasterMemory 16000 -f /dev/sda2 --segmentFrames 10000 -r 2

As with the coordinator, in a production setting you should probably run each server with a shell script that restarts the server if it should terminate or crash. The meaning of the command-line switches is as follows:

-L infrc:host=`hostname -s`,port=1101
Service locator for this server: indicates how other machines should communicate with the server. See Service Locators for details.
-x zk:rcmaster:2181
Has the same meaning as the corresponding coordinator argument: specifies the external storage server that is used for cluster configuration information. The storage server uses this information to locate the coordinator for the cluster.
--totalMasterMemory 16000
Total amount of DRAM this storage server should use for RAMCloud data, in MBytes (16GB in this example). The storage server will consume additional memory beyond this for various metadata purposes, and you should ensure that server nodes never have to page, so the total amount of memory on the node should probably be 1-2GB larger than this value.
-f /dev/sda2
Specifies a file or raw device for the backup to use for storing backup data.
--segmentFrames 10000
The amount of space available on the backup device, specified in units of 8MB segments (in this example, the total storage available is 80GB). If a server uses M gigabytes of DRAM and the replication factor is R, then it should typically have about 2MR gigabytes of space in backup storage.
-r 2
Replication factor for RAMCloud data. In this example, 2 backup copies will be kept on secondary storage for each object in memory. A replication factor of 2 is probably reasonably safe, given that there is also a copy in DRAM; 3 is conservative. All of the storage servers in the cluster should use the same replication factor.
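The sizing guidance above (about 2MR of backup space, 8MB segments, 1-2GB of extra node memory) can be turned into a quick calculation. The following is only a sketch: the function names are hypothetical, and it assumes 1GB = 1000MB, which is how the example above treats 16000 MBytes as 16GB.

```shell
#!/bin/sh
# backup_segment_frames MASTER_MB REPLICAS
# Prints a --segmentFrames value giving about 2*M*R of backup space,
# rounded up to a whole number of 8MB segments.
backup_segment_frames() {
    master_mb=$1
    replicas=$2
    segment_mb=8
    echo $(( (2 * master_mb * replicas + segment_mb - 1) / segment_mb ))
}

# node_memory_mb MASTER_MB [HEADROOM_MB]
# Prints the suggested total node memory in MB: the --totalMasterMemory
# value plus headroom (default 2000MB, the upper end of the 1-2GB range).
node_memory_mb() {
    echo $(( $1 + ${2:-2000} ))
}

backup_segment_frames 16000 2   # prints 8000
node_memory_mb 16000            # prints 18000
```

For the example server (16000 MBytes of DRAM, replication factor 2) the rule of thumb gives 64GB, or 8000 segment frames, so the 10000 frames used in the example command leave some margin above it.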

Run the server with the --help option to get a complete list of command-line options; here are a few other options that are occasionally useful:

 
--backupOnly
If this option is specified, the server will act as a backup only. This option is used to incorporate additional backup devices from a single node. In this case, options such as -r and --totalMasterMemory are irrelevant and can be omitted.
--clusterName
If multiple RAMCloud clusters exist at the same time, this indicates which cluster this server should be part of. Defaults to "main".
--logFile
Log messages will be written to this file. If this argument is omitted, log messages will be written to standard output.
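As an illustration of --backupOnly, the helper below prints a command line for an extra backup-only server process that serves a second device on the same node. It is a sketch built only from the options described above: the function name, the device /dev/sdb2, and port 1102 are hypothetical, and it assumes the extra process needs its own port in its service locator.

```shell
#!/bin/sh
# backup_only_command DEVICE PORT FRAMES [HOST]
# Prints (does not run) a server command line for a backup-only process.
# -r and --totalMasterMemory are left out because they do not apply here.
backup_only_command() {
    device=$1
    port=$2
    frames=$3
    host=${4:-$(hostname -s)}
    echo "obj.master/server -L infrc:host=$host,port=$port -x zk:rcmaster:2181 -f $device --segmentFrames $frames --backupOnly"
}

# Hypothetical second device and port on the same node:
backup_only_command /dev/sdb2 1102 10000
```

Printing the command rather than running it makes it easy to review the line before handing it to a restart script.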