backup

Each server in the RAMCloud system serves two roles: it is master for some objects (it stores those objects in its DRAM and handles reads and writes for them), and it also serves as backup for other objects. A backup server is responsible for storing information on disk as directed by masters, and for retrieving that information from disk during crash recovery. Each object in RAMCloud is typically backed up on several machines; each master divides its data among many different backup machines; and each backup records information for many masters.

client machine

A machine running one or more applications that use the RAMCloud storage system. The applications normally use a client library package to communicate with the RAMCloud servers. Client machines cannot necessarily be trusted by the RAMCloud servers, and the RAMCloud system does not depend on any particular behavior of client machines.

client library

A collection of functions used by applications to access the RAMCloud storage system. The client library may include significant functionality that extends the base functions provided by the RAMCloud servers. For example, the client library will probably understand the contents of stored objects, whereas the servers treat the objects as opaque blobs. It is possible for there to be different client libraries that implement different abstractions on top of the base RAMCloud features; examples might be a memcached API, a full relational model, or a file system API.

coordinator

A distinguished server, which manages the other servers in the RAMCloud cluster. Some of the coordinator functions are:

  • The coordinator manages a list of all active servers in the cluster.
  • The coordinator contains configuration information describing which servers contain which tablets; client machines retrieve this information to manage their own caches of configuration information.
  • The coordinator manages access control information, which it makes available to other servers in the cluster.
  • The coordinator is responsible for deciding that a server has crashed and initiating recovery of that server.
  • The coordinator is responsible for moving data between servers in the cluster in order to balance load.
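The tablet-configuration role in the list above can be pictured as a lookup table that clients copy and consult locally. The following is only a minimal sketch: the names `Tablet`, `Coordinator`, `ClientCache`, and `locate` are hypothetical, and it assumes a tablet is a contiguous range of keys within one table, which this glossary does not define.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    """Assumed shape of a tablet: a key range of one table held by one server."""
    table_id: int
    first_key: int  # inclusive
    last_key: int   # inclusive
    server: str


class Coordinator:
    """Holds the authoritative tablet configuration (hypothetical API)."""

    def __init__(self):
        self.tablets = []

    def add_tablet(self, tablet):
        self.tablets.append(tablet)

    def get_tablet_map(self):
        # Hand out a copy so clients manage their own cache.
        return list(self.tablets)


class ClientCache:
    """Client-side cache of the configuration retrieved from the coordinator."""

    def __init__(self, coordinator):
        self.coordinator = coordinator
        self.tablets = []
        self.refreshes = 0

    def refresh(self):
        self.tablets = self.coordinator.get_tablet_map()
        self.refreshes += 1

    def _lookup(self, table_id, key):
        for t in self.tablets:
            if t.table_id == table_id and t.first_key <= key <= t.last_key:
                return t.server
        return None

    def locate(self, table_id, key):
        # Try the local cache first; on a miss, refetch once from the coordinator.
        server = self._lookup(table_id, key)
        if server is None:
            self.refresh()
            server = self._lookup(table_id, key)
        return server
```

In this sketch a client contacts the coordinator only when its cached configuration cannot answer a lookup, so repeated lookups for known tablets stay local.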

first-stage recovery

The period of time immediately after a server crash, during which that server's data is unavailable. During this stage of recovery, backups read data from their disks into memory, and one or more replacement servers retrieve enough data from the backups to resume system operation.

index

Used to provide efficient forms of lookup on data within tables. Each table may have any number of indexes associated with it; each index maps from keys in some form (strings, numbers, timestamps, etc.) to a set of objects within the table. Indexes support range lookups as well as exact matches.

key

A 64-bit identifier that names an object within its table. By default keys are assigned sequentially for each table by the RAMCloud system starting at 1 and are never reused; however, applications can choose keys explicitly if they wish, in which case they may also be reused.
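The two ways of choosing keys described above can be illustrated with a small sketch. The `Table` class and its `insert`, `write`, and `delete` methods are hypothetical names invented for this example, not the RAMCloud API.

```python
class Table:
    """Toy table showing system-assigned versus application-chosen keys."""

    MAX_KEY = 2**64 - 1  # keys are 64-bit identifiers

    def __init__(self):
        self.objects = {}
        self.next_key = 1  # sequential assignment starts at 1

    def insert(self, value):
        """Store value under the next system-assigned key and return that key."""
        key = self.next_key
        self.next_key += 1  # the counter only moves forward, so keys are never reused
        self.objects[key] = value
        return key

    def write(self, key, value):
        """Store value under a key chosen explicitly by the application."""
        if not 0 <= key <= self.MAX_KEY:
            raise ValueError("key must fit in 64 bits")
        self.objects[key] = value

    def delete(self, key):
        del self.objects[key]
```

Deleting an object does not return its key to the allocator, so a later `insert` still receives a fresh key, whereas `write` lets an application reuse a key it chose itself. How the two modes interact when mixed on one table is not specified in this glossary, and the sketch does not try to resolve it.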

log

Used by a master to hold object data. Each log consists of an ordered collection of segments. Logs are used in an append-only fashion: the contents of a segment are never modified once written. A single master may potentially have multiple logs (e.g., different logs might correspond to different degrees of replication).
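The append-only behavior described above might look like the following sketch. The `Log` and `Segment` classes, their method names, and the (segment index, offset) return value are assumptions made for illustration; the 8 MByte default comes from the segment entry in this glossary.

```python
SEGMENT_SIZE = 8 * 1024 * 1024  # 8 MBytes, per the segment entry


class Segment:
    """A fixed-capacity buffer that only ever grows at its end."""

    def __init__(self, capacity=SEGMENT_SIZE):
        self.capacity = capacity
        self.data = bytearray()

    def free_space(self):
        return self.capacity - len(self.data)

    def append(self, entry):
        offset = len(self.data)
        self.data += entry  # existing bytes are never overwritten
        return offset


class Log:
    """An ordered collection of segments; new data goes to the last one."""

    def __init__(self, segment_size=SEGMENT_SIZE):
        self.segment_size = segment_size
        self.segments = [Segment(segment_size)]

    def append(self, entry):
        if len(entry) > self.segment_size:
            raise ValueError("entry larger than a segment")
        head = self.segments[-1]
        if head.free_space() < len(entry):
            # The current segment is full: open a new one instead of modifying it.
            head = Segment(self.segment_size)
            self.segments.append(head)
        offset = head.append(entry)
        return len(self.segments) - 1, offset

    def read(self, segment_index, offset, length):
        segment = self.segments[segment_index]
        return bytes(segment.data[offset:offset + length])
```

A tiny segment size makes the rollover easy to see: with `Log(segment_size=16)`, a 10-byte entry followed by an 8-byte entry ends up in two different segments.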

master

Each object lives in the DRAM of a particular server, which has primary responsibility for managing the object. That server is called the master for the object. All reads and writes of an object must be directed to the master server for that object. Each server in the RAMCloud system is master for many different objects.

object

The basic unit of data stored in the RAMCloud system. Each object is named with a key that uniquely identifies the object within its table. Objects are variable-length up to a limit of 1 MB, and the RAMCloud servers do not interpret the contents of objects: they are just opaque blobs of data. Each object has a 64-bit version number that increases monotonically whenever the object is modified.

second-stage recovery

The second part of the time required to recover from a crashed server. During this stage the crashed server's data is available so the system can continue servicing requests, but the recovered data may be spread among multiple backup servers, resulting in higher overheads for accessing the data. During the second stage of recovery the system reorganizes the recovered data so that it is no longer occupying resources on the backup servers. After the second stage of recovery completes, the system is back to normal operation.

segment

A fixed-size portion of a log (currently 8 MBytes). Segments are the unit of backup: each segment exists in the memory of its master and is also replicated on one or more backups. Different segments within a log are typically backed up on different servers. The segment size is chosen so that full-segment writes to disk utilize 90% or more of the maximum disk bandwidth. The segment is also the unit of log cleaning: when most of the data in a segment has been deleted, the master can copy the remaining live data to another segment and delete the old segment.
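Log cleaning as described above can be sketched as follows. This is a simplified illustration rather than the RAMCloud cleaner: the `Segment` class and `clean` function are hypothetical, liveness is tracked per entry instead of per byte, and the 0.5 threshold is just one assumed reading of "most of the data has been deleted".

```python
class Segment:
    """Toy segment: the entries as written, plus the keys that are still live."""

    def __init__(self):
        self.entries = {}  # key -> value, as originally appended
        self.live = set()  # keys whose data has not been deleted or overwritten

    def live_fraction(self):
        if not self.entries:
            return 0.0
        return len(self.live) / len(self.entries)


def clean(segments, threshold=0.5):
    """Copy live data out of mostly-dead segments and drop those segments.

    Returns the new list of segments: untouched ones first, then a survivor
    segment holding the relocated live entries (if there were any).
    """
    survivor = Segment()
    kept = []
    for seg in segments:
        if seg.live_fraction() < threshold:
            # Relocate only the live entries; the old segment is then discarded.
            for key in seg.live:
                survivor.entries[key] = seg.entries[key]
                survivor.live.add(key)
        else:
            kept.append(seg)
    if survivor.entries:
        kept.append(survivor)
    return kept
```

The old segment is never edited in place, which matches the append-only rule for logs: live data moves to another segment and the old one is dropped as a whole.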

server

One of the machines implementing the RAMCloud storage system. Server machines are "owned" by RAMCloud: they only execute trusted RAMCloud code. Server machines execute RPC requests coming from clients, and also communicate among each other to manage the RAMCloud system.

shard

In order to speed up recovery, each master spreads its data across multiple backups; during recovery, the backups can all retrieve their respective portions of the data in parallel. The portion of a master's data that is assigned to a single backup is called a shard. A master's data will typically divide into hundreds or thousands of shards; furthermore, each shard is typically backed up on more than one machine, to provide safety against multiple crashes. Shards are divided into segments, which are in turn divided into blocks.

table

Used to group related objects and to separate data from different applications. Objects are named using a table identifier and a key within the table. Access control information is based on tables, and indexes are associated with particular tables.

version number

An integer value associated with each object, which starts at 0 when the object is created and is incremented every time the object is modified. Used to implement atomic operations on the object.
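One way version numbers can support atomic operations is a conditional write: the write succeeds only if the object's version still matches the one the caller read. The sketch below is an illustration under that assumption; `ObjectStore`, `VersionMismatch`, `expected_version`, and `atomic_increment` are hypothetical names, not the RAMCloud API.

```python
class VersionMismatch(Exception):
    """Raised when an object changed since the caller last read it."""


class ObjectStore:
    """Toy store mapping key -> (version, value)."""

    def __init__(self):
        self.objects = {}

    def create(self, key, value):
        self.objects[key] = (0, value)  # versions start at 0 on creation

    def read(self, key):
        return self.objects[key]  # (version, value)

    def write(self, key, value, expected_version=None):
        version, _ = self.objects[key]
        if expected_version is not None and expected_version != version:
            raise VersionMismatch(f"expected {expected_version}, found {version}")
        new_version = version + 1  # incremented on every modification
        self.objects[key] = (new_version, value)
        return new_version


def atomic_increment(store, key):
    """Read-modify-write loop that retries if another writer got in first."""
    while True:
        version, value = store.read(key)
        try:
            return store.write(key, value + 1, expected_version=version)
        except VersionMismatch:
            continue  # the object changed underneath us; reread and retry
```

A writer holding a stale version is rejected instead of silently overwriting a newer value, which is what lets a client build read-modify-write operations without holding a lock on the server.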
