Here's a collection of terms we're starting to use broadly.

  • Table - A scoping of keys within the system. All objects live in one table and all indices are associated with one table.
  • Primary Key - The system-generated key that uniquely identifies an object within a table.
  • Object ID - A (key, version) tuple.
  • Index - A lookup structure for arbitrary keys to object identifiers. Can be range-queryable.
  • Key - A typed index into an index structure (types may include string, int, float, etc).
  • Version Number - A number associated with a primary key that is system-incremented when an object is modified.
  • Shard - A subset of a table's key space. Shards are sized for efficient disk access (i.e., a shard can be read into memory from disk within a short time after a failure).
  • Shardlet - Shards are broken up into shardlets, which are akin to LFS segments. Each shardlet is written to disk sequentially and sized so that it can be accessed efficiently in light of disk seek times.
  • Block - A shardlet is composed of multiple fixed-size blocks. Blocks are the main unit of memory allocation in the system. They store one or more objects or portions of objects and are mapped into shardlets by an explicit table structure (inode).
  • Inode - Each object is mapped to one or more blocks by an inode mapping. An object may start partway through the first block and end partway through a last block. In either case, space may be shared with another object. An inode contains metadata including an object checksum, a version, and (index name, key) tuples.

Needs merging with:

This page is an attempt to create an official, unambiguous set of terms to be used throughout RAMCloud discussions, texts, and code. These terms are listed here to minimize the confusion and overhead that goes along with constantly redefining vocabulary, but of course these terms and definitions should evolve over time.

The glossary is split up into sections to make it easier to find terms. The partitions don't play out entirely cleanly, though, and (circular) references are all over the place. An alphabetical list with no partitions might be better. Ideas?

Machine Types

Client

A client machine has a library that queries servers on behalf of an application.

Server

A server machine services queries from clients.

Master Server

A master server is in charge of handling requests for a set of shards. A master server is usually also a backup server for another set of shards.

Backup Server

A backup server is in charge of backing up a set of shards, usually from various master servers. A backup server is usually also a master server for another set of shards.

Master

The master for an object is the master server for the shard which contains the object.

Data Types

Blob

A blob is some opaque, binary data stored in the system.

Object

An object is an identifiable container for a blob. It is identified by a table and a primary key. The object contains a checksum for the blob to verify its integrity. The object also keeps a running version number, referring to the revision of the blob stored.

Table

A table is a logical grouping of objects which share a set of indexes.

Primary key

A primary key is a system-assigned 64-bit integer (which will never be reused?). Together with a table, it identifies an object.

Version Number

The version number of an object is an integer that refers to the revision of its blob. It is used by the overwrite request, which takes the expected version number as a parameter: if the server finds that the given version number is out of date, it aborts the overwrite request.

Shard

A shard is a set of objects corresponding to a contiguous region of primary keys of a table. An object is a member of exactly one shard.
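
As a rough illustration of how a contiguous key range maps an object to exactly one shard (and from there to its master), here is a minimal sketch. The shard boundaries, the server names, and the functions `shard_for_key` and `master_for_key` are all hypothetical; nothing here is an actual RAMCloud interface.

```python
from bisect import bisect_right

# Hypothetical layout for one table: each shard covers a contiguous range of
# primary keys and is identified by the lowest key it contains.
SHARD_STARTS = [0, 1000, 5000, 20000]      # sorted lower bounds (assumed values)
SHARD_MASTERS = ["m1", "m2", "m1", "m3"]   # master server per shard (assumed names)


def shard_for_key(primary_key: int) -> int:
    """Return the index of the single shard whose range contains primary_key."""
    if primary_key < SHARD_STARTS[0]:
        raise KeyError(primary_key)
    # The shard is the last one whose lower bound is <= primary_key.
    return bisect_right(SHARD_STARTS, primary_key) - 1


def master_for_key(primary_key: int) -> str:
    """Return the master server for the shard that contains primary_key."""
    return SHARD_MASTERS[shard_for_key(primary_key)]
```

Because the ranges are contiguous and non-overlapping, a binary search over the lower bounds is enough to find the shard, and every key lands in exactly one.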

Block

A block is a fixed-size memory allocation unit on a server, and much of a server's memory will be treated as an array of blocks. Blobs are stored in a set of blocks. While the blob need not start or end on block boundaries, it must fully utilize any blocks in the middle.
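
The block arithmetic above can be sketched as follows. The block size of 4096 bytes and the function name `blocks_spanned` are assumptions made for illustration only; this page does not specify a block size.

```python
BLOCK_SIZE = 4096  # assumed value; the glossary does not fix a block size


def blocks_spanned(start_offset: int, length: int,
                   block_size: int = BLOCK_SIZE) -> int:
    """Count the blocks a blob occupies.

    start_offset is how far into its first block the blob begins. The blob
    may start and end mid-block, but every block in between is fully used.
    """
    if not 0 <= start_offset < block_size:
        raise ValueError("start_offset must fall inside the first block")
    if length <= 0:
        return 0
    # Ceiling division over the bytes from the start of the first block
    # through the end of the blob.
    return (start_offset + length + block_size - 1) // block_size
```

For example, a 10000-byte blob starting 4000 bytes into its first block uses 96 bytes of that block, two full middle blocks, and 1712 bytes of a fourth block.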

Inode

An inode stores an object. This includes the blob's checksum, its version, and all of its index entries. It also includes the list of blocks for its blob, the start offset, and the length.
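
One way to picture the fields listed above is the following sketch, which reassembles a blob from its blocks and verifies the checksum. The field names, the use of CRC-32 as the checksum, and the function `read_blob` are assumptions for illustration; the glossary does not specify a layout or a checksum algorithm.

```python
import zlib
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Inode:
    version: int                  # revision of the blob
    checksum: int                 # CRC-32 of the blob (assumed algorithm)
    block_ids: List[int]          # blocks holding the blob, in order
    start_offset: int             # offset of the blob within the first block
    length: int                   # blob length in bytes
    # (index name, index key) pairs for this object
    index_entries: List[Tuple[str, str]] = field(default_factory=list)


def read_blob(inode: Inode, blocks: Sequence[bytes]) -> bytes:
    """Reassemble the blob from its blocks and verify its checksum."""
    data = b"".join(blocks[block_id] for block_id in inode.block_ids)
    blob = data[inode.start_offset:inode.start_offset + inode.length]
    if zlib.crc32(blob) != inode.checksum:
        raise ValueError("blob checksum mismatch")
    return blob
```

The bytes before `start_offset` in the first block and after the blob's end in the last block are simply skipped, which is what allows those blocks to be shared with other objects.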

Shardlet

A shardlet is a fixed-size array of blocks. A shardlet will likely have a maximum size that is convenient to write to disk, as a backup server will write a shardlet at a time to disk. A shardlet on a master server will be an array of pointers to blocks, while a shardlet on a backup server will probably be an array of blocks (directly).

Index

An index, made up of index entries, maps index keys to objects.

Index Key

An index key is a short, application-controlled string used in range queries on indexes.

Index Entries

An index entry is an (index key, object) pair in a given index.

Request Types

Get Request

The get request asks the server to return the object for a given table and primary key.

Overwrite Request

The overwrite request asks the server to replace the blob for a given table and primary key at a specific version number with a new blob.

Insert Request

The insert request asks the server to create a new object for a given table with the given blob and return the newly allocated primary key.

Put Request

The put request first runs the overwrite request. If no such object exists, it then runs the insert request. This happens atomically on the server.
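
The semantics of the get, overwrite, insert, and put requests can be sketched with a toy in-memory table. This is only an illustration under stated assumptions: the class `Table`, the exception `VersionMismatch`, the initial version number of 0, the sequential key allocation, and the use of a lock to stand in for server-side atomicity are all hypothetical and not part of any real interface.

```python
import threading
from typing import Dict, Tuple


class VersionMismatch(Exception):
    """Raised when an overwrite names an out-of-date version number."""


class Table:
    """Toy table mapping primary key -> (version number, blob)."""

    def __init__(self) -> None:
        self._objects: Dict[int, Tuple[int, bytes]] = {}
        self._next_key = 0               # system-assigned keys (assumed sequential)
        self._lock = threading.RLock()   # stands in for server-side atomicity

    def get(self, key: int) -> Tuple[int, bytes]:
        with self._lock:
            return self._objects[key]    # KeyError if no such object

    def insert(self, blob: bytes) -> int:
        with self._lock:
            key = self._next_key
            self._next_key += 1
            self._objects[key] = (0, blob)   # initial version 0 is an assumption
            return key

    def overwrite(self, key: int, expected_version: int, blob: bytes) -> int:
        with self._lock:
            version, _ = self._objects[key]  # KeyError if no such object
            if version != expected_version:
                raise VersionMismatch(key)
            self._objects[key] = (version + 1, blob)
            return version + 1

    def put(self, key: int, expected_version: int, blob: bytes) -> Tuple[int, int]:
        """Overwrite if the object exists, else insert; returns (key, version)."""
        with self._lock:
            try:
                return key, self.overwrite(key, expected_version, blob)
            except KeyError:
                return self.insert(blob), 0
```

Note that in this sketch a put that falls through to insert returns a newly allocated primary key rather than the one the caller supplied, since insert is defined as allocating the key. A version mismatch during put is propagated rather than treated as a missing object.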

Index Lookup Query

The index lookup query asks a server(s?) for objects matching a certain index key value in an index.

Range Query

The range query asks a server(s?) for objects matching certain index key ranges in an index.
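
To make the difference between an index lookup and a range query concrete, here is a minimal single-server sketch that keeps index entries in a sorted list. The class `Index` and its methods are hypothetical, primary keys stand in for objects, and the inclusive range bounds are an assumption; how queries span multiple servers is left open, as it is above.

```python
from bisect import bisect_left, bisect_right, insort
from typing import List, Tuple


class Index:
    """Toy index: a sorted list of (index key, primary key) entries."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, int]] = []

    def add(self, index_key: str, primary_key: int) -> None:
        insort(self._entries, (index_key, primary_key))

    def range(self, low: str, high: str) -> List[Tuple[str, int]]:
        """Return entries with low <= index key <= high (inclusive bounds assumed)."""
        # (low,) sorts before every entry whose index key equals low.
        lo = bisect_left(self._entries, (low,))
        # (high, inf) sorts after every entry whose index key equals high.
        hi = bisect_right(self._entries, (high, float("inf")))
        return self._entries[lo:hi]

    def lookup(self, index_key: str) -> List[int]:
        """Return primary keys of objects whose index key matches exactly."""
        return [pk for _, pk in self.range(index_key, index_key)]
```

In this sketch an exact lookup is just a range query whose lower and upper bounds coincide, which is one reason a sorted structure serves both request types.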
