...

RAMCloud is also interesting from a research standpoint. Its two most important attributes are latency and scale. The first goal is to provide the lowest possible end-to-end latency for applications accessing the system from within the same datacenter. We currently achieve latencies of around 5μs for reads and 15μs for writes, but hope to improve these in the future. In addition, the system must scale, since no single machine can hold enough DRAM to meet the needs of large-scale applications. We have designed RAMCloud to support at least 10,000 storage servers; the system must automatically manage all the information across the servers, so that clients do not need to deal with any distributed systems issues. The combination of latency and scale has created a large number of interesting research issues, such as how to ensure data durability without sacrificing the latency of reads and writes, how to take advantage of the scale of the system to recover very quickly after crashes, how to manage storage in DRAM, and how to provide higher-level features such as secondary indexes and multiple-object transactions without sacrificing the latency or scalability of the system. Our solutions to these problems are described in a series of technical papers.

The RAMCloud project was based in the Department of Computer Science at Stanford University. The project is no longer active and the students working on RAMCloud have graduated, so we cannot provide support for anyone wishing to use RAMCloud.

Learning About RAMCloud

General information about RAMCloud, such as talks and papers. Much of the information here is related to the research aspects of the project, as opposed to information on how to use RAMCloud.

Introductory talk on RAMCloud by John Ousterhout, given at LinkedIn on October 12, 2011.
The RAMCloud Storage System: a comprehensive paper describing RAMCloud, including the log-structured storage mechanism, RAMCloud's thread architecture and approach to low latency, and its crash recovery mechanisms. Published in ACM TOCS in September 2015.
The Case for RAMCloud: an early position paper that discusses the motivation for RAMCloud, the new kinds of applications it may enable, and some of the research issues that will have to be addressed to create a working system. Appeared in CACM in July 2011.
An earlier and slightly longer version of the position paper, which appeared in Operating Systems Review in December 2009.
Fast Recovery in RAMCloud: describes RAMCloud's mechanism for recovering crashed servers in 1-2 seconds. Appeared in SOSP in October, 2011.
Log-Structured Memory for DRAM-based Storage: describes how RAMCloud manages the storage of objects both in DRAM and on disk. Appeared in FAST in February, 2014; won Best Paper Award.
Toward Common Patterns for Distributed, Concurrent, Fault-Tolerant Code: HotOS 2013 workshop paper describing a rules-based approach for building "DCFT" systems.
Articles about RAMCloud (Web and print media, written by people outside the RAMCloud group)
RAMCloud Papers (complete listing of all papers written by the RAMCloud group)
RAMCloud Presentations (Slides from talks about RAMCloud)
Glossary of RAMCloud Terms

How to Deploy and Use RAMCloud

...

Measurements of RAMCloud performance, as well as comparisons between RAMCloud and other systems.

clusterperf benchmarks (benchmarks run on a cluster to measure basic things such as read and write latency and throughput)
How To Run Clusterperf
Perf benchmarks (microbenchmarks measuring various low-level operations on a single machine, such as atomic increment)
Performance Improvement Log
Recovery Performance Benchmark
Latency Patterns in Infiniband (talk by Alex Mordkovich, May 2012)
RPC Latency Profile (the lifetime of a write operation, measured January 2012)
SSD Experiments (July 2011)
Redis vs. RAMCloud
Older Performance Measurements

...

General Information for Developers (how to get started as a RAMCloud developer)
Build System Structure
RAMCloud Tech Talks (Videos of RAMCloud developers describing the internals of various system components)
Want to Contribute to RAMCloud? (notes for people who would like to contribute code to RAMCloud)
Running Recoveries with recovery.py
Coding Conventions
Style Guide
Documentation Guidelines
Writing Unit Tests
Amendments to Current Documentation and Testing Guidelines
Software Design Philosophy – John Ousterhout's pet peeves
How To Measure Performance: John's pet peeves (and ideas for a possible paper)
RAMCloud C Style for EMACS
Vim Settings
Copyright Notice
Mfence – x86 instructions for limiting instruction reordering
Inside Concurrency Primitives
Wireshark Plugin
DallyFastNetwork.pdf
NetBeans IDE tips
Measuring RAMCloud Performance
Code review tool
Phabricator code review tool
Git repo: see General Information for Developers
IRC channel: #ramcloud on freenode.
  - This channel is used to coordinate usage of the RAMCloud cluster (see also rcres). Any time you are using the cluster you should be listening on this channel; if you don't respond to comments on the channel, your jobs may be killed.
  - Transcripts of this channel may be found here
RAMCloud Cluster Resource Manager (rcres): rcres is a shell command available on the "rcmaster" machine of the RAMCloud cluster. Any time you are using the cluster you should use rcres to lease the machines you are using.
Dumpstr tool for viewing reports (mostly performance data)
Documentation, generated nightly from the source code

...

Cluster Intro – information about our cluster for newcomers
New Contributor Checklist (how to set up access for new team members)
/wiki/spaces/RAM/pages/6848593 – for sysadmins
Cluster Custodian - rotating responsibility for managing the cluster and providing technical support
Cluster Issues - central location for keeping track of problems in the cluster
Cluster Inventory - includes notes about cluster setup and spare components
Intel 530 Performance - recent performance issues with Intel 530 SSDs
SSD Latency Experiments - Performance measurements of our cluster's SSDs (2016)
Cluster Tasks - (not so) recent issues with cluster machines
Machine Evaluations
Compiling RAMCloud on CentOS
Tips from Charlie & Co
Reimaging a Cluster Machine
Installing New Software on the Cluster
Controlling Machines Remotely via IPMI
Updating BIOS automatically with PXE and FreeDOS
Infiniband Tools and Debugging
Updating Mellanox NIC Firmware (to eliminate limit on timeouts)
Dead Machines
New Infiniband Fabric Notes
Mellanox HW and Infiniband Notes
/wiki/spaces/RAM/pages/6848654 (for BIOS and boot-time configuration)

...

Distributed Systems Reading Group
Team Members
Group Photos
Lunch Ideas
Current Applications (applications that are using RAMCloud or considering it)
SEDCL/PlatformLab Retreat - Industrial Feedback
Server Prices: sample server configurations and prices
Memory Prices
Interesting Statistics
Old Miscellaneous Topics
New Cluster Wishlist
