Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 240 Next »

What is RAMCloud?

The RAMCloud project is creating a new class of storage, based entirely in DRAM, that is 2-3 orders of magnitude faster than existing storage systems. If successful, it will enable new applications that manipulate large-scale datasets much more intensively than has ever been possible before. In addition, we think RAMCloud, or something like it, will become the primary storage system for cloud computing environments such as Amazon's AWS and Microsoft's Azure.

The role of DRAM in storage systems has been increasing rapidly in recent years, driven by the needs of large-scale Web applications. These applications manipulate very large datasets with an intensity that cannot be satisfied by disks alone. As a result, applications are keeping more and more of their data in DRAM. For example, large-scale caching systems such as memcached are being widely used (in 2009 Facebook used a total of 150 TB of DRAM in memcached and other caches for a database containing 200 TB of disk storage), and the major Web search engines now keep their search indexes entirely in DRAM.

Although DRAM's role is increasing, it still tends to be used in limited or specialized ways. In most cases DRAM is just a cache for some other storage system such as a database; in other cases (such as search indexes) DRAM is managed in an application-specific fashion. It is difficult for developers to use DRAM effectively in their applications; for example, the application must manage consistency between caches and the backing storage. In addition, cache misses and backing store overheads make it difficult to capture DRAM's full performance potential.

Our goal for RAMCloud is to create a general-purpose storage system that makes it easy for developers to harness the full performance potential of large-scale DRAM storage. It keeps all data in DRAM all the time, so there are no cache misses. RAMCloud storage is durable and available, so developers need not manage a separate backing store. RAMCloud is designed to scale to thousands of servers and hundreds of terabytes of data while providing uniform low-latency access to all machines within a large datacenter.

As of Fall 2011, we had initial implementations of many of the components of RAMCloud and the system runs well enough to use it for simple tests. On our 60-node test cluster we are able to perform remote reads of 100-byte objects in about 5 microseconds, and an individual server can process more than 800,000 small read requests per second. The basic crash recovery mechanism is running, and RAMCloud can recover 35 GB of memory from a failed server in about 1.6 seconds.

The RAMCloud project is still young, so there are many interesting research issues still to explore, such as the following:

  • Data model. RAMCloud currently supports a very simple data model (key-value store); we would like to see if we can provide higher-level features such as secondary indexes and multi-object transactions while without sacrificing the scalability or performance of the system.
  • Consistency: we believe that RAMCloud can provide strong consistency (serializability) without sacrificing performance, but there are several interesting problems to solve in order to achieve that.
  • Cluster management: what are the right mechanisms and policies for reorganizing RAMCloud data in response to changes in the amount of data and the access patterns?
  • Network protocols: we don't think that TCP is the right protocol to provide highest performance within a datacenter, so there is an interesting research project to investigate what is the ideal protocol.
  • Multi-tenancy: how to support multiple independent, perhaps hostile, applications sharing the same RAMCloud storage system within a large datacenter? This introduces issues related to access control and also potentially issues of performance isolation.
  • Multiple datacenters: our current design for RAMCloud focuses on a single datacenter, but some applications will require redundancy across datacenters in order to protect against datacenter failures. An interesting question is whether we can provide that level of redundancy without dramatically impacting the performance of the system.

Introduction to RAMCloud

Resources

Development

RAMCloud Cluster

Informational

Current work

Old Topics

Miscellaneous Topics

Personal Wikis

  • No labels