New Small Project Ideas
- Backups should be able to use multiple disks at the same time. This should be fairly straightforward, and probably even reasonably fun: a MultiFileStorage could instantiate multiple SingleFileStorages and load balance across them. The interface for the storage class isn't too wide, so there is not much to wrap.
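The MultiFileStorage idea could be sketched roughly as follows. This is only an illustration, not RAMCloud's actual code: the Storage interface, the putSegment/getSegment methods, the handle encoding, and the in-memory stand-in for a single-file store are all assumptions made up for the example, and round-robin is just the simplest possible load-balancing policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, deliberately narrow storage interface (not RAMCloud's real one).
class Storage {
  public:
    virtual ~Storage() = default;
    // Store one segment replica and return an opaque handle to it.
    virtual uint64_t putSegment(const std::string& data) = 0;
    // Fetch a previously stored segment by handle.
    virtual std::string getSegment(uint64_t handle) const = 0;
};

// Stand-in for a store backed by one file or disk. To keep the sketch
// self-contained it holds segments in memory instead of doing real I/O.
class SingleFileStorage : public Storage {
  public:
    explicit SingleFileStorage(std::string filePath)
        : path(std::move(filePath)) {}
    uint64_t putSegment(const std::string& data) override {
        segments.push_back(data);
        return segments.size() - 1;
    }
    std::string getSegment(uint64_t handle) const override {
        return segments.at(handle);
    }
    size_t segmentCount() const { return segments.size(); }
    const std::string& getPath() const { return path; }
  private:
    std::string path;
    std::vector<std::string> segments;
};

// Spreads segments across several single-file stores in round-robin order.
class MultiFileStorage : public Storage {
  public:
    explicit MultiFileStorage(const std::vector<std::string>& paths) {
        if (paths.empty())
            throw std::invalid_argument("need at least one path");
        for (const auto& p : paths) {
            stores.push_back(std::unique_ptr<SingleFileStorage>(
                new SingleFileStorage(p)));
        }
    }
    uint64_t putSegment(const std::string& data) override {
        size_t disk = next;
        next = (next + 1) % stores.size();
        assert(disk < stores.size());
        uint64_t local = stores[disk]->putSegment(data);
        // Pack (disk index, per-disk handle) into a single handle.
        return local * stores.size() + disk;
    }
    std::string getSegment(uint64_t handle) const override {
        size_t disk = handle % stores.size();
        return stores[disk]->getSegment(handle / stores.size());
    }
    size_t segmentsOnDisk(size_t disk) const {
        return stores.at(disk)->segmentCount();
    }
  private:
    std::vector<std::unique_ptr<SingleFileStorage>> stores;
    size_t next = 0;  // Index of the store that receives the next segment.
};
```

Because MultiFileStorage implements the same interface as the stores it wraps, callers would not need to know how many disks sit underneath. To get real bandwidth gains, a backup would presumably also need to issue I/O to the different disks concurrently, which this sketch does not attempt.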
CURIS 2012 Proposals
CURIS (curis.stanford.edu) is a CS undergraduate summer research program. Below are some RAMCloud proposals we may submit for 2012.
Proposals are apparently around 2 paragraphs. We are permitted to submit multiple separate proposals, each of which can be taken up by one or more students. Different proposals will feature a generic RAMCloud paragraph followed by a more specific description of the project(s) in each proposal.
The RAMCloud project is creating a new class of high-speed storage for datacenters, where all data is kept in DRAM at all times. RAMCloud is a software system that aggregates the DRAM of thousands of servers into a single large-scale and extremely fast storage system (small objects can be read from any server in the same datacenter in 5-10 microseconds, which is 100-1000x faster than today's disk-based storage systems). The end goal is to make exciting new applications possible by pushing the boundaries of scale and latency in datacenter storage systems. This is a large, open-source project headed by Professors John Ousterhout and Mendel Rosenblum, and there are four full-time graduate students currently working on various aspects of the system. We are committed to making RAMCloud a robust, production-ready system, rather than just a research prototype. We currently build and test RAMCloud on an 80-node, 320-core Linux cluster with an aggregate of nearly 2 TB of main memory and 20 TB of flash storage, all connected by a high-performance InfiniBand network.
Proposal 1: Web Dashboard for Cluster Monitoring & Management
Proposal 2: RAMCloud Core Systems Development
RAMCloud currently consists of about 75,000 lines of C++, docs, and unit tests, but it is far from complete. In this project students will work on one or more aspects of the RAMCloud implementation, such as the following: making more pieces of the system multithreaded to increase server throughput; expanding the simple read/write operations with additional operators (such as atomic increment); improving the crash-recovery system, which recovers so quickly after server crashes that most users don't even notice the crash; or implementing additional operations to support the development of a Web-based dashboard for RAMCloud. For this project prior experience with C++ is highly desirable; however, students skilled in C and familiar with other object-oriented languages like Java should be able to learn C++ on the job. CS 140 and CS 144 provide good background for students interested in this project, though they are not essential.
For more information about RAMCloud, you can refer to: