Project History
Here are some interesting dates in the history of the RAMCloud project (reverse chronological order):
- January 9, 2014: RAMCloud version 1.0 officially tagged.
- November 2013: first recovery from a coordinator crash using the new ExternalStorage mechanism with ZooKeeper.
- Spring 2013: First recovery from a coordinator crash (using LogCabin with the logging approach).
- September 17, 2012: First successful recovery from the loss and restart of all master+backup processes (without a crash of the coordinator).
- September 5, 2012: First successful coordinator recovery (tested server enlistments interleaved with coordinator crashes).
- March 2012: RAMCloud converts from fixed-size 64-bit keys to variable-length-byte-array keys.
- February 18, 2012: First end-to-end recovery of a failed backup server.
- Summer 2011: major revisions of the log cleaner; it's now almost production-ready.
- Summer 2011: servers now support multi-threading, and the RPC system is capable of detecting and reporting timeouts.
- Spring 2011: Ankita Kejriwal takes a first-year rotation with the project; she joins the team in Fall 2011.
- March 18, 2011: SOSP paper submitted.
- March 2011: Nandu's measurements show that RAMCloud is achieving 5μs RPCs for 100-byte reads; a single server can handle more than 1M RPCs/second. These measurements were taken with the 40-node cluster using Mellanox InfiniBand networking.
- February 24, 2011: recovery works on cluster (but many performance issues still to resolve).
- February 23, 2011: 40-node cluster arrives (only 3.5 weeks before SOSP deadline).
- January 2011: Nandu Jayakumar joins project.
- December 2010: 40-node cluster ordered.
- November 8, 2010: first successful recovery (TcpTransport, 2 backups, 2 masters, 1 segment, 1 partition).
- Fall 2010: focus shifts to fast recovery: can we implement and evaluate this in time for the SOSP paper submission deadline of March 18, 2011?
- July 2010: white paper accepted for publication in CACM.
- Summer 2010: implementation gets underway in earnest. The basic RPC system comes to life with several different transports, including first implementation of FastTransport.
- Spring 2010: initial work on the RPC transport system (e.g. first implementation of Buffer class). Ryan and Steve are still busy studying for quals. Aravind graduates.
- April 1, 2010: all-day RAMCloud design review, which includes external reviewers from Berkeley (Michael Armbrust, Mike Franklin, and Tim Kraska), Facebook (Keith Adams and Bob English), Google (Jeff Dean, Sanjay Ghemawat, and Luiz Barroso), Hewlett-Packard (Jeff Mogul and Partha Ranganathan), Microsoft (Marcos Aguilera and Jim Larus), NEC (Yoshiki Seo and Masamichi Takagi), NetApp (Tim Emami and Shankar Pasupathy), and SAP (Shel Finkelstein).
- Winter 2010: Diego creates a file system on top of the trivial RAMCloud server and also shows that transactions can be implemented at application level. More work on the design (preparation for April design review), but not much progress on the main system implementation: Ryan and Steve are mostly occupied finishing up their old project (Cinder) and studying for quals.
- Late Fall 2009: a trivial RAMCloud server responds to basic read and write requests.
- Fall 2009: the RAMCloud implementation team is formed and the group begins to flesh out the design in detail. The initial students are Aravind Narayanan, Steve Rumble, and Ryan Stutsman; participating faculty are Christos Kozyrakis, David Mazieres, John Ousterhout, and Mendel Rosenblum. New PhD student Diego Ongaro starts attending group meetings in the fall, and officially joins the project at the beginning of 2010.
- Spring 2009: a discussion group meets weekly to explore design issues for RAMCloud. The results are published as the white paper "The Case for RAMCloud", with all of the discussion group members as authors.
- March 2009: during the systems dinner for prospective PhD students, John strikes up a conversation about RAMCloud with Ryan Stutsman (then a second-year student); Steve Rumble (first-year student) listens intently from across the room, then moves over to join the discussion. Diego Ongaro (a prospective PhD student) is in the room but doesn't participate in the conversation.
- Fall 2008, Winter 2009: Various informal discussions about the RAMCloud concept between John Ousterhout, Mendel Rosenblum, Guru Parulkar, Balaji Prabhakar, David Erickson, and others. John's initial goal is to support 1μs RPCs, but he backs off to 5-10μs after Balaji explains the state of current datacenter switching fabrics.
- Fall 2008: John Ousterhout suggests the basic idea for RAMCloud to Mendel Rosenblum over lunch.