Applications

Fundamental advantages of RAMCloud for an application

  • RAMCloud will be faster than any data access other than local RAM (even an SSD takes ~75-85us)

  • Enables very high query rates, at least an order of magnitude higher than MySQL per box

    • People have reported O(20,000) queries per second for MySQL, probably for really basic cached queries

    • More complex queries will likely be significantly slower, and if anything has to come from disk you're of course hosed

  • As a consequence of the above, app writers can feel free to write dependent sequential queries if needed and it will still be fast

    • A con is that, to get functionality similar to what you have today, you may actually have to write sequential dependent queries

  • _Potentially_ no need to name every one of your queries and deal with synchronization when interacting with a cache layer like Memcached

  • No performance dependency on locality

  • Persistence, which Memcached does not provide

  • Easy scalability

  • Less complexity than MySQL + Memcached
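The point about dependent sequential queries comes down to simple arithmetic: each query in a chain has to wait for the previous result, so per-access latencies add up. The sketch below is a rough back-of-the-envelope model, not a measurement. The SSD figure is the midpoint of the ~75-85us quoted above; the RAMCloud and disk latencies and the 100 ms page budget are assumptions made up for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope latency model for a chain of dependent queries.
# Each query must wait for the previous result, so latencies simply add up.

SSD_ACCESS_US = 80.0       # midpoint of the ~75-85us SSD figure quoted above
RAMCLOUD_ACCESS_US = 10.0  # ASSUMPTION: illustrative per-access latency, not from these notes
DISK_ACCESS_US = 10_000.0  # ASSUMPTION: rough cost of one disk seek, for contrast


def chain_latency_us(num_queries: int, per_access_us: float) -> float:
    """Total latency of num_queries issued strictly one after another."""
    if num_queries < 0:
        raise ValueError("num_queries must be non-negative")
    return num_queries * per_access_us


def max_chain_length(budget_us: float, per_access_us: float) -> int:
    """Longest dependent chain that fits inside a latency budget."""
    if per_access_us <= 0:
        raise ValueError("per_access_us must be positive")
    return int(budget_us // per_access_us)


if __name__ == "__main__":
    budget_us = 100_000.0  # ASSUMPTION: a 100 ms budget for building one page
    backends = [
        ("RAMCloud (assumed)", RAMCLOUD_ACCESS_US),
        ("SSD", SSD_ACCESS_US),
        ("disk (assumed)", DISK_ACCESS_US),
    ]
    for name, cost_us in backends:
        total = chain_latency_us(100, cost_us)
        longest = max_chain_length(budget_us, cost_us)
        print(f"{name:20s} 100 dependent queries: {total:>10.0f} us, "
              f"longest chain in budget: {longest}")
```

Under these assumed numbers, a chain of 100 dependent queries stays far below the page budget on a RAM-latency store, while on disk only a handful of dependent accesses fit, which is why applications today tend to avoid writing such chains.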

Let's consider some application categories that run in a datacenter/cloud setting:

  • Synthesizing Hardware (Cisco, Nvidia)

    • CPU bound?

    • Not much disk access?

    • Could it page memory to RAMCloud?

  • Rendering (Pixar, ILM, Disney)

    • CPU bound?

    • Dataset may be too large

    • Possible speedup for grabbing textures?

    • Would this help a client machine manipulating the scene?

  • Simulation (weather, nuclear)

    • CPU bound?

    • Any need for shared storage?

  • Transactional (stock exchange, banks, credit card processing)

    • There must be a 'to disk' component here, including sync; could be a win

    • There is an online component here, e.g. fraud detection

  • MapReduce, batch processing

    • Could be interesting, depending on the dataset size

    • Is this an existing Pain Point?

    • Could allow the use of 'online' data

  • Web related

    • Content Delivery (CDN)

    • Pages requiring many low-locality queries returning small (define) sized data (Facebook, Myspace, Google, Yahoo, Ebay, etc)

    • Pages requiring many high-locality (or small dataset) queries returning small (define) sized data (CNN, Slashdot)

    • Pages consisting primarily of static content (Microsoft, IMDB, etc)

  • Raw Storage

Of the top 20 websites, the following could likely benefit substantially from RAMCloud:

  • Google

  • Yahoo

  • YouTube (not the video part)

  • Facebook

  • Windows Live

  • MSN (how much is static and just cached?)

  • Wikipedia (maybe?)

  • Blogger.com

  • Myspace

  • Baidu (Chinese search engine)

  • Yahoo Japan

  • Google India

  • Google Germany

  • Google France

  • Google UK

  • WordPress.com

Consensus built around the following applications:

  • Pages requiring many low-locality queries returning small (define) sized data (Facebook, Myspace, Google, Yahoo, Ebay, etc)

  • Something in the highly transactional space (Visa, PayPal, etc.), possibly involving live data, à la fraud detection

  • MapReduce, where you could run on live data and/or not need to worry about shuffling data between machines to speed it up

  • Raw storage, as suggested by John