This page is intended for recording steps we have taken over time to improve RAMCloud performance, along with measurements of the resulting performance gains. Add new entries at the beginning of the page, so that the entries are in reverse chronological order.

Prefetching on Incoming Packet and Log Entry (June 2014, Henry Qin)

After experiments to isolate last-level cache misses, and after adding the randomized read benchmark `readDistRandom`, we added prefetching of the incoming packet whenever it is shorter than 1000 bytes, and prefetching of the Log entry whenever it is shorter than 300 bytes.

This reduces the median read time on `clusterperf readDist` by 30 ns, while reducing the median read time on `clusterperf readDistRandom` by 190 ns.

                Old      New
readDist        4.67us   4.64us
readDistRandom  4.94us   4.75us
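
The size-gated prefetching described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual RAMCloud code: the helper name `prefetchIfSmall`, the constant names, and the 64-byte cache-line size are assumptions, and `__builtin_prefetch` is a GCC/Clang builtin rather than standard C++. Only the two thresholds (1000 and 300 bytes) come from the text.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is typical on x86-64 but not universal.
constexpr std::size_t CACHE_LINE_SIZE = 64;

// Thresholds from the text; the constant names are hypothetical.
constexpr std::size_t MAX_PACKET_PREFETCH_BYTES = 1000;
constexpr std::size_t MAX_LOG_ENTRY_PREFETCH_BYTES = 300;

// Hypothetical helper: issues a read-prefetch hint for every cache line
// overlapping [data, data + length), but only if length < limit.
// Returns the number of cache lines hinted (0 if nothing was prefetched).
inline std::size_t
prefetchIfSmall(const void* data, std::size_t length, std::size_t limit)
{
    if (data == nullptr || length == 0 || length >= limit)
        return 0;

    const auto start = reinterpret_cast<std::uintptr_t>(data);
    const auto mask = static_cast<std::uintptr_t>(CACHE_LINE_SIZE - 1);
    const std::uintptr_t end = start + length;

    std::size_t lines = 0;
    // Begin at the cache-line boundary at or below the first byte.
    for (std::uintptr_t p = start & ~mask; p < end; p += CACHE_LINE_SIZE) {
        // rw = 0 (read), locality = 3 (keep in all cache levels).
        __builtin_prefetch(reinterpret_cast<const void*>(p), 0, 3);
        ++lines;
    }
    return lines;
}
```

A caller would invoke it as `prefetchIfSmall(packet, packetLength, MAX_PACKET_PREFETCH_BYTES)` when a packet arrives, and with `MAX_LOG_ENTRY_PREFETCH_BYTES` once the Log entry's address is known. The return value exists only to make the sketch easy to check. Because prefetch instructions are hints, the benefit should be confirmed with a benchmark such as `readDistRandom`.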

ObjectFinder and TransportManager (June 2014, John Ousterhout)

...

The median read time in `clusterperf readDist` dropped by about 40 ns as a result of these changes (from 4.95µs to 4.91µs).

...