Recovery Performance

Recovery of 640 MB of log data by RAMCloud 35d8b. All cases use an in-memory replica unless the legend indicates otherwise.

From left to right, the dots correspond to 8192, 4096, 2048, 1024, 512, 256, and 128 B objects.

RPC Count: With 1 backup, the number of RPCs during recovery is almost exactly 5 * segment count, ignoring RPCs for client tablet map refreshes (80 segments here, so 400 RPCs).
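The RPC count arithmetic can be reproduced with a short sketch. The 640 MB log size, 8 MB segment size, and factor of 5 come from this page; the function name is hypothetical, and the factor of 5 is the measured ratio reported here, not something derived from the recovery protocol.

```python
# Figures taken from this page: 640 MB of log data, 8 MB segments, 1 backup.
LOG_BYTES = 640 * 2**20
SEGMENT_BYTES = 8 * 2**20
RPCS_PER_SEGMENT = 5  # empirical ratio reported above, not a protocol constant


def estimated_recovery_rpcs(log_bytes, segment_bytes,
                            rpcs_per_segment=RPCS_PER_SEGMENT):
    """Return (segment count, estimated RPC count) for one backup.

    Hypothetical helper; ignores RPCs for client tablet map refreshes.
    """
    segments = -(-log_bytes // segment_bytes)  # ceiling division
    return segments, segments * rpcs_per_segment


segments, rpcs = estimated_recovery_rpcs(LOG_BYTES, SEGMENT_BYTES)
print(segments, rpcs)  # prints: 80 400
```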

Results after commit e427b, which features prefetching during iteration in recoverSegment and crc32 checksumming. The new checksumming is fast enough that it is enabled for all experiments above except the one case that is noted.

MasterServer::recoverSegment Performance

rabinpoly turned out to be far too slow to use as our checksum. The attached graph shows the overhead of various checksum routines on a 640 MB (80 x 8 MB segments) invocation of recoverSegment, as simulated in the RecoverSegmentBenchmark application.

Takeaways:

  • For small objects, use either crc32c computed as we go, or perhaps vmac if we checksum only on close (i.e., checksum the whole segment in one shot rather than incrementally).
  • For larger objects, it doesn't matter too much.
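
The difference between checksumming "as we go" and checksumming "on close" can be sketched as follows. This is a minimal illustration, not RAMCloud code: it uses Python's zlib.crc32 (plain CRC-32, not crc32c) as a stand-in because the standard library has no crc32c, and the function names and the 1024 x 128 B object layout are assumptions made for the example.

```python
import zlib


def checksum_incremental(objects):
    """Update a running CRC once per object, as each one is appended."""
    crc = 0
    for obj in objects:
        crc = zlib.crc32(obj, crc)  # second argument is the running value
    return crc


def checksum_on_close(objects):
    """Checksum the whole segment contents in one shot."""
    return zlib.crc32(b"".join(objects))


# Hypothetical segment contents: 1024 small (128 B) objects.
objects = [bytes([i % 256]) * 128 for i in range(1024)]

# Both strategies yield the same value; they differ only in how many
# checksum calls are made (one per object versus one per segment).
assert checksum_incremental(objects) == checksum_on_close(objects)
```

With small objects the incremental approach makes many short calls, so per-call overhead matters more, which is consistent with the checksum choice mattering most in the small-object cases.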

See the pdf to zoom in: http://fiz.stanford.edu:8081/download/attachments/7798899/recoverSegment_checksums.pdf