RAMCloud Benchmarks
These measurements break down a single backup operation (ClusterPerf with 100-byte writes, 1 master, 3 backups). A timeline of events was logged for each operation. Not all timelines had the same "shape", because not all backup operations go through the same sequence of events. The most common shape was therefore selected, and each timeline below is the average over all timelines of that shape. This procedure was applied separately on the master and on the backup.
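The selection-and-averaging procedure can be sketched roughly as follows. This is an illustration only, not the script used for these measurements: it assumes a timeline's "shape" is its ordered sequence of event names, and the function name and sample data are hypothetical.

```python
from collections import Counter


def average_most_common_shape(timelines):
    """Each timeline is a list of (event_name, timestamp_us) pairs.

    Returns (shape, averaged_timestamps, count) for the most frequent shape.
    """
    # Assumption: the "shape" of a timeline is its ordered tuple of event names.
    shapes = Counter(tuple(name for name, _ in t) for t in timelines)
    shape, count = shapes.most_common(1)[0]

    # Keep only the timelines with that shape, then average each event's timestamp.
    matching = [t for t in timelines if tuple(name for name, _ in t) == shape]
    averaged = [sum(t[i][1] for t in matching) / count for i in range(len(shape))]
    return shape, averaged, count


# Hypothetical sample data, not taken from the measurements on this page.
timelines = [
    [("begin", 0.0), ("send", 2.0), ("end", 10.0)],
    [("begin", 0.0), ("send", 3.0), ("end", 12.0)],
    [("begin", 0.0), ("retry", 1.0), ("send", 4.0), ("end", 15.0)],
]
shape, averaged, count = average_most_common_shape(timelines)
print(shape, averaged, count)
```

The third sample timeline has a different shape (it contains an extra event), so it is excluded from the average rather than distorting it.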
On the Master
Averaged over 1912 same-shape timelines.
0 us --- Begin backup (BackupManager::sync())
|
|
2.0 us --- First write RPC sent out
|
|
3.3 us --- Second write RPC sent out
|
|
4.5 us --- Third write RPC sent out
|
|
|   [~ 4 us "dead time"]
|
|
8.6 us --- First write RPC completes (duration: 6.6 us)
|
9.8 us --- Second write RPC completes (duration: 6.5 us)
|
10.8 us --- Third write RPC completes (duration: 6.3 us)
10.9 us --- End backup
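As a cross-check, the per-RPC durations and the "dead time" follow directly from the averaged timestamps in the master timeline. The short sketch below recomputes them; the dictionary names are illustrative only.

```python
# Averaged timestamps (us) copied from the master timeline.
sent = {"first": 2.0, "second": 3.3, "third": 4.5}
completed = {"first": 8.6, "second": 9.8, "third": 10.8}

# Per-RPC latency: completion time minus issue time.
durations = {rpc: round(completed[rpc] - sent[rpc], 1) for rpc in sent}

# "Dead time": gap between issuing the last RPC and the first completion.
dead_time = round(min(completed.values()) - max(sent.values()), 1)

print(durations)
print(dead_time)
```

The three RPCs overlap almost entirely, which is why the whole backup finishes in 10.9 us even though each RPC takes more than 6 us.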
Major time sinks in issue path
- Acquiring Dispatch::Lock in TransportManager::WorkerSession::clientSend for every write RPC
  - Cost: 3 x ~250 ns
- InfRcTransport<Infiniband>::getTransmitBuffer(): waiting for free tx buffer for every write RPC
  - Cost: 3 x ~200 ns (first write RPC more expensive than 2nd and 3rd)
- Calling into Infiniband transport: postSendZeroCopy (unavoidable?)
  - Cost: 3 x ~400 ns (first write RPC more expensive than 2nd and 3rd)
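To put the three items in proportion, the sketch below adds up the approximate costs listed above and compares the total with the issue path. It assumes the issue path is the 4.5 us between the start of the backup and the third write RPC being sent out; that interpretation is an assumption, and the per-RPC figures are the rough averages from the list.

```python
# Approximate per-RPC costs (ns) from the list of time sinks.
costs_ns = {
    "Dispatch::Lock acquisition": 250,
    "getTransmitBuffer() wait": 200,
    "postSendZeroCopy": 400,
}
NUM_RPCS = 3          # one write RPC per backup
ISSUE_PATH_NS = 4500  # assumption: issue path ends when the third RPC is sent (4.5 us)

total_ns = NUM_RPCS * sum(costs_ns.values())
share = total_ns / ISSUE_PATH_NS
print(f"{total_ns} ns, about {share:.0%} of the issue path")
```

Under these assumptions the three sinks account for roughly half of the issue path; the rest is spent elsewhere.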
On the Backup
Averaged over 9584 same-shape timelines.
0 us ---- InfRcTransport Poller picks up incoming RPC [dispatch thread]
|
255 ns -- Invoke service.handleRpc() [worker thread]
|
833 ns -- Completed service.handleRpc() [worker thread]
|
991 ns -- Begin sending reply [dispatch thread]
|
|
|
1.8 us -- Completed sending reply [dispatch thread]
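The time spent in each stage on the backup is the difference between consecutive timestamps. A small sketch, using the averaged values from the backup timeline (the final timestamp is only given as 1.8 us, so the last interval is approximate):

```python
# Averaged timestamps (ns) from the backup timeline.
events = [
    ("poller picks up incoming RPC", 0),
    ("invoke service.handleRpc()", 255),
    ("completed service.handleRpc()", 833),
    ("begin sending reply", 991),
    ("completed sending reply", 1800),  # reported as 1.8 us, so rounded
]

# Time spent between each pair of consecutive events.
intervals = [
    (f"{a} -> {b}", tb - ta)
    for (a, ta), (b, tb) in zip(events, events[1:])
]
for label, ns in intervals:
    print(f"{ns:5d} ns  {label}")
```

Sending the reply is the largest single interval, ahead of the time spent inside service.handleRpc() itself.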
Benchmark IB Send vs. RDMA
Simple program to benchmark 56-byte write.
Averaged over 100 samples.
One-way (with completion on sender)
Using IB send: 2.753 us
Using RDMA: 2.50495 us
RTT (RPC-style)
Using IB send: 4.969 us (explains write RPC latency seen in RAMCloud: 5 + 1 = 6 us)
Using RDMA: 4.866 us
A one-way RDMA write (about 2.5 us) easily beats the round-trip IB send (about 5.0 us) on which RAMCloud's RPC is currently based.
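The size of the gap can be worked out from the figures above. The sketch below only repeats that arithmetic; the variable names are illustrative.

```python
# Measured averages (us) from the benchmark results.
one_way_us = {"IB send": 2.753, "RDMA": 2.50495}
rtt_us = {"IB send": 4.969, "RDMA": 4.866}

# One-way RDMA compared with the round-trip IB send.
speedup = rtt_us["IB send"] / one_way_us["RDMA"]
saved_us = rtt_us["IB send"] - one_way_us["RDMA"]
print(f"{speedup:.2f}x faster, {saved_us:.3f} us saved")
```

Most of the gain comes from dropping the return trip rather than from RDMA itself: for the same message pattern, RDMA is only slightly faster than IB send (2.505 us vs. 2.753 us one-way, 4.866 us vs. 4.969 us round-trip).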