RAMCloud Benchmarks
A single backup operation (ClusterPerf with 100-byte writes, 1 master, 3 backups)
On the Master
Averaged over 1912 sample timelines:
Timeline on a Master
 0 us   --- Begin backup
   |
   |
 2.0 us --- First write RPC sent out
   |
   |
 3.3 us --- Second write RPC sent out
   |
   |
 4.5 us --- Third write RPC sent out
   |
   |
   |        [~ 4 us "dead time"]
   |
   |
 8.6 us --- First write RPC completes (duration: 6.6 us)
   |
 9.8 us --- Second write RPC completes (duration: 6.5 us)
   |
10.8 us --- Third write RPC completes (duration: 6.3 us)
10.9 us --- End backup
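The shape of this timeline follows from how the master issues its replication RPCs: all three writes are started asynchronously before any reply is reaped, which is what produces the "dead time" window. A minimal sketch of that pattern; BackupWriteRpc, start(), and wait() are illustrative stand-ins, not RAMCloud's actual interfaces:

    #include <array>

    // Illustrative stand-in for an asynchronous backup write RPC.
    // Real transport work is elided; only the call pattern matters here.
    struct BackupWriteRpc {
        void start() { /* post the request; returns immediately */ }
        void wait()  { /* spin until the reply arrives */ }
    };

    // Issue pattern from the timeline: start all three write RPCs
    // back-to-back (2.0, 3.3, 4.5 us), then reap them in order
    // (8.6, 9.8, 10.8 us). Between the two loops the master has
    // nothing to do: the ~4 us "dead time".
    void replicateSegment(std::array<BackupWriteRpc, 3>& rpcs)
    {
        for (auto& rpc : rpcs)
            rpc.start();
        for (auto& rpc : rpcs)
            rpc.wait();
    }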
Major time sinks in the issue path
- Acquiring Dispatch::Lock in TransportManager::WorkerSession::clientSend for every write RPC
  - Cost: 3 x ~250 ns
- InfRcTransport<Infiniband>::getTransmitBuffer(): waiting for a free tx buffer for every write RPC
  - Cost: 3 x ~200 ns (the first write RPC is more expensive than the 2nd and 3rd)
- Calling into the Infiniband transport: postSendZeroCopy (unavoidable?)
  - Cost: 3 x ~400 ns (the first write RPC is more expensive than the 2nd and 3rd)
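Costs at this scale are easiest to confirm with a TSC-based micro-benchmark over many iterations. A minimal sketch for the lock-acquisition item above; std::mutex stands in for Dispatch::Lock, the 2.4 GHz TSC frequency is an assumption, and an uncontended mutex will read lower than a contended dispatch lock:

    #include <cstdint>
    #include <cstdio>
    #include <mutex>
    #include <x86intrin.h>  // __rdtsc()

    int main()
    {
        std::mutex lock;    // stand-in for Dispatch::Lock (assumption)
        const int samples = 1000000;

        uint64_t start = __rdtsc();
        for (int i = 0; i < samples; i++) {
            lock.lock();
            lock.unlock();
        }
        uint64_t cycles = __rdtsc() - start;

        // Convert cycles to ns using the machine's TSC frequency
        // (assumed 2.4 GHz here; measure yours).
        double ns = static_cast<double>(cycles) / samples / 2.4;
        printf("lock+unlock: %.1f ns per pair\n", ns);
        return 0;
    }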
Benchmark IB Send vs. RDMA
A simple program to benchmark a 56-byte write.
Averaged over 100 samples.
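A minimal sketch of what the sender's hot path might look like in both modes, assuming an already-connected RC queue pair and a registered 56-byte buffer; resource setup, error handling, and the exchange of the remote address/rkey are elided:

    #include <infiniband/verbs.h>
    #include <cstdint>
    #include <cstring>

    // Posts one 56-byte message and busy-waits for the send-side
    // completion. useRdma selects IBV_WR_RDMA_WRITE over IBV_WR_SEND.
    static void postAndWait(ibv_qp* qp, ibv_cq* sendCq, ibv_mr* mr,
                            char* buf, uint64_t raddr, uint32_t rkey,
                            bool useRdma)
    {
        ibv_sge sge;
        memset(&sge, 0, sizeof(sge));
        sge.addr   = reinterpret_cast<uint64_t>(buf);
        sge.length = 56;
        sge.lkey   = mr->lkey;

        ibv_send_wr wr;
        memset(&wr, 0, sizeof(wr));
        wr.sg_list    = &sge;
        wr.num_sge    = 1;
        wr.opcode     = useRdma ? IBV_WR_RDMA_WRITE : IBV_WR_SEND;
        wr.send_flags = IBV_SEND_SIGNALED;       // completion on sender
        if (useRdma) {
            wr.wr.rdma.remote_addr = raddr;      // learned out of band
            wr.wr.rdma.rkey        = rkey;
        }

        ibv_send_wr* bad = nullptr;
        if (ibv_post_send(qp, &wr, &bad) != 0)
            return;                              // error handling elided

        ibv_wc wc;
        while (ibv_poll_cq(sendCq, 1, &wc) == 0)
            ;                                    // spin until completion
    }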
One-way (with completion on sender)
Using IB send: 2.753 us
Using RDMA: 2.505 us
RTT (RPC-style)
Using IB send: 4.969 us (this explains the ~6 us write RPC latency seen in RAMCloud: ~5 us RTT + ~1 us of software overhead)
Using RDMA: 4.866 us
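For the RPC-style numbers, the client measures a full ping-pong: post a receive for the reply, send the request, and spin until the reply's receive completion appears. A sketch of the client half under the same assumptions as above; the server is assumed to echo each message back:

    #include <infiniband/verbs.h>
    #include <cstdint>
    #include <cstring>

    // One round trip: the RTT sample ends when the reply's receive
    // completion shows up. Separate buffers for request and reply.
    static void pingOnce(ibv_qp* qp, ibv_cq* sendCq, ibv_cq* recvCq,
                         ibv_mr* mr, char* reqBuf, char* repBuf)
    {
        // Post the receive first so the reply cannot race past us.
        ibv_sge rsge;
        memset(&rsge, 0, sizeof(rsge));
        rsge.addr   = reinterpret_cast<uint64_t>(repBuf);
        rsge.length = 56;
        rsge.lkey   = mr->lkey;

        ibv_recv_wr rwr;
        memset(&rwr, 0, sizeof(rwr));
        rwr.sg_list = &rsge;
        rwr.num_sge = 1;
        ibv_recv_wr* badRecv = nullptr;
        ibv_post_recv(qp, &rwr, &badRecv);

        // Send the 56-byte request.
        ibv_sge ssge;
        memset(&ssge, 0, sizeof(ssge));
        ssge.addr   = reinterpret_cast<uint64_t>(reqBuf);
        ssge.length = 56;
        ssge.lkey   = mr->lkey;

        ibv_send_wr swr;
        memset(&swr, 0, sizeof(swr));
        swr.sg_list    = &ssge;
        swr.num_sge    = 1;
        swr.opcode     = IBV_WR_SEND;
        swr.send_flags = IBV_SEND_SIGNALED;
        ibv_send_wr* badSend = nullptr;
        ibv_post_send(qp, &swr, &badSend);

        // RTT endpoint: the reply has arrived.
        ibv_wc wc;
        while (ibv_poll_cq(recvCq, 1, &wc) == 0)
            ;
        // Also reap the send completion so the send queue drains.
        while (ibv_poll_cq(sendCq, 1, &wc) == 0)
            ;
    }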
A one-way RDMA (2.505 us) thus easily beats the round-trip, IB-send-based RPC that RAMCloud currently uses (4.969 us).