...
Major time sinks in issue path
- Acquiring Dispatch::Lock in TransportManager::WorkerSession::clientSend for every write RPC
- Cost: 3 x ~300ns
- InfRcTransport<Infiniband>::getTransmitBuffer(): waiting for free tx buffer for every write RPC
- Cost: 3 x ~200ns (first write RPC more expensive than 2nd and 3rd)
- Calling into Infiniband transport: postSendZeroCopy (unavoidable?)
- Cost: 3 x ~400ns (first write RPC more expensive than 2nd and 3rd)
More detailed timeline
Below is a more detailed version of the timeline presented above. Each event in the timeline is associated with a (cumulative) distribution of "completion" times. The "completion" time for a given event is the time delta between that event and its preceding event; hence, the cdf for the first event is quite boring. The major time sinks described above are labeled by their corresponding number. (Note: the x-axis in the cdf's below is in seconds.)
...
Event Invoking session->clientSend()
Event Acquired Dispatch::Lock in TransportManager::WorkerSession::
...
clientSend [1]
Event Invoking Instantiate new ClientRpc
...
Event Invoking rpc->sendOrQueue
Event Invoking
...
postSendZeroCopy [2]
Event Completed
...
postSendZeroCopy [3]
Event Completed rpc->sendOrQueue
...
Event Invoking session->clientSend()
Event Acquired Dispatch::Lock in TransportManager::WorkerSession::
...
clientSend [1]
Event Invoking Instantiate new ClientRpc
...
Event Invoking rpc->sendOrQueue
Event Invoking
...
postSendZeroCopy [2]
Event Completed
...
postSendZeroCopy [3]
Event Completed rpc->sendOrQueue
...
Event Invoking session->clientSend()
Event Acquired Dispatch::Lock in TransportManager::WorkerSession::
...
clientSend [1]
Event Invoking Instantiate new ClientRpc
...
Event Invoking rpc->sendOrQueue
Event Invoking
...
postSendZeroCopy [2]
Event Completed
...
postSendZeroCopy [3]
Event Completed rpc->sendOrQueue
...