Perf benchmarks

Perf is a program that contains several micro-benchmarks related to RAMCloud. It includes all of the server code but runs as a stand-alone program, not in a cluster configuration. This page contains the output of that program at various points in time, so it provides a history of RAMCloud micro-benchmark performance.

Note: for consistency, please take all measurements on rc20 and be sure to compile with DEBUG=no.
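
Because this page is a history, a common task is comparing two snapshots. The following is a minimal Python sketch, not part of RAMCloud or Perf. It assumes Perf prints one benchmark per line as a name, a time with an ns, us, or ms suffix, and a description. The helper names parse_perf_output and compare are hypothetical. Some benchmark names (for example atomicIntInc and murmur3) appear twice in one snapshot, so the sketch keys each entry on the name together with its description.

```python
import re

# Unit suffixes that appear in the snapshots on this page, in nanoseconds.
UNIT_TO_NS = {"ns": 1.0, "us": 1e3, "ms": 1e6}

# Assumed line layout: <name> <number><unit> <description>
LINE_RE = re.compile(r"^(\S+)\s+(\d+(?:\.\d+)?)(ns|us|ms)\s+(.*)$")


def parse_perf_output(text):
    """Return a list of (name, nanoseconds, description) tuples."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip dates, headings, and blank lines
        name, value, unit, description = match.groups()
        results.append((name, float(value) * UNIT_TO_NS[unit], description))
    return results


def compare(old, new):
    """Map (name, description) to the new/old time ratio for shared entries."""
    old_times = {(name, desc): ns for name, ns, desc in old}
    return {
        (name, desc): ns / old_times[(name, desc)]
        for name, ns, desc in new
        if (name, desc) in old_times
    }


# Two entries copied from the June 21, 2016 and Nov 16, 2016 snapshots.
june_2016 = parse_perf_output("""
spinLock                   18.90ns         Acquire/release SpinLock
spawnThread                 8.90us         Start and stop a thread
""")
nov_2016 = parse_perf_output("""
spinLock                    9.60ns         Acquire/release SpinLock
spawnThread                 8.89us         Start and stop a thread
""")

for (name, _), ratio in compare(june_2016, nov_2016).items():
    print(f"{name}: {ratio:.2f}x the earlier time")
```

A ratio below 1 means the benchmark got faster between the two snapshots. Entries that exist in only one snapshot (such as queueEstimator, which first appears in Nov 2016) are left out of the comparison.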

Nov 16, 2016

atomicIntCmpX               6.25ns         Atomic<int>::compareExchange
atomicIntInc                5.90ns         Atomic<int>::inc
atomicIntInc                5.94ns         std::atomic<int>::fetch_add
atomicIntLoad               0.59ns         Atomic<int>::load
atomicIntStore              0.57ns         Atomic<int>::store
atomicIntXchg               5.90ns         Atomic<int>::exchange
bMutexNoBlock              14.57ns         std::mutex lock/unlock (no blocking)
bufferAppendCopy1           7.99ns         appendCopy 1 byte to a buffer
bufferAppendCopy50          8.93ns         appendCopy 50 bytes to a buffer
bufferAppendCopy100         8.47ns         appendCopy 100 bytes to a buffer
bufferAppendCopy250        16.74ns         appendCopy 250 bytes to a buffer
bufferAppendCopy500        20.90ns         appendCopy 500 bytes to a buffer
bufferAppendExternal1       8.13ns         appendExternal 1 byte to a buffer
bufferAppendExternal50      8.15ns         appendExternal 50 bytes to a buffer
bufferAppendExternal100     7.84ns         appendExternal 100 bytes to a buffer
bufferAppendExternal250     7.76ns         appendExternal 250 bytes to a buffer
bufferAppendExternal500     8.25ns         appendExternal 500 bytes to a buffer
bufferBasic                10.88ns         buffer create, add one chunk, delete
bufferBasicAlloc           12.29ns         buffer create, alloc block in chunk, delete
bufferBasicCopy            12.02ns         buffer create, copy small block, delete
bufferCopy                 14.25ns         copy out 2 small chunks from buffer
bufferExtendChunk           4.69ns         buffer add onto existing chunk
bufferGetStart              3.41ns         Buffer::getStart
bufferConstruct             5.90ns         buffer stack allocation
bufferReset                 4.79ns         Buffer::reset
bufferCopyIterator2         6.46ns         buffer iterate over 2 copied chunks, accessing 1 byte each
bufferCopyIterator5         8.71ns         buffer iterate over 5 copied chunks, accessing 1 byte each
bufferExternalIterator2     8.96ns         buffer iterate over 2 external chunks, accessing 1 byte each
bufferExternalIterator5    23.46ns         buffer iterate over 5 external chunks, accessing 1 byte each
condPingPong                3.09us         std::condition_variable round-trip
cppAtomicExchg              6.64ns         Exchange method on a C++ atomic_int
cppAtomicLoad               0.60ns         Read a C++ atomic_int
cyclesToSeconds             6.30ns         Convert a rdtsc result to (double) seconds
cyclesToNanos               6.32ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll                9.36ns         Dispatch::poll (no timers or pollers)
div32                       4.79ns         32-bit integer division instruction
div64                      20.22ns         64-bit integer division instruction
functionCall                1.74ns         Call a function that has not been inlined
generateRandomNumber       17.21ns         Call to randomNumberGenerator(x)
genRandomString             1.49us         Generate a random 100-byte value
getThreadId                 1.73ns         Retrieve thread id via ThreadId::get
getThreadIdSyscall         41.04ns         Retrieve kernel thread id using syscall
hashTableLookup           123.41ns         Key lookup in a 1GB HashTable
hashTableLookupPf          70.57ns         Key lookup in a 1GB HashTable with prefetching
lfence                      3.00ns         Lfence instruction
lockInDispThrd              3.52ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           324.23ns         Acquire/release Dispatch::Lock (non-dispatch thread)
mapCreate                  21.15ns         Create+delete entry in std::map
mapLookup                  14.87ns         Lookup in std::map<uint64_t, uint64_t>
memcpyCached100            14.52ns         memcpy 100 bytes with hot/fixed dst and src
memcpyCached1000           21.79ns         memcpy 1000 bytes with hot/fixed dst and src
memcpyCached10000         185.40ns         memcpy 10000 bytes with hot/fixed dst and src
memcpyCachedDst100         93.36ns         memcpy 100 bytes with hot/fixed dst and cold src
memcpyCachedDst1000       122.05ns         memcpy 1000 bytes with hot/fixed dst and cold src
memcpyCachedDst10000      356.61ns         memcpy 10000 bytes with hot/fixed dst and cold src
memcpyCold100             377.18ns         memcpy 100 bytes with cold dst and src
memcpyCold1000            659.35ns         memcpy 1000 bytes with cold dst and src
memcpyCold10000             3.18us         memcpy 10000 bytes with cold dst and src
murmur3                    10.96ns         128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3                    50.44ns         128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc            29.52ns         Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc           4.57ns         Cost of ObjectPool allocation after destroying an object
pingConditionVar            1.61us         Round-trip ping with std::condition_variable
prefetch                   35.85ns         Prefetch instruction
queueEstimator              2.23ns         Recompute # bytes outstanding in queue
rdtsc                       6.74ns         Read the fine-grain cycle counter
segmentEntrySort            8.38ms         Sort a Segment full of avg. 100-byte Objects by age
segmentIterator             2.09ms         Iterate a Segment full of avg. 100-byte Objects
sessionRefCount            12.59ns         Create/delete SessionRef
serialize                  55.41ns         cpuid instruction for serialize
sfence                      1.41ns         Sfence instruction
spinLock                    9.60ns         Acquire/release SpinLock
startStopTimer             33.43ns         Start and stop a Dispatch::Timer
spawnThread                 8.89us         Start and stop a thread
throwInt                    2.07us         Throw an int
throwIntNL                  2.55us         Throw an int in a function call
throwException              2.03us         Throw an Exception
throwExceptionNL            2.81us         Throw an Exception in a function call
throwSwitch                 6.04us         Throw an Exception using ClientException::throwException
timeTrace                   7.56ns         Record an event using TimeTrace
unorderedMapCreate         79.05ns         Create+delete entry in unordered_map
unorderedMapLookup         14.62ns         Lookup in std::unordered_map<uint64_t, uint64_t>
vectorPushPop               3.90ns         Push and pop a std::vector

June 21, 2016

atomicIntCmpX               6.23ns         Atomic<int>::compareExchange
atomicIntInc                5.85ns         Atomic<int>::inc
atomicIntInc                5.84ns         std::atomic<int>::fetch_add
atomicIntLoad               0.56ns         Atomic<int>::load
atomicIntStore              0.56ns         Atomic<int>::store
atomicIntXchg               5.84ns         Atomic<int>::exchange
bMutexNoBlock              14.46ns         std::mutex lock/unlock (no blocking)
bufferAppendCopy1           8.09ns         appendCopy 1 byte to a buffer
bufferAppendCopy50          8.96ns         appendCopy 50 bytes to a buffer
bufferAppendCopy100         9.09ns         appendCopy 100 bytes to a buffer
bufferAppendCopy250        18.08ns         appendCopy 250 bytes to a buffer
bufferAppendCopy500        22.56ns         appendCopy 500 bytes to a buffer
bufferAppendExternal1       8.24ns         appendExternal 1 byte to a buffer
bufferAppendExternal50      7.87ns         appendExternal 50 bytes to a buffer
bufferAppendExternal100     7.95ns         appendExternal 100 bytes to a buffer
bufferAppendExternal250     7.79ns         appendExternal 250 bytes to a buffer
bufferAppendExternal500     7.72ns         appendExternal 500 bytes to a buffer
bufferBasic                11.13ns         buffer create, add one chunk, delete
bufferBasicAlloc           11.99ns         buffer create, alloc block in chunk, delete
bufferBasicCopy            12.00ns         buffer create, copy small block, delete
bufferCopy                 14.20ns         copy out 2 small chunks from buffer
bufferExtendChunk           4.56ns         buffer add onto existing chunk
bufferGetStart              3.34ns         Buffer::getStart
bufferConstruct             5.84ns         buffer stack allocation
bufferReset                 4.72ns         Buffer::reset
bufferCopyIterator2         6.67ns         buffer iterate over 2 copied chunks, accessing 1 byte each
bufferCopyIterator5         9.18ns         buffer iterate over 5 copied chunks, accessing 1 byte each
bufferExternalIterator2     9.06ns         buffer iterate over 2 external chunks, accessing 1 byte each
bufferExternalIterator5    23.24ns         buffer iterate over 5 external chunks, accessing 1 byte each
condPingPong                2.98us         std::condition_variable round-trip
cppAtomicExchg              5.83ns         Exchange method on a C++ atomic_int
cppAtomicLoad               0.65ns         Read a C++ atomic_int
cyclesToSeconds             6.13ns         Convert a rdtsc result to (double) seconds
cyclesToNanos               6.35ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll                9.21ns         Dispatch::poll (no timers or pollers)
div32                       4.72ns         32-bit integer division instruction
div64                      20.15ns         64-bit integer division instruction
functionCall                1.67ns         Call a function that has not been inlined
generateRandomNumber       16.56ns         Call to randomNumberGenerator(x)
genRandomString             1.49us         Generate a random 100-byte value
getThreadId                 1.95ns         Retrieve thread id via ThreadId::get
getThreadIdSyscall         41.15ns         Retrieve kernel thread id using syscall
hashTableLookup           124.09ns         Key lookup in a 1GB HashTable
hashTableLookupPf          73.98ns         Key lookup in a 1GB HashTable with prefetching
lfence                      2.92ns         Lfence instruction
lockInDispThrd              2.79ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           279.30ns         Acquire/release Dispatch::Lock (non-dispatch thread)
mapCreate                  22.63ns         Create+delete entry in std::map
mapLookup                  16.02ns         Lookup in std::map<uint64_t, uint64_t>
memcpyCached100            14.55ns         memcpy 100 bytes with hot/fixed dst and src
memcpyCached1000           21.82ns         memcpy 1000 bytes with hot/fixed dst and src
memcpyCached10000         185.16ns         memcpy 10000 bytes with hot/fixed dst and src
memcpyCachedDst100         93.82ns         memcpy 100 bytes with hot/fixed dst and cold src
memcpyCachedDst1000       122.30ns         memcpy 1000 bytes with hot/fixed dst and cold src
memcpyCachedDst10000      355.23ns         memcpy 10000 bytes with hot/fixed dst and cold src
memcpyCold100             372.20ns         memcpy 100 bytes with cold dst and src
memcpyCold1000            655.00ns         memcpy 1000 bytes with cold dst and src
memcpyCold10000             3.18us         memcpy 10000 bytes with cold dst and src
murmur3                    10.28ns         128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3                    49.72ns         128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc            29.16ns         Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc           4.48ns         Cost of ObjectPool allocation after destroying an object
pingConditionVar            1.56us         Round-trip ping with std::condition_variable
prefetch                   35.20ns         Prefetch instruction
rdtsc                       6.67ns         Read the fine-grain cycle counter
segmentEntrySort            8.14ms         Sort a Segment full of avg. 100-byte Objects by age
segmentIterator             1.97ms         Iterate a Segment full of avg. 100-byte Objects
sessionRefCount            12.55ns         Create/delete SessionRef
serialize                  55.34ns         cpuid instruction for serialize
sfence                      1.39ns         Sfence instruction
spinLock                   18.90ns         Acquire/release SpinLock
startStopTimer             56.46ns         Start and stop a Dispatch::Timer
spawnThread                 8.90us         Start and stop a thread
throwInt                    2.02us         Throw an int
throwIntNL                  2.48us         Throw an int in a function call
throwException              1.97us         Throw an Exception
throwExceptionNL            2.78us         Throw an Exception in a function call
throwSwitch                 6.05us         Throw an Exception using ClientException::throwException
timeTrace                  15.93ns         Record an event using TimeTrace
unorderedMapCreate         72.63ns         Create+delete entry in unordered_map
unorderedMapLookup         13.07ns         Lookup in std::unordered_map<uint64_t, uint64_t>
vectorPushPop               3.80ns         Push and pop a std::vector

June 9, 2014

atomicIntCmpX               6.21ns         Atomic<int>::compareExchange
atomicIntInc                5.98ns         Atomic<int>::inc
atomicIntLoad               0.56ns         Atomic<int>::load
atomicIntStore              0.56ns         Atomic<int>::store
atomicIntXchg               5.97ns         Atomic<int>::exchange
bMutexNoBlock              14.77ns         std::mutex lock/unlock (no blocking)
bufferBasic                11.42ns         buffer create, add one chunk, delete
bufferBasicAlloc           14.62ns         buffer create, alloc block in chunk, delete
bufferBasicCopy            13.24ns         buffer create, copy small block, delete
bufferCopy                 14.76ns         copy out 2 small chunks from buffer
bufferExtendChunk           5.69ns         buffer add onto existing chunk
bufferGetStart              3.34ns         Buffer::getStart
bufferIterator             22.57ns         iterate over buffer with 5 chunks
condPingPong                4.04us         std::condition_variable round-trip
cppAtomicExchg              5.87ns         Exchange method on a C++ atomic_int
cppAtomicLoad              13.50ns         Read a C++ atomic_int
cyclesToSeconds             6.97ns         Convert a rdtsc result to (double) seconds
cyclesToNanos               9.75ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll                8.63ns         Dispatch::poll (no timers or pollers)
div32                       4.86ns         32-bit integer division instruction
div64                      20.30ns         64-bit integer division instruction
functionCall                1.95ns         Call a function that has not been inlined
getThreadId                 2.24ns         Retrieve thread id via ThreadId::get
hashTableLookup           128.86ns         Key lookup in a 1GB HashTable
hashTableLookupPf          78.07ns         Key lookup in a 1GB HashTable with prefetching
lfence                      2.82ns         Lfence instruction
lockInDispThrd              4.06ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           214.54ns         Acquire/release Dispatch::Lock (non-dispatch thread)
memcpy100                   8.78ns         Copy 100 bytes with memcpy
memcpy1000                 37.37ns         Copy 1000 bytes with memcpy
memcpy10000               191.07ns         Copy 10000 bytes with memcpy
murmur3                    10.31ns         128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3                    40.47ns         128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc            27.86ns         Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc           4.65ns         Cost of ObjectPool allocation after destroying an object
prefetch                   34.72ns         Prefetch instruction
rdtsc                       6.70ns         Read the fine-grain cycle counter
segmentEntrySort            7.02ms         Sort a Segment full of avg. 100-byte Objects by age
segmentIterator             1.83ms         Iterate a Segment full of avg. 100-byte Objects
sfence                      1.39ns         Sfence instruction
spinLock                   16.17ns         Acquire/release SpinLock
startStopTimer             14.76ns         Start and stop a Dispatch::Timer
spawnThread                 9.89us         Start and stop a thread
throwInt                    1.91us         Throw an int
throwIntNL                  2.60us         Throw an int in a function call
throwException              1.84us         Throw an Exception
throwExceptionNL            2.58us         Throw an Exception in a function call
throwSwitch                 5.46us         Throw an Exception using ClientException::throwException
vectorPushPop               3.72ns         Push and pop a std::vector

June 18, 2012

atomicIntCmpX               6.29ns         Atomic<int>::compareExchange
atomicIntInc                5.93ns         Atomic<int>::inc
atomicIntLoad               0.56ns         Atomic<int>::load
atomicIntStore              0.59ns         Atomic<int>::store
atomicIntXchg               5.93ns         Atomic<int>::exchange
bMutexNoBlock              14.92ns         std::mutex lock/unlock (no blocking)
cppAtomicExchg              5.93ns         Exchange method on a C++ atomic_int
cppAtomicLoad              13.76ns         Read a C++ atomic_int
cyclesToSeconds             7.04ns         Convert a rdtsc result to (double) seconds
cyclesToNanos               9.86ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll                8.17ns         Dispatch::poll (no timers or pollers)
functionCall                1.98ns         Call a function that has not been inlined
getThreadId                 1.98ns         Retrieve thread id via ThreadId::get
hashTableLookup           168.76ns         Key lookup in a 1GB HashTable
hashTableLookupPf         143.19ns         Key lookup in a 1GB HashTable with prefetching
lfence                      2.90ns         Lfence instruction
lockInDispThrd              4.15ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd             4.05ns         Acquire/release Dispatch::Lock (non-dispatch thread)
murmur3                    11.29ns         128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3                    41.17ns         128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc           192.56ns         Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc           4.42ns         Cost of ObjectPool allocation after destroying an object
prefetch                   29.79ns         Prefetch instruction
rdtsc                       6.80ns         Read the fine-grain cycle counter
segmentEntrySort            5.18ms         Sort a Segment full of avg. 100-byte Objects by age
segmentIterator             1.08ms         Iterate a Segment full of avg. 100-byte Objects
sfence                      1.47ns         Sfence instruction
spinLock                   16.34ns         Acquire/release SpinLock
startStopTimer             15.03ns         Start and stop a Dispatch::Timer
throwInt                    1.97us         Throw an int
throwIntNL                  2.44us         Throw an int in a function call
throwException              1.88us         Throw an Exception
throwExceptionNL            2.60us         Throw an Exception in a function call
throwSwitch                 5.09us         Throw an Exception using ClientException::throwException
vectorPushPop               3.86ns         Push and pop a std::vector

July 18, 2011

atomicIntCmpX               6.26ns         AtomicInt::compareExchange
atomicIntInc                6.05ns         AtomicInt::inc
atomicIntLoad               0.57ns         AtomicInt::load
atomicIntStore              0.58ns         AtomicInt::store
atomicIntXchg               6.06ns         AtomicInt::exchange
bMutexNoBlock              14.66ns         Boost mutex lock/unlock (no blocking)
cppAtomicExchg              5.93ns         Exchange method on a C++ atomic_int
cppAtomicLoad              13.66ns         Read a C++ atomic_int
dispatchPoll                7.62ns         Dispatch::poll (no timers or pollers)
getThreadId                 2.27ns         Retrieve thread id via ThreadId::get
lfence                      2.84ns         Lfence instruction
lockInDispThrd              7.05ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           252.79ns         Acquire/release Dispatch::Lock (non-dispatch thread)
sfence                      1.50ns         Sfence instruction
spinLock                   16.46ns         Acquire/release SpinLock

Misc

Some 2.3GHz Xeon, 2011-08-23

atomicIntCmpX              10.63ns         AtomicInt::compareExchange
atomicIntInc                8.71ns         AtomicInt::inc
atomicIntLoad               0.44ns         AtomicInt::load
atomicIntStore              0.43ns         AtomicInt::store
atomicIntXchg               7.83ns         AtomicInt::exchange
bMutexNoBlock              42.26ns         Boost mutex lock/unlock (no blocking)
cppAtomicExchg              7.97ns         Exchange method on a C++ atomic_int
cppAtomicLoad               7.69ns         Read a C++ atomic_int
cyclesToSeconds            10.42ns         Convert a rdtsc result to (double) seconds
cyclesToNanos              14.49ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll               20.41ns         Dispatch::poll (no timers or pollers)
functionCall                3.05ns         Call a function that has not been inlined
getThreadId                 3.04ns         Retrieve thread id via ThreadId::get
lfence                      4.79ns         Lfence instruction
lockInDispThrd              6.10ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           283.43ns         Acquire/release Dispatch::Lock (non-dispatch thread)
sfence                      3.60ns         Sfence instruction
spinLock                   25.17ns         Acquire/release SpinLock
throwInt                    3.32us         Throw an int
throwIntNL                  4.54us         Throw an int in a function call
throwException              3.43us         Throw an Exception
throwExceptionNL            4.69us         Throw an Exception in a function call
throwSwitch                 6.45us         Throw an Exception using ClientException::throwException

 
