...

Note: for consistency, please take all measurements on rc20 and be sure to compile with DEBUG=no.

...

Nov 16, 2016

atomicIntCmpX               6.25ns         Atomic<int>::compareExchange
atomicIntInc                5.90ns         Atomic<int>::inc
atomicIntInc                5.94ns         std::atomic<int>::fetch_add
atomicIntLoad               0.59ns         Atomic<int>::load
atomicIntStore              0.57ns         Atomic<int>::store
atomicIntXchg               5.90ns         Atomic<int>::exchange
bMutexNoBlock              14.57ns         std::mutex lock/unlock (no blocking)
bufferAppendCopy1           7.99ns         appendCopy 1 byte to a buffer
bufferAppendCopy50          8.93ns         appendCopy 50 bytes to a buffer
bufferAppendCopy100         8.47ns         appendCopy 100 bytes to a buffer
bufferAppendCopy250        16.74ns         appendCopy 250 bytes to a buffer
bufferAppendCopy500        20.90ns         appendCopy 500 bytes to a buffer
bufferAppendExternal1       8.13ns         appendExternal 1 byte to a buffer
bufferAppendExternal50      8.15ns         appendExternal 50 bytes to a buffer
bufferAppendExternal100     7.84ns         appendExternal 100 bytes to a buffer
bufferAppendExternal250     7.76ns         appendExternal 250 bytes to a buffer
bufferAppendExternal500     8.25ns         appendExternal 500 bytes to a buffer
bufferBasic                10.88ns         buffer create, add one chunk, delete
bufferBasicAlloc           12.29ns         buffer create, alloc block in chunk, delete
bufferBasicCopy            12.02ns         buffer create, copy small block, delete
bufferCopy                 14.25ns         copy out 2 small chunks from buffer
bufferExtendChunk           4.69ns         buffer add onto existing chunk
bufferGetStart              3.41ns         Buffer::getStart
bufferConstruct             5.90ns         buffer stack allocation
bufferReset                 4.79ns         Buffer::reset
bufferCopyIterator2         6.46ns         buffer iterate over 2 copied chunks, accessing 1 byte each
bufferCopyIterator5         8.71ns         buffer iterate over 5 copied chunks, accessing 1 byte each
bufferExternalIterator2     8.96ns         buffer iterate over 2 external chunks, accessing 1 byte each
bufferExternalIterator5    23.46ns         buffer iterate over 5 external chunks, accessing 1 byte each
condPingPong                3.09us         std::condition_variable round-trip
cppAtomicExchg              6.64ns         Exchange method on a C++ atomic_int
cppAtomicLoad               0.60ns         Read a C++ atomic_int
cyclesToSeconds             6.30ns         Convert a rdtsc result to (double) seconds
cyclesToNanos               6.32ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll                9.36ns         Dispatch::poll (no timers or pollers)
div32                       4.79ns         32-bit integer division instruction
div64                      20.22ns         64-bit integer division instruction
functionCall                1.74ns         Call a function that has not been inlined
generateRandomNumber       17.21ns         Call to randomNumberGenerator(x)
genRandomString             1.49us         Generate a random 100-byte value
getThreadId                 1.73ns         Retrieve thread id via ThreadId::get
getThreadIdSyscall         41.04ns         Retrieve kernel thread id using syscall
hashTableLookup           123.41ns         Key lookup in a 1GB HashTable
hashTableLookupPf          70.57ns         Key lookup in a 1GB HashTable with prefetching
lfence                      3.00ns         Lfence instruction
lockInDispThrd              3.52ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           324.23ns         Acquire/release Dispatch::Lock (non-dispatch thread)
mapCreate                  21.15ns         Create+delete entry in std::map
mapLookup                  14.87ns         Lookup in std::map<uint64_t, uint64_t>
memcpyCached100            14.52ns         memcpy 100 bytes with hot/fixed dst and src
memcpyCached1000           21.79ns         memcpy 1000 bytes with hot/fixed dst and src
memcpyCached10000         185.40ns         memcpy 10000 bytes with hot/fixed dst and src
memcpyCachedDst100         93.36ns         memcpy 100 bytes with hot/fixed dst and cold src
memcpyCachedDst1000       122.05ns         memcpy 1000 bytes with hot/fixed dst and cold src
memcpyCachedDst10000      356.61ns         memcpy 10000 bytes with hot/fixed dst and cold src
memcpyCold100             377.18ns         memcpy 100 bytes with cold dst and src
memcpyCold1000            659.35ns         memcpy 1000 bytes with cold dst and src
memcpyCold10000             3.18us         memcpy 10000 bytes with cold dst and src
murmur3                    10.96ns         128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3                    50.44ns         128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc            29.52ns         Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc           4.57ns         Cost of ObjectPool allocation after destroying an object
pingConditionVar            1.61us         Round-trip ping with std::condition_variable
prefetch                   35.85ns         Prefetch instruction
queueEstimator              2.23ns         Recompute # bytes outstanding in queue
rdtsc                       6.74ns         Read the fine-grain cycle counter
segmentEntrySort            8.38ms         Sort a Segment full of avg. 100-byte Objects by age
segmentIterator             2.09ms         Iterate a Segment full of avg. 100-byte Objects
sessionRefCount            12.59ns         Create/delete SessionRef
serialize                  55.41ns         cpuid instruction for serialize
sfence                      1.41ns         Sfence instruction
spinLock                    9.60ns         Acquire/release SpinLock
startStopTimer             33.43ns         Start and stop a Dispatch::Timer
spawnThread                 8.89us         Start and stop a thread
throwInt                    2.07us         Throw an int
throwIntNL                  2.55us         Throw an int in a function call
throwException              2.03us         Throw an Exception
throwExceptionNL            2.81us         Throw an Exception in a function call
throwSwitch                 6.04us         Throw an Exception using ClientException::throwException
timeTrace                   7.56ns         Record an event using TimeTrace
unorderedMapCreate         79.05ns         Create+delete entry in unordered_map
unorderedMapLookup         14.62ns         Lookup in std::unordered_map<uint64_t, uint64_t>
vectorPushPop               3.90ns         Push and pop a std::vector
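
Each row above reports an average cost per operation. As a rough illustration of how a figure such as the std::atomic<int>::fetch_add row could be obtained, here is a minimal sketch. It is not the benchmark program that produced these tables: it times with std::chrono::steady_clock rather than the cycle counter, and the names averageNs, formatTime, and measureFetchAdd are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: calls op() count times and returns the average
// wall-clock cost of one call, in nanoseconds.
template <typename Op>
double averageNs(uint64_t count, Op op)
{
    assert(count > 0);
    auto start = std::chrono::steady_clock::now();
    for (uint64_t i = 0; i < count; i++) {
        op();
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(count);
}

// Hypothetical helper: renders a time given in nanoseconds with the unit
// suffixes used in the tables (ns, us, ms) and two decimal places.
std::string formatTime(double ns)
{
    char buffer[32];
    if (ns < 1e3) {
        std::snprintf(buffer, sizeof(buffer), "%.2fns", ns);
    } else if (ns < 1e6) {
        std::snprintf(buffer, sizeof(buffer), "%.2fus", ns / 1e3);
    } else {
        std::snprintf(buffer, sizeof(buffer), "%.2fms", ns / 1e6);
    }
    return std::string(buffer);
}

// Average cost of one std::atomic<int>::fetch_add over count calls.
double measureFetchAdd(uint64_t count)
{
    std::atomic<int> value(0);
    return averageNs(count, [&value] { value.fetch_add(1); });
}
```

Calling formatTime(measureFetchAdd(10000000)) returns a string in the same style as the table entries. The number depends on the machine, the compiler, and the build flags, so it should not be expected to match the values recorded here.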

June 21, 2016

atomicIntCmpX               6.23ns         Atomic<int>::compareExchange
atomicIntInc                5.85ns         Atomic<int>::inc
atomicIntInc                5.84ns         std::atomic<int>::fetch_add
atomicIntLoad               0.56ns         Atomic<int>::load
atomicIntStore              0.56ns         Atomic<int>::store
atomicIntXchg               5.84ns         Atomic<int>::exchange
bMutexNoBlock              14.46ns         std::mutex lock/unlock (no blocking)
bufferAppendCopy1           8.09ns         appendCopy 1 byte to a buffer
bufferAppendCopy50          8.96ns         appendCopy 50 bytes to a buffer
bufferAppendCopy100         9.09ns         appendCopy 100 bytes to a buffer
bufferAppendCopy250        18.08ns         appendCopy 250 bytes to a buffer
bufferAppendCopy500        22.56ns         appendCopy 500 bytes to a buffer
bufferAppendExternal1       8.24ns         appendExternal 1 byte to a buffer
bufferAppendExternal50      7.87ns         appendExternal 50 bytes to a buffer
bufferAppendExternal100     7.95ns         appendExternal 100 bytes to a buffer
bufferAppendExternal250     7.79ns         appendExternal 250 bytes to a buffer
bufferAppendExternal500     7.72ns         appendExternal 500 bytes to a buffer
bufferBasic                11.13ns         buffer create, add one chunk, delete
bufferBasicAlloc           11.99ns         buffer create, alloc block in chunk, delete
bufferBasicCopy            12.00ns         buffer create, copy small block, delete
bufferCopy                 14.20ns         copy out 2 small chunks from buffer
bufferExtendChunk           4.56ns         buffer add onto existing chunk
bufferGetStart              3.34ns         Buffer::getStart
bufferConstruct             5.84ns         buffer stack allocation
bufferReset                 4.72ns         Buffer::reset
bufferCopyIterator2         6.67ns         buffer iterate over 2 copied chunks, accessing 1 byte each
bufferCopyIterator5         9.18ns         buffer iterate over 5 copied chunks, accessing 1 byte each
bufferExternalIterator2     9.06ns         buffer iterate over 2 external chunks, accessing 1 byte each
bufferExternalIterator5    23.24ns         buffer iterate over 5 external chunks, accessing 1 byte each
condPingPong                2.98us         std::condition_variable round-trip
cppAtomicExchg              5.83ns         Exchange method on a C++ atomic_int
cppAtomicLoad               0.65ns         Read a C++ atomic_int
cyclesToSeconds             6.13ns         Convert a rdtsc result to (double) seconds
cyclesToNanos               6.35ns         Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll                9.21ns         Dispatch::poll (no timers or pollers)
div32                       4.72ns         32-bit integer division instruction
div64                      20.15ns         64-bit integer division instruction
functionCall                1.67ns         Call a function that has not been inlined
generateRandomNumber       16.56ns         Call to randomNumberGenerator(x)
genRandomString             1.49us         Generate a random 100-byte value
getThreadId                 1.95ns         Retrieve thread id via ThreadId::get
getThreadIdSyscall         41.15ns         Retrieve kernel thread id using syscall
hashTableLookup           124.09ns         Key lookup in a 1GB HashTable
hashTableLookupPf          73.98ns         Key lookup in a 1GB HashTable with prefetching
lfence                      2.92ns         Lfence instruction
lockInDispThrd              2.79ns         Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd           279.30ns         Acquire/release Dispatch::Lock (non-dispatch thread)
mapCreate                  22.63ns         Create+delete entry in std::map
mapLookup                  16.02ns         Lookup in std::map<uint64_t, uint64_t>
memcpyCached100            14.55ns         memcpy 100 bytes with hot/fixed dst and src
memcpyCached1000           21.82ns         memcpy 1000 bytes with hot/fixed dst and src
memcpyCached10000         185.16ns         memcpy 10000 bytes with hot/fixed dst and src
memcpyCachedDst100         93.82ns         memcpy 100 bytes with hot/fixed dst and cold src
memcpyCachedDst1000       122.30ns         memcpy 1000 bytes with hot/fixed dst and cold src
memcpyCachedDst10000      355.23ns         memcpy 10000 bytes with hot/fixed dst and cold src
memcpyCold100             372.20ns         memcpy 100 bytes with cold dst and src
memcpyCold1000            655.00ns         memcpy 1000 bytes with cold dst and src
memcpyCold10000             3.18us         memcpy 10000 bytes with cold dst and src
murmur3                    10.28ns         128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3                    49.72ns         128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc            29.16ns         Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc           4.48ns         Cost of ObjectPool allocation after destroying an object
pingConditionVar            1.56us         Round-trip ping with std::condition_variable
prefetch                   35.20ns         Prefetch instruction
rdtsc                       6.67ns         Read the fine-grain cycle counter
segmentEntrySort            8.14ms         Sort a Segment full of avg. 100-byte Objects by age
segmentIterator             1.97ms         Iterate a Segment full of avg. 100-byte Objects
sessionRefCount            12.55ns         Create/delete SessionRef
serialize                  55.34ns         cpuid instruction for serialize
sfence                      1.39ns         Sfence instruction
spinLock                   18.90ns         Acquire/release SpinLock
startStopTimer             56.46ns         Start and stop a Dispatch::Timer
spawnThread                 8.90us         Start and stop a thread
throwInt                    2.02us         Throw an int
throwIntNL                  2.48us         Throw an int in a function call
throwException              1.97us         Throw an Exception
throwExceptionNL            2.78us         Throw an Exception in a function call
throwSwitch                 6.05us         Throw an Exception using ClientException::throwException
timeTrace                  15.93ns         Record an event using TimeTrace
unorderedMapCreate         72.63ns         Create+delete entry in unordered_map
unorderedMapLookup         13.07ns         Lookup in std::unordered_map<uint64_t, uint64_t>
vectorPushPop               3.80ns         Push and pop a std::vector

June 9, 2014

atomicIntCmpX          6.21ns    Atomic<int>::compareExchange
atomicIntInc           5.98ns    Atomic<int>::inc
atomicIntLoad          0.56ns    Atomic<int>::load
atomicIntStore         0.56ns    Atomic<int>::store
atomicIntXchg          5.97ns    Atomic<int>::exchange
bMutexNoBlock         14.77ns    std::mutex lock/unlock (no blocking)
bufferBasic           11.42ns    buffer create, add one chunk, delete
bufferBasicAlloc      14.62ns    buffer create, alloc block in chunk, delete
bufferBasicCopy       13.24ns    buffer create, copy small block, delete
bufferCopy            14.76ns    copy out 2 small chunks from buffer
bufferExtendChunk      5.69ns    buffer add onto existing chunk
bufferGetStart         3.34ns    Buffer::getStart
bufferIterator        22.57ns    iterate over buffer with 5 chunks
condPingPong           4.04us    std::condition_variable round-trip
cppAtomicExchg         5.87ns    Exchange method on a C++ atomic_int
cppAtomicLoad         13.50ns    Read a C++ atomic_int
cyclesToSeconds        6.97ns    Convert a rdtsc result to (double) seconds
cyclesToNanos          9.75ns    Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll           8.63ns    Dispatch::poll (no timers or pollers)
div32                  4.86ns    32-bit integer division instruction
div64                 20.30ns    64-bit integer division instruction
functionCall           1.95ns    Call a function that has not been inlined
getThreadId            2.24ns    Retrieve thread id via ThreadId::get
hashTableLookup      128.86ns    Key lookup in a 1GB HashTable
hashTableLookupPf     78.07ns    Key lookup in a 1GB HashTable with prefetching
lfence                 2.82ns    Lfence instruction
lockInDispThrd         4.06ns    Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd      214.54ns    Acquire/release Dispatch::Lock (non-dispatch thread)
memcpy100              8.78ns    Copy 100 bytes with memcpy
memcpy1000            37.37ns    Copy 1000 bytes with memcpy
memcpy10000          191.07ns    Copy 10000 bytes with memcpy
murmur3               10.31ns    128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3               40.47ns    128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc       27.86ns    Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc      4.65ns    Cost of ObjectPool allocation after destroying an object
prefetch              34.72ns    Prefetch instruction
rdtsc                  6.70ns    Read the fine-grain cycle counter
segmentEntrySort       7.02ms    Sort a Segment full of avg. 100-byte Objects by age
segmentIterator        1.83ms    Iterate a Segment full of avg. 100-byte Objects
sfence                 1.39ns    Sfence instruction
spinLock              16.17ns    Acquire/release SpinLock
startStopTimer        14.76ns    Start and stop a Dispatch::Timer
spawnThread            9.89us    Start and stop a thread
throwInt               1.91us    Throw an int
throwIntNL             2.60us    Throw an int in a function call
throwException         1.84us    Throw an Exception
throwExceptionNL       2.58us    Throw an Exception in a function call
throwSwitch            5.46us    Throw an Exception using ClientException::throwException
vectorPushPop          3.72ns    Push and pop a std::vector

June 18, 2012

atomicIntCmpX          6.29ns    Atomic<int>::compareExchange
atomicIntInc           5.93ns    Atomic<int>::inc
atomicIntLoad          0.56ns    Atomic<int>::load
atomicIntStore         0.59ns    Atomic<int>::store
atomicIntXchg          5.93ns    Atomic<int>::exchange
bMutexNoBlock         14.92ns    std::mutex lock/unlock (no blocking)
cppAtomicExchg         5.93ns    Exchange method on a C++ atomic_int
cppAtomicLoad         13.76ns    Read a C++ atomic_int
cyclesToSeconds        7.04ns    Convert a rdtsc result to (double) seconds
cyclesToNanos          9.86ns    Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll           8.17ns    Dispatch::poll (no timers or pollers)
functionCall           1.98ns    Call a function that has not been inlined
getThreadId            1.98ns    Retrieve thread id via ThreadId::get
hashTableLookup      168.76ns    Key lookup in a 1GB HashTable
hashTableLookupPf    143.19ns    Key lookup in a 1GB HashTable with prefetching
lfence                 2.90ns    Lfence instruction
lockInDispThrd         4.15ns    Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd        4.05ns    Acquire/release Dispatch::Lock (non-dispatch thread)
murmur3               11.29ns    128-bit MurmurHash3 (64-bit optimised) on 1 byte of data
murmur3               41.17ns    128-bit MurmurHash3 hash (64-bit optimised) on 256 bytes of data
objectPoolAlloc      192.56ns    Cost of new allocations from an ObjectPool (no destroys)
objectPoolRealloc      4.42ns    Cost of ObjectPool allocation after destroying an object
prefetch              29.79ns    Prefetch instruction
rdtsc                  6.80ns    Read the fine-grain cycle counter
segmentEntrySort       5.18ms    Sort a Segment full of avg. 100-byte Objects by age
segmentIterator        1.08ms    Iterate a Segment full of avg. 100-byte Objects
sfence                 1.47ns    Sfence instruction
spinLock              16.34ns    Acquire/release SpinLock
startStopTimer        15.03ns    Start and stop a Dispatch::Timer
throwInt               1.97us    Throw an int
throwIntNL             2.44us    Throw an int in a function call
throwException         1.88us    Throw an Exception
throwExceptionNL       2.60us    Throw an Exception in a function call
throwSwitch            5.09us    Throw an Exception using ClientException::throwException
vectorPushPop          3.86ns    Push and pop a std::vector

July 18, 2011

atomicIntCmpX    6.26ns    AtomicInt::compareExchange
atomicIntInc     6.05ns    AtomicInt::inc
atomicIntLoad    0.57ns    AtomicInt::load
atomicIntStore   0.58ns    AtomicInt::store
atomicIntXchg    6.06ns    AtomicInt::exchange
bMutexNoBlock    14.66ns   Boost mutex lock/unlock (no blocking)
cppAtomicExchg   5.93ns    Exchange method on a C++ atomic_int
cppAtomicLoad    13.66ns   Read a C++ atomic_int
dispatchPoll     7.62ns    Dispatch::poll (no timers or pollers)
getThreadId      2.27ns    Retrieve thread id via ThreadId::get
lfence           2.84ns    Lfence instruction
lockInDispThrd   7.05ns    Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd  252.79ns  Acquire/release Dispatch::Lock (non-dispatch thread)
sfence           1.50ns    Sfence instruction
spinLock         16.46ns   Acquire/release SpinLock

Misc

Some 2.3GHz Xeon, 2011-08-23

atomicIntCmpX    10.63ns   AtomicInt::compareExchange
atomicIntInc     8.71ns    AtomicInt::inc
atomicIntLoad    0.44ns    AtomicInt::load
atomicIntStore   0.43ns    AtomicInt::store
atomicIntXchg    7.83ns    AtomicInt::exchange
bMutexNoBlock    42.26ns   Boost mutex lock/unlock (no blocking)
cppAtomicExchg   7.97ns    Exchange method on a C++ atomic_int
cppAtomicLoad    7.69ns    Read a C++ atomic_int
cyclesToSeconds  10.42ns   Convert a rdtsc result to (double) seconds
cyclesToNanos    14.49ns   Convert a rdtsc result to (uint64_t) nanoseconds
dispatchPoll     20.41ns   Dispatch::poll (no timers or pollers)
functionCall     3.05ns    Call a function that has not been inlined
getThreadId      3.04ns    Retrieve thread id via ThreadId::get
lfence           4.79ns    Lfence instruction
lockInDispThrd   6.10ns    Acquire/release Dispatch::Lock (in dispatch thread)
lockNonDispThrd  283.43ns  Acquire/release Dispatch::Lock (non-dispatch thread)
sfence           3.60ns    Sfence instruction
spinLock         25.17ns   Acquire/release SpinLock
throwInt         3.32us    Throw an int
throwIntNL       4.54us    Throw an int in a function call
throwException   3.43us    Throw an Exception
throwExceptionNL 4.69us    Throw an Exception in a function call
throwSwitch      6.45us    Throw an Exception using ClientException::throwException


