clusterperf January 2016 infrc

clusterperf output as measured on January 6, 2016 (rc1-rc20, infrc transport)

Recent changes that may have affected performance:

  • Added support for linearizability, which affects the critical path of some RPCs
  • Numerous other small changes over the last year (e.g. new RPC level mechanism)

 

basic.read100          4.6 us     read random 100B object (30B key) median
basic.read100.min      4.2 us     read random 100B object (30B key) minimum
basic.read100.9        5.3 us     read random 100B object (30B key) 90%
basic.read100.99       5.9 us     read random 100B object (30B key) 99%
basic.read100.999      8.2 us     read random 100B object (30B key) 99.9%
basic.readBw100       19.8 MB/s   bandwidth reading 100B objects (30B key)
basic.read1K           6.3 us     read random 1KB object (30B key) median
basic.read1K.min       5.4 us     read random 1KB object (30B key) minimum
basic.read1K.9         7.0 us     read random 1KB object (30B key) 90%
basic.read1K.99        7.7 us     read random 1KB object (30B key) 99%
basic.read1K.999      10.0 us     read random 1KB object (30B key) 99.9%
basic.readBw1K       146.8 MB/s   bandwidth reading 1KB objects (30B key)
basic.read10K          9.4 us     read random 10KB object (30B key) median
basic.read10K.min      8.3 us     read random 10KB object (30B key) minimum
basic.read10K.9       10.4 us     read random 10KB object (30B key) 90%
basic.read10K.99      11.3 us     read random 10KB object (30B key) 99%
basic.read10K.999     13.1 us     read random 10KB object (30B key) 99.9%
basic.readBw10K      990.1 MB/s   bandwidth reading 10KB objects (30B key)
basic.read100K        42.1 us     read random 100KB object (30B key) median
basic.read100K.min    35.7 us     read random 100KB object (30B key) minimum
basic.read100K.9      43.3 us     read random 100KB object (30B key) 90%
basic.read100K.99     44.4 us     read random 100KB object (30B key) 99%
basic.read100K.999    47.3 us     read random 100KB object (30B key) 99.9%
basic.readBw100K       2.2 GB/s   bandwidth reading 100KB objects (30B key)
basic.read1M         357.4 us     read random 1MB object (30B key) median
basic.read1M.min     324.0 us     read random 1MB object (30B key) minimum
basic.read1M.9       363.1 us     read random 1MB object (30B key) 90%
basic.read1M.99      365.7 us     read random 1MB object (30B key) 99%
basic.read1M.999     419.5 us     read random 1MB object (30B key) 99.9%
basic.readBw1M         2.6 GB/s   bandwidth reading 1MB objects (30B key)
basic.write100        14.0 us     write random 100B object (30B key) median
basic.write100.min    12.8 us     write random 100B object (30B key) minimum
basic.write100.9      15.1 us     write random 100B object (30B key) 90%
basic.write100.99     17.7 us     write random 100B object (30B key) 99%
basic.write100.999   101.7 us     write random 100B object (30B key) 99.9%
basic.writeBw100       6.6 MB/s   bandwidth writing 100B objects (30B key)
basic.write1K         16.8 us     write random 1KB object (30B key) median
basic.write1K.min     15.4 us     write random 1KB object (30B key) minimum
basic.write1K.9       17.9 us     write random 1KB object (30B key) 90%
basic.write1K.99      21.0 us     write random 1KB object (30B key) 99%
basic.write1K.999    105.2 us     write random 1KB object (30B key) 99.9%
basic.writeBw1K       55.4 MB/s   bandwidth writing 1KB objects (30B key)
basic.write10K        34.1 us     write random 10KB object (30B key) median
basic.write10K.min    31.7 us     write random 10KB object (30B key) minimum
basic.write10K.9      35.9 us     write random 10KB object (30B key) 90%
basic.write10K.99     40.5 us     write random 10KB object (30B key) 99%
basic.write10K.999   130.9 us     write random 10KB object (30B key) 99.9%
basic.writeBw10K     216.2 MB/s   bandwidth writing 10KB objects (30B key)
basic.write100K      222.9 us     write random 100KB object (30B key) median
basic.write100K.min  204.7 us     write random 100KB object (30B key) minimum
basic.write100K.9    230.0 us     write random 100KB object (30B key) 90%
basic.write100K.99   323.1 us     write random 100KB object (30B key) 99%
basic.write100K.999   25.4 ms     write random 100KB object (30B key) 99.9%
basic.writeBw100K    220.4 MB/s   bandwidth writing 100KB objects (30B key)
basic.write1M          2.1 ms     write random 1MB object (30B key) median
basic.write1M.min      2.0 ms     write random 1MB object (30B key) minimum
basic.write1M.9        2.4 ms     write random 1MB object (30B key) 90%
basic.write1M.99      27.8 ms     write random 1MB object (30B key) 99%
basic.writeBw1M      215.9 MB/s   bandwidth writing 1MB objects (30B key)
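The `basic.*` lines above follow a regular layout (name, value, unit, description), so they can be parsed for comparison against other runs. The following is a minimal sketch under that assumption; `parse_metric`, `latency_us`, and the unit table are illustrative names and are not part of clusterperf.

```python
import re

# Multipliers that convert the latency units seen in this output to microseconds.
TO_US = {"us": 1.0, "ms": 1e3}

# Assumed layout: name, numeric value, unit, free-form description.
LINE_RE = re.compile(r"^(\S+)\s+([\d.]+)\s+(\S+)\s+(.*)$")


def parse_metric(line):
    """Split one 'basic.*' line into (name, value, unit, description)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    name, value, unit, description = match.groups()
    return name, float(value), unit, description


def latency_us(value, unit):
    """Normalize a latency value to microseconds; raises KeyError for other units."""
    return value * TO_US[unit]


if __name__ == "__main__":
    sample = "basic.write100K.999   25.4 ms     write random 100KB object (30B key) 99.9%"
    name, value, unit, _ = parse_metric(sample)
    print(name, latency_us(value, unit), "us")
```

Normalizing units matters here because the write tail latencies switch from us to ms at the larger object sizes.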
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiRead_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.2               5.17
         2              1              2            6.5               3.25
         3              1              3            7.0               2.33
         4              1              4            8.2               2.06
         5              1              5            8.9               1.77
         6              1              6            9.8               1.64
         7              1              7           10.5               1.50
         8              1              8           10.9               1.36
         9              1              9           11.4               1.26
        10              1             10           11.9               1.19
        20              1             20           17.9               0.90
        30              1             30           19.3               0.64
        40              1             40           22.7               0.57
        50              1             50           23.2               0.46
        60              1             60           26.1               0.43
        70              1             70           30.4               0.43
        80              1             80           32.7               0.41
        90              1             90           32.7               0.36
       100              1            100           38.3               0.38
       200              1            200           55.1               0.28
       300              1            300           78.6               0.26
       400              1            400          102.3               0.26
       500              1            500          125.9               0.25
       600              1            600          151.3               0.25
       700              1            700          174.8               0.25
       800              1            800          199.7               0.25
       900              1            900          226.6               0.25
      1000              1           1000          249.4               0.25
      2000              1           2000          489.2               0.24
      3000              1           3000          734.8               0.24
      4000              1           4000          969.7               0.24
      5000              1           5000         1206.1               0.24
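The last column of the table above appears to be the batch latency divided by the number of objects (for example, 1206.1 us / 5000 is about 0.24 us). A minimal sketch of that arithmetic follows; the function name `latency_per_object` is illustrative, and the sample rows are copied from the table.

```python
def latency_per_object(latency_us, num_objs):
    """Return the amortized latency per object, in microseconds."""
    if num_objs <= 0:
        raise ValueError("num_objs must be positive")
    return latency_us / num_objs


# (num_objs, batch latency in us) copied from the multiRead_oneMaster table.
ROWS = [(2, 6.5), (10, 11.9), (100, 38.3), (5000, 1206.1)]

if __name__ == "__main__":
    for num_objs, latency_us in ROWS:
        per_obj = latency_per_object(latency_us, num_objs)
        print(f"{num_objs:>6} objs: {per_obj:.2f} us/obj")
```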
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# with one object located on each master.
# Generated by 'clusterperf.py multiRead_oneObjectPerMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.2               5.18
         2              2              1            6.2               3.10
         3              3              1            7.2               2.39
         4              4              1            8.0               2.01
         5              5              1            9.1               1.82
         6              6              1           10.0               1.66
         7              7              1           10.7               1.53
         8              8              1           12.0               1.49
         9              9              1           13.1               1.46
        10             10              1           14.0               1.40
        11             11              1           17.2               1.56
        12             12              1           18.2               1.51
        13             13              1           19.0               1.46
        14             14              1           19.6               1.40
        15             15              1           21.8               1.45
        16             16              1           23.1               1.45
        17             17              1           24.5               1.44
        18             18              1           26.9               1.49
        19             19              1           27.4               1.44
# RAMCloud multi-read throughput of a single server with a
# varying number of clients issuing 80-object multi-reads on
# randomly-chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py multiReadThroughput'
#
# numClients   throughput     worker     dispatch
#             (kreads/sec)    utiliz.     utiliz.
#------------------------------------------------
    1             2453         1.413      0.051
    2             4375         2.606      0.176
    3             4668         2.902      0.225
    4             5042         2.929      0.227
    5             5045         2.938      0.215
    6             5099         2.929      0.221
    7             4864         2.929      0.221
    8             5248         2.928      0.261
    9             5126         2.913      0.266
   10             4333         2.907      0.232
   11             4478         2.910      0.241
   12             4787         2.915      0.260
   13             4490         2.916      0.243
   14             4576         2.905      0.272
# RAMCloud multiWrite performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiWrite_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1           13.6              13.59
         2              1              2           15.8               7.92
         3              1              3           17.1               5.71
         4              1              4           18.7               4.68
         5              1              5           20.4               4.08
         6              1              6           22.0               3.66
         7              1              7           23.2               3.31
         8              1              8           24.9               3.11
         9              1              9           26.1               2.90
        10              1             10           27.8               2.78
        20              1             20           42.8               2.14
        30              1             30           51.8               1.73
        40              1             40           68.9               1.72
        50              1             50          306.5               6.13
        60              1             60          234.0               3.90
        70              1             70           94.1               1.34
        80              1             80          362.4               4.53
        90              1             90          120.6               1.34
       100              1            100          355.8               3.56
       200              1            200          214.4               1.07
       300              1            300         1744.1               5.81
       400              1            400          435.4               1.09
       500              1            500         3480.8               6.96
       600              1            600          590.7               0.98
       700              1            700         8075.3              11.54
       800              1            800          813.7               1.02
       900              1            900         7550.7               8.39
      1000              1           1000         1075.9               1.08
      2000              1           2000        11156.6               5.58
      3000              1           3000         3167.5               1.06
      4000              1           4000        34111.5               8.53
      5000              1           5000         5310.6               1.06
# Cumulative distribution of time for a single client to read a
# random 100-byte object from a single server.  Each line indicates
# that a given fraction of all reads took at most a given time
# to complete.
# Generated by 'clusterperf.py readDist'
#
# Time (usec)  Cum. Fraction
#---------------------------
    0.00       0.000
    4.26       0.000
    4.41       0.010
    4.43       0.020
    4.43       0.030
    4.44       0.040
    4.44       0.050
    4.45       0.060
    4.45       0.070
    4.45       0.080
    4.46       0.090
    4.46       0.100
    4.46       0.110
    4.46       0.120
    4.47       0.130
    4.47       0.140
    4.47       0.150
    4.47       0.160
    4.47       0.170
    4.47       0.180
    4.48       0.190
    4.48       0.200
    4.48       0.210
    4.48       0.220
    4.48       0.230
    4.48       0.240
    4.49       0.250
    4.49       0.260
    4.49       0.270
    4.49       0.280
    4.50       0.290
    4.50       0.300
    4.50       0.310
    4.50       0.320
    4.50       0.330
    4.51       0.340
    4.51       0.350
    4.51       0.360
    4.51       0.370
    4.51       0.380
    4.52       0.390
    4.52       0.400
    4.52       0.410
    4.52       0.420
    4.52       0.430
    4.52       0.440
    4.53       0.450
    4.53       0.460
    4.53       0.470
    4.53       0.480
    4.53       0.490
    4.54       0.500
    4.54       0.510
    4.54       0.520
    4.54       0.530
    4.55       0.540
    4.55       0.550
    4.55       0.560
    4.55       0.570
    4.56       0.580
    4.56       0.590
    4.56       0.600
    4.56       0.610
    4.57       0.620
    4.57       0.630
    4.57       0.640
    4.58       0.650
    4.58       0.660
    4.59       0.670
    4.59       0.680
    4.60       0.690
    4.60       0.700
    4.60       0.710
    4.61       0.720
    4.62       0.730
    4.62       0.740
    4.63       0.750
    4.64       0.760
    4.64       0.770
    4.65       0.780
    4.66       0.790
    4.68       0.800
    4.70       0.810
    4.72       0.820
    4.77       0.830
    5.01       0.840
    5.06       0.850
    5.09       0.860
    5.12       0.870
    5.15       0.880
    5.17       0.890
    5.20       0.900
    5.22       0.910
    5.26       0.920
    5.36       0.930
    5.64       0.940
    5.70       0.950
    5.73       0.960
    5.76       0.970
    5.80       0.980
    5.87       0.990
    8.34       0.999
   99.88       0.9999
  115.64       1.000
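Because each line of the distribution above gives the fraction of reads that completed within a given time, a percentile can be read off by finding the first line whose cumulative fraction reaches the target. The sketch below illustrates this with a handful of points copied from the table; `time_at_fraction` is a hypothetical helper name.

```python
import bisect

# A few (time_us, cumulative_fraction) points copied from the readDist output.
CDF = [
    (4.26, 0.000), (4.46, 0.100), (4.54, 0.500), (5.20, 0.900),
    (5.87, 0.990), (8.34, 0.999), (99.88, 0.9999), (115.64, 1.000),
]


def time_at_fraction(cdf, fraction):
    """Return the first listed time whose cumulative fraction is >= fraction."""
    fractions = [f for _, f in cdf]
    index = bisect.bisect_left(fractions, fraction)
    if index == len(cdf):
        raise ValueError("fraction exceeds the last CDF point")
    return cdf[index][0]


if __name__ == "__main__":
    for target in (0.5, 0.99, 0.999):
        print(f"p{target * 100:g}: {time_at_fraction(CDF, target):.2f} us")
```

With only a subset of the points loaded, a target that falls between two listed fractions resolves to the next listed point, so the result is an upper bound rather than an interpolated value.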
# RAMCloud read throughput of a single server with a varying
# number of clients issuing individual reads on randomly
# chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py readThroughput'
#
# numClients   throughput     worker     dispatch
#             (kreads/sec)    utiliz.     utiliz.
#------------------------------------------------
    1              213         0.154      0.032
    2              404         0.325      0.141
    3              595         0.480      0.330
    4              757         0.640      0.602
    5              873         0.762      0.809
    6              952         0.838      0.928
    7             1005         0.892      0.976
    8             1033         0.917      0.999
    9             1046         0.923      1.000
   10             1061         0.934      1.000
   11             1072         0.944      1.000
   12             1068         0.940      1.000
   13             1065         0.937      1.000
   14             1056         0.929      1.000
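In the table above, throughput levels off at a little over 1000 kreads/sec at about the point where the dispatch utilization column reaches 1.000, which suggests (but does not prove) that the dispatch thread is the limiting resource for small reads. The sketch below locates that point in the data; the helper names and the 0.99 threshold are assumptions chosen for illustration.

```python
# (numClients, kreads/sec, worker utilization, dispatch utilization),
# copied from the readThroughput table.
ROWS = [
    (1, 213, 0.154, 0.032), (2, 404, 0.325, 0.141), (3, 595, 0.480, 0.330),
    (4, 757, 0.640, 0.602), (5, 873, 0.762, 0.809), (6, 952, 0.838, 0.928),
    (7, 1005, 0.892, 0.976), (8, 1033, 0.917, 0.999), (9, 1046, 0.923, 1.000),
    (10, 1061, 0.934, 1.000), (11, 1072, 0.944, 1.000), (12, 1068, 0.940, 1.000),
    (13, 1065, 0.937, 1.000), (14, 1056, 0.929, 1.000),
]


def first_saturated(rows, threshold=0.99):
    """Return the smallest client count whose dispatch utilization >= threshold."""
    for clients, _, _, dispatch in rows:
        if dispatch >= threshold:
            return clients
    return None


def peak_throughput(rows):
    """Return (numClients, kreads/sec) for the row with the highest throughput."""
    clients, kreads, _, _ = max(rows, key=lambda row: row[1])
    return clients, kreads


if __name__ == "__main__":
    print("dispatch saturates at", first_saturated(ROWS), "clients")
    print("peak throughput (clients, kreads/sec):", peak_throughput(ROWS))
```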
# RAMCloud read performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py readVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1              4.3                22.4
           5              4.3                23.3
          10              4.3                24.4
          15              4.3                25.3
          20              4.3                26.4
          25              4.4                27.4
          30              4.3                29.0
          35              4.4                29.1
          40              4.4                30.1
          45              4.4                31.2
          50              4.4                32.2
          55              4.5                33.2
          60              4.5                34.3
          65              4.4                35.4
          70              4.5                36.2
          75              4.5                36.8
          80              4.5                38.0
          85              4.5                38.8
          90              4.5                39.9
          95              4.5                40.9
         100              4.6                41.4
         200              5.4                53.5
         300              5.5                69.1
         400              5.5                87.4
         500              5.6               101.5
         600              5.8               115.6
         700              5.9               129.6
         800              6.0               142.5
         900              6.2               154.1
        1000              6.3               166.0
        2000              7.0               285.2
        3000              7.8               381.3
        4000              8.5               460.3
        5000              9.4               518.1
        6000             10.1               574.0
        7000             10.8               625.9
        8000             11.5               669.6
        9000             12.5               696.5
       10000             14.1               684.0
       20000             22.2               862.1
       30000             30.5               942.6
       40000             41.2               927.6
       50000             49.9               957.2
       60000             58.4               981.0
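The bandwidth column in the table above is numerically consistent with (key length + 100-byte value) divided by latency, expressed in units of 2^20 bytes per second. This is an inference from the printed numbers, not something the output states, and the printed latencies are rounded, so the reconstruction only matches to within rounding error. A minimal sketch, with `bandwidth_mib_per_s` as an illustrative name:

```python
VALUE_BYTES = 100   # object value size stated in the table header
MIB = 1 << 20       # assumption: "MB" in this output means 2^20 bytes


def bandwidth_mib_per_s(key_bytes, latency_us, value_bytes=VALUE_BYTES):
    """Estimate bandwidth as (key + value bytes) / latency, in MiB/s."""
    seconds = latency_us * 1e-6
    return (key_bytes + value_bytes) / seconds / MIB


if __name__ == "__main__":
    # (key length, latency in us, bandwidth printed in the table)
    for key_bytes, latency_us, printed in [(1, 4.3, 22.4), (60000, 58.4, 981.0)]:
        estimate = bandwidth_mib_per_s(key_bytes, latency_us)
        print(f"key={key_bytes}: estimated {estimate:.1f}, printed {printed}")
```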
# RAMCloud write performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py writeVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1             14.2                 6.8
           5             14.1                 7.1
          10             14.2                 7.4
          15             14.2                 7.7
          20             14.3                 8.0
          25             14.3                 8.3
          30             14.3                 8.7
          35             14.4                 9.0
          40             14.4                 9.3
          45             14.9                 9.3
          50             15.0                 9.6
          55             15.0                 9.9
          60             15.1                10.1
          65             14.9                10.6
          70             14.8                10.9
          75             14.9                11.2
          80             14.9                11.6
          85             14.9                11.8
          90             15.1                12.0
          95             15.1                12.3
         100             15.1                12.6
         200             15.7                18.2
         300             16.1                23.6
         400             16.8                28.4
         500             17.1                33.4
         600             17.6                37.9
         700             18.7                40.9
         800             19.7                43.6
         900             20.5                46.6
        1000             20.7                50.6
        2000             24.8                80.7
        3000             29.3               100.8
        4000             33.9               115.3
        5000             38.6               126.0
        6000             43.2               134.6
        7000             47.0               144.2
        8000             48.0               161.1
        9000             52.9               164.0
       10000             57.3               168.0
       20000            109.6               174.8
       30000            158.3               181.4
       40000            211.1               181.2
       50000            262.2               182.3
       60000            312.0               183.7
# RAMCloud index write, overwrite, lookup+readHashes, and IndexLookup class performance with a varying number of objects
# and a B+ tree fanout of 16. All keys are 30 bytes and each object's value is fixed at 100 bytes. Read and overwrite
# latencies are measured by randomly reading and overwriting 1000 objects already inserted into the index at size n.
# Similarly, write latencies are measured by deleting an existing object and then immediately rewriting it.
# All latency measurements are printed as 10th percentile/ median/ 90th percentile.
#
# Generated by 'clusterperf.py indexBasic'
#
#       n       write latency(us)   overwrite latency(us)         hash lookup(us)         lookup+read(us)         IndexLookup(us)   IndexLookup overhead
#--------------------------------------------------------------------------------------------------------------------------------------------------------
        1      27.1/  28.3/  30.5      29.0/  30.0/  31.5       4.8/   4.9/   6.0       9.5/   9.6/  10.8       9.6/   9.7/  10.9     0.14/  0.15/  0.15
       10      29.4/  30.6/  32.3      29.7/  30.7/  32.4       4.8/   4.9/   5.9       9.5/   9.6/  10.8       9.6/   9.7/  10.9     0.13/  0.14/  0.13
      100      29.8/  30.9/  33.0      30.5/  31.6/  37.5       5.2/   5.3/   6.4       9.9/  10.1/  11.3      10.0/  10.2/  11.3     0.08/  0.09/  0.05
     1000      30.3/  31.5/  33.5      30.9/  32.0/  41.4       5.6/   5.8/   6.8      10.4/  10.6/  11.7      10.4/  10.6/  11.7     0.03/ -0.04/  0.00
    10000      30.6/  31.7/  33.6      31.1/  32.1/  40.5       6.0/   6.1/   7.2      10.7/  11.0/  12.1      10.8/  10.9/  12.1     0.04/ -0.04/ -0.02
   100000      31.2/  32.3/  34.0      31.6/  32.7/  37.3       6.4/   6.6/   7.6      11.3/  11.6/  12.7      11.2/  11.4/  12.5    -0.09/ -0.19/ -0.14
  1000000      31.7/  32.8/  34.8      32.1/  33.1/  37.1       6.8/   6.9/   8.0      11.7/  12.0/  13.1      11.6/  11.8/  12.8    -0.16/ -0.29/ -0.29
# RAMCloud write/overwrite performance for random object
# insertion with a varying number of secondary keys. The
# IndexBtree fanout is 16 and the size of the table is fixed
# at 1000 objects, each with a 100 byte value, a 30 byte primary
# key, and an additional 30 bytes per secondary key. After
# filling the table with 1000 objects in a random order, this test
# will overwrite, erase, and re-write 1000 pre-existing objects to
# measure latency. The latency measurements are printed as
# 10th percentile/ median/ 90th percentile
#
# Generated by 'clusterperf.py indexMultiple'
#
# Sec. keys/obj        write latency (us)        overwrite latency (us)
#----------------------------------------------------------------------
              0        12.6/  13.2/  14.4           13.2/  13.7/  14.7
              1        29.7/  30.8/  32.5           30.5/  31.4/  32.8
              2        32.3/  33.5/  35.4           32.8/  33.8/  35.2
              3        32.6/  33.6/  35.4           33.1/  34.1/  35.7
              4        33.5/  34.6/  36.7           34.0/  35.0/  36.4
              5        34.6/  35.9/  38.3           35.0/  36.2/  38.2
              6        37.0/  38.4/  41.5           37.5/  38.9/  41.3
              7        37.0/  38.1/  40.8           37.2/  38.3/  40.5
              8        39.5/  40.9/  44.2           40.1/  41.4/  44.3
              9        39.7/  41.2/  45.0           40.1/  41.5/  44.6
             10        40.8/  42.1/  45.1           40.9/  41.9/  44.5
# RAMCloud transaction performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py transaction_oneMaster'
#
# Num Objs    Num Masters    Objs/Master  WriteObjs/Master  Latency (us)    Latency/Obj (us)
#------------------------------------------------------------------------------------------
         1              1              1              1           19.2              19.19
         2              1              2              2           20.8              10.39
         3              1              3              3           24.1               8.04
         4              1              4              4           27.9               6.97
         5              1              5              5           31.3               6.25
         6              1              6              6           34.6               5.77
         7              1              7              7           37.6               5.38
         8              1              8              8           42.0               5.25
         9              1              9              9           47.0               5.22
        10              1             10             10           51.8               5.18
        20              1             20             20           85.3               4.27
        30              1             30             30          122.3               4.08