clusterperf January 2015

clusterperf output as measured on January 30, 2015 (rc1-rc20)

Recent changes that may have affected performance:

  • Added DispatchExec, which saves 1us on writes
  • Reduced Multi-op scheduler batch size
  • Indexing Benchmarks included

 

basic.read100          4.8 us     read random 100B object (30B key) median
basic.read100.min      4.4 us     read random 100B object (30B key) minimum
basic.read100.9        5.5 us     read random 100B object (30B key) 90%
basic.read100.99       7.0 us     read random 100B object (30B key) 99%
basic.read100.999      8.8 us     read random 100B object (30B key) 99.9%
basic.readBw100       19.1 MB/s   bandwidth reading 100B objects (30B key)
basic.read1K           6.9 us     read random 1KB object (30B key) median
basic.read1K.min       6.0 us     read random 1KB object (30B key) minimum
basic.read1K.9         7.9 us     read random 1KB object (30B key) 90%
basic.read1K.99        8.8 us     read random 1KB object (30B key) 99%
basic.read1K.999      11.7 us     read random 1KB object (30B key) 99.9%
basic.readBw1K       132.5 MB/s   bandwidth reading 1KB objects (30B key)
basic.read10K         10.1 us     read random 10KB object (30B key) median
basic.read10K.min      9.0 us     read random 10KB object (30B key) minimum
basic.read10K.9       11.1 us     read random 10KB object (30B key) 90%
basic.read10K.99      12.2 us     read random 10KB object (30B key) 99%
basic.read10K.999     34.6 us     read random 10KB object (30B key) 99.9%
basic.readBw10K      917.6 MB/s   bandwidth reading 10KB objects (30B key)
basic.read100K        42.9 us     read random 100KB object (30B key) median
basic.read100K.min    36.7 us     read random 100KB object (30B key) minimum
basic.read100K.9      44.2 us     read random 100KB object (30B key) 90%
basic.read100K.99     51.4 us     read random 100KB object (30B key) 99%
basic.read100K.999    92.9 us     read random 100KB object (30B key) 99.9%
basic.readBw100K       2.2 GB/s   bandwidth reading 100KB objects (30B key)
basic.read1M         357.7 us     read random 1MB object (30B key) median
basic.read1M.min     322.6 us     read random 1MB object (30B key) minimum
basic.read1M.9       363.4 us     read random 1MB object (30B key) 90%
basic.read1M.99      366.4 us     read random 1MB object (30B key) 99%
basic.read1M.999     411.0 us     read random 1MB object (30B key) 99.9%
basic.readBw1M         2.6 GB/s   bandwidth reading 1MB objects (30B key)
basic.write100        14.0 us     write random 100B object (30B key) median
basic.write100.min    12.5 us     write random 100B object (30B key) minimum
basic.write100.9      15.1 us     write random 100B object (30B key) 90%
basic.write100.99     22.9 us     write random 100B object (30B key) 99%
basic.write100.999   106.6 us     write random 100B object (30B key) 99.9%
basic.writeBw100       6.5 MB/s   bandwidth writing 100B objects (30B key)
basic.write1K         18.5 us     write random 1KB object (30B key) median
basic.write1K.min     16.7 us     write random 1KB object (30B key) minimum
basic.write1K.9       20.2 us     write random 1KB object (30B key) 90%
basic.write1K.99      84.8 us     write random 1KB object (30B key) 99%
basic.write1K.999    159.0 us     write random 1KB object (30B key) 99.9%
basic.writeBw1K       46.2 MB/s   bandwidth writing 1KB objects (30B key)
basic.write10K        34.6 us     write random 10KB object (30B key) median
basic.write10K.min    31.6 us     write random 10KB object (30B key) minimum
basic.write10K.9      37.2 us     write random 10KB object (30B key) 90%
basic.write10K.99    133.0 us     write random 10KB object (30B key) 99%
basic.write10K.999   273.0 us     write random 10KB object (30B key) 99.9%
basic.writeBw10K     247.9 MB/s   bandwidth writing 10KB objects (30B key)
basic.write100K      227.7 us     write random 100KB object (30B key) median
basic.write100K.min  209.8 us     write random 100KB object (30B key) minimum
basic.write100K.9    261.4 us     write random 100KB object (30B key) 90%
basic.write100K.99   391.6 us     write random 100KB object (30B key) 99%
basic.write100K.999  475.8 us     write random 100KB object (30B key) 99.9%
basic.writeBw100K    401.9 MB/s   bandwidth writing 100KB objects (30B key)
basic.write1M          2.2 ms     write random 1MB object (30B key) median
basic.write1M.min      2.1 ms     write random 1MB object (30B key) minimum
basic.write1M.9        2.3 ms     write random 1MB object (30B key) 90%
basic.write1M.99       2.4 ms     write random 1MB object (30B key) 99%
basic.write1M.999      2.6 ms     write random 1MB object (30B key) 99.9%
basic.writeBw1M      428.1 MB/s   bandwidth writing 1MB objects (30B key)
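
The summary lines above mix units (us, ms, MB/s, GB/s), so comparing them by script needs a normalization step. The following is a minimal sketch, assuming each line has the shape "name value unit description" as shown above; the helper names `parse_metric` and `latency_us` are hypothetical and are not part of clusterperf.py.

```python
import re

# Assumed line shape: "<name> <value> <unit> <description>"
LINE = re.compile(r"^(\S+)\s+([\d.]+)\s+(\S+)\s+(.*)$")

# Conversion factors from a time unit to microseconds
TO_US = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}

def parse_metric(line):
    """Split one summary line into (name, value, unit, description)."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized metric line: %r" % line)
    name, value, unit, desc = m.groups()
    return name, float(value), unit, desc

def latency_us(value, unit):
    """Convert a latency value to microseconds."""
    return value * TO_US[unit]

name, value, unit, desc = parse_metric(
    "basic.write1M          2.2 ms     write random 1MB object (30B key) median")
print(name, latency_us(value, unit), "us")
```

Bandwidth lines (MB/s, GB/s) parse with the same pattern but would need their own conversion table.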

# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiRead_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.4               5.42
         2              1              2            6.8               3.41
         3              1              3            8.0               2.67
         4              1              4            9.4               2.35
         5              1              5            9.9               1.98
         6              1              6           11.3               1.88
         7              1              7           11.5               1.64
         8              1              8           13.0               1.62
         9              1              9           13.0               1.45
        10              1             10           13.5               1.35
        20              1             20           20.0               1.00
        30              1             30           22.0               0.73
        40              1             40           25.9               0.65
        50              1             50           28.9               0.58
        60              1             60           30.2               0.50
        70              1             70           31.8               0.45
        80              1             80           34.1               0.43
        90              1             90           36.1               0.40
       100              1            100           39.2               0.39
       200              1            200           68.3               0.34
       300              1            300           87.2               0.29
       400              1            400          111.1               0.28
       500              1            500          141.9               0.28
       600              1            600          169.8               0.28
       700              1            700          200.9               0.29
       800              1            800          228.1               0.29
       900              1            900          255.7               0.28
      1000              1           1000          271.8               0.27
      2000              1           2000          526.3               0.26
      3000              1           3000          781.7               0.26
      4000              1           4000         1059.6               0.26
      5000              1           5000         1327.9               0.27
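
The last column of the table above appears to be the total multiRead latency divided by the number of objects in the request. The sketch below assumes that relationship and uses a few rows copied from the table; the function name `latency_per_object` is hypothetical. Results can differ from the table in the last digit because the table's latency column is already rounded.

```python
def latency_per_object(total_latency_us, num_objects):
    """Amortized latency per object for one multi-object request."""
    if num_objects <= 0:
        raise ValueError("num_objects must be positive")
    return total_latency_us / num_objects

# (num objects, total latency in us) rows copied from the table above
rows = [(2, 6.8), (10, 13.5), (100, 39.2), (1000, 271.8)]
for n, lat in rows:
    print("%5d objs: %.2f us/obj" % (n, latency_per_object(lat, n)))
```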

 
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# with one object located on each master.
# Generated by 'clusterperf.py multiRead_oneObjectPerMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.6               5.63
         2              2              1            6.6               3.28
         3              3              1            8.0               2.66
         4              4              1            8.7               2.19
         5              5              1            9.8               1.96
         6              6              1           12.8               2.13
         7              7              1           12.5               1.79
         8              8              1           13.4               1.68
         9              9              1           15.2               1.69
        10             10              1           15.7               1.57
        11             11              1           20.1               1.82
        12             12              1           24.0               2.00
        13             13              1           25.0               1.92
        14             14              1           24.9               1.78
        15             15              1           25.7               1.71
        16             16              1           28.0               1.75
        17             17              1           29.5               1.73
        18             18              1           31.5               1.75
        19             19              1           32.9               1.73


# RAMCloud multi-read throughput of a single server with a
# varying number of clients issuing 80-object multi-reads on
# randomly-chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py multiReadThroughput'
#
# numClients   throughput     worker utiliz.
#              (kreads/sec)
#-------------------------------------------
    1             2335           1.283
    2             3854           2.539
    3             4591           2.850
    4             4806           2.845
    5             4789           2.902
    6             4688           2.860
    7             4731           2.902
    8             4446           2.838
    9             4811           2.805
   10             4503           2.819
   11             4228           2.816
   12             4217           2.800
   13             3436           2.457
   14             4453           2.763

 
# RAMCloud multiWrite performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiWrite_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1           14.4              14.45
         2              1              2           18.3               9.13
         3              1              3           20.0               6.65
         4              1              4           21.5               5.38
         5              1              5           22.9               4.59
         6              1              6           26.0               4.34
         7              1              7           28.6               4.08
         8              1              8           28.1               3.51
         9              1              9           31.0               3.44
        10              1             10           30.4               3.04
        20              1             20           46.0               2.30
        30              1             30           57.3               1.91
        40              1             40           75.4               1.88
        50              1             50           77.4               1.55
        60              1             60         1035.3              17.25
        70              1             70          102.5               1.46
        80              1             80          124.0               1.55
        90              1             90          140.6               1.56
       100              1            100          541.1               5.41
       200              1            200          859.5               4.30
       300              1            300          330.8               1.10
       400              1            400         3308.4               8.27
       500              1            500          532.4               1.06
       600              1            600         4463.5               7.44
       700              1            700          822.9               1.18
       800              1            800         4147.5               5.18
       900              1            900         1068.8               1.19
      1000              1           1000         7225.7               7.23
      2000              1           2000         1985.7               0.99
      3000              1           3000        45247.7              15.08
      4000              1           4000         3883.8               0.97
      5000              1           5000        92639.4              18.53

# Cumulative distribution of time for a single client to read a
# random 100-byte object from a single server.  Each line indicates
# that a given fraction of all reads took at most a given time
# to complete.
# Generated by 'clusterperf.py readDist'
#
# Time (usec)  Cum. Fraction
#---------------------------
    0.00       0.000
    4.42       0.000
    4.56       0.010
    4.57       0.020
    4.58       0.030
    4.58       0.040
    4.59       0.050
    4.59       0.060
    4.60       0.070
    4.60       0.080
    4.61       0.090
    4.61       0.100
    4.62       0.110
    4.62       0.120
    4.63       0.130
    4.63       0.140
    4.63       0.150
    4.64       0.160
    4.64       0.170
    4.64       0.180
    4.64       0.190
    4.65       0.200
    4.65       0.210
    4.65       0.220
    4.65       0.230
    4.65       0.240
    4.66       0.250
    4.66       0.260
    4.66       0.270
    4.66       0.280
    4.66       0.290
    4.67       0.300
    4.67       0.310
    4.67       0.320
    4.67       0.330
    4.67       0.340
    4.68       0.350
    4.68       0.360
    4.68       0.370
    4.68       0.380
    4.68       0.390
    4.69       0.400
    4.69       0.410
    4.69       0.420
    4.69       0.430
    4.70       0.440
    4.70       0.450
    4.71       0.460
    4.71       0.470
    4.71       0.480
    4.72       0.490
    4.72       0.500
    4.72       0.510
    4.73       0.520
    4.73       0.530
    4.74       0.540
    4.74       0.550
    4.74       0.560
    4.75       0.570
    4.75       0.580
    4.76       0.590
    4.77       0.600
    4.77       0.610
    4.78       0.620
    4.79       0.630
    4.79       0.640
    4.80       0.650
    4.81       0.660
    4.82       0.670
    4.83       0.680
    4.83       0.690
    4.84       0.700
    4.86       0.710
    4.87       0.720
    4.88       0.730
    4.89       0.740
    4.91       0.750
    4.93       0.760
    4.95       0.770
    4.99       0.780
    5.06       0.790
    5.17       0.800
    5.22       0.810
    5.25       0.820
    5.28       0.830
    5.30       0.840
    5.32       0.850
    5.34       0.860
    5.36       0.870
    5.38       0.880
    5.40       0.890
    5.43       0.900
    5.48       0.910
    5.55       0.920
    5.71       0.930
    5.82       0.940
    5.87       0.950
    5.91       0.960
    5.98       0.970
    6.11       0.980
    7.08       0.990
   10.45       0.999
   84.21       0.9999
  103.29       1.000
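
Each row of the distribution above gives the time within which a given fraction of reads completed, so a percentile can be read off by finding the first row whose cumulative fraction reaches the target. A minimal sketch of that lookup, using a subset of rows copied from the table (the function name `latency_at` is hypothetical):

```python
from bisect import bisect_left

# (time in usec, cumulative fraction) rows copied from the table above
cdf = [(4.42, 0.000), (4.61, 0.100), (4.72, 0.500), (5.43, 0.900),
       (7.08, 0.990), (10.45, 0.999), (84.21, 0.9999), (103.29, 1.000)]

def latency_at(fraction, points):
    """Return the smallest listed time whose cumulative fraction >= fraction."""
    fractions = [f for _, f in points]
    i = bisect_left(fractions, fraction)
    if i == len(points):
        raise ValueError("fraction exceeds the largest value in the table")
    return points[i][0]

print("median:", latency_at(0.5, cdf), "us")
print("99th percentile:", latency_at(0.99, cdf), "us")
```

With only a subset of rows loaded, a target that falls between two listed fractions is rounded up to the next listed row.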

# RAMCloud read throughput of a single server with a varying
# number of clients issuing individual reads on randomly
# chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py readThroughput'
#
# numClients   throughput     worker utiliz.
#              (kreads/sec)
#-------------------------------------------
    1              203           0.162
    2              386           0.342
    3              550           0.526
    4              680           0.654
    5              723           0.670
    6              844           0.845
    7              942           0.959
    8              978           1.020
    9              859           0.794
   10              903           0.844
   11              902           0.832
   12              905           0.841
   13              902           0.832
   14              902           0.836

# RAMCloud read performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py readVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1              4.4                21.8
           5              4.4                22.5
          10              4.5                23.4
          15              4.5                24.4
          20              4.5                25.4
          25              4.5                26.4
          30              4.5                27.8
          35              4.6                28.2
          40              4.6                29.2
          45              4.6                30.2
          50              4.6                31.2
          55              4.6                32.1
          60              4.6                33.1
          65              4.6                34.2
          70              4.6                35.1
          75              4.7                35.8
          80              4.7                36.9
          85              4.7                37.7
          90              4.7                38.7
          95              4.7                39.4
         100              4.7                40.4
         200              5.5                52.0
         300              5.7                66.8
         400              6.1                78.2
         500              6.3                91.4
         600              6.3               105.2
         700              6.5               117.8
         800              6.6               129.6
         900              6.8               141.2
        1000              6.9               152.3
        2000              7.6               264.7
        3000              8.2               359.1
        4000              8.9               438.6
        5000              9.7               501.8
        6000             10.4               561.4
        7000             11.0               616.3
        8000             11.7               660.4
        9000             12.5               694.4
       10000             13.3               723.5
       20000             21.8               880.2
       30000             29.8               962.9
       40000             40.6               942.7
       50000             49.0               975.2
       60000             57.3              1001.0
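
The bandwidth column above looks consistent with dividing the total bytes moved (key plus 100-byte value) by the latency, with "MB" meaning 2^20 bytes. This is inferred from the numbers in the table, not confirmed against the clusterperf source, so treat the sketch below as an assumption; the function name `bandwidth_mb_per_s` is hypothetical.

```python
MB = 1 << 20  # assumed unit: 2**20 bytes per "MB"

def bandwidth_mb_per_s(key_bytes, value_bytes, latency_us):
    """Bytes transferred per second, assuming key and value both count."""
    total_bytes = key_bytes + value_bytes
    return total_bytes / (latency_us * 1e-6) / MB

# (key length, latency in us, reported MB/s) rows copied from the table above
rows = [(1000, 6.9, 152.3), (10000, 13.3, 723.5), (60000, 57.3, 1001.0)]
for key_len, lat, reported in rows:
    est = bandwidth_mb_per_s(key_len, 100, lat)
    print("key %6d: estimated %.1f MB/s, reported %.1f MB/s" % (key_len, est, reported))
```

Small gaps between the estimate and the reported value are expected, since the latency column is rounded to one decimal place.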

# RAMCloud write performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py writeVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1             12.9                 7.5
           5             12.9                 7.8
          10             12.9                 8.1
          15             12.9                 8.5
          20             13.0                 8.8
          25             13.1                 9.1
          30             13.2                 9.4
          35             13.1                 9.8
          40             13.2                10.1
          45             13.2                10.4
          50             13.3                10.8
          55             13.4                11.1
          60             13.4                11.4
          65             13.4                11.7
          70             13.4                12.1
          75             13.4                12.5
          80             13.4                12.8
          85             14.0                12.6
          90             14.0                12.9
          95             14.4                12.9
         100             14.4                13.3
         200             14.9                19.2
         300             15.7                24.3
         400             16.4                29.1
         500             17.1                33.4
         600             17.6                38.0
         700             18.1                42.2
         800             18.9                45.4
         900             19.6                48.6
        1000             20.0                52.4
        2000             23.6                85.0
        3000             27.4               107.9
        4000             31.4               124.7
        5000             35.2               138.1
        6000             39.0               149.3
        7000             42.9               157.7
        8000             47.2               163.8
        9000             51.4               168.8
       10000             55.4               173.8
       20000            104.0               184.3
       30000            151.1               190.0
       40000            198.1               193.1
       50000            238.3               200.5
       60000            283.4               202.2

# RAMCloud index write, overwrite, lookup+readHashes, and IndexLookup class performance with varying number of objects. 1000 samples per operation taken after 1 warmup.
# All keys are 30 bytes and the value of the object is fixed to be 100 bytes.
# Write and overwrite latencies are measured for the 'nth' object insertion where the size of the table is 'n-1'.
# Lookup, readHashes, and IndexLookup latencies are measured by reading a single object when the size of the index is 'n'.
# All latency measurements are printed as 10th percentile/ median/ 90th percentile.
#
# Generated by 'clusterperf.py indexBasic'
#
#       n       write latency(us)   overwrite latency(us)         hash lookup(us)         lookup+read(us)         IndexLookup(us)   IndexLookup overhead
#--------------------------------------------------------------------------------------------------------------------------------------------------------
        1      26.6/  27.6/  33.6      28.5/  29.5/  32.1       4.9/   4.9/   6.1       9.3/   9.4/  10.5       9.6/   9.8/  10.9     0.34/  0.40/  0.39
       10      29.3/  30.5/  47.6      29.7/  30.8/  34.0       5.3/   5.4/   6.6       9.7/   9.8/  11.0      10.0/  10.2/  11.2     0.35/  0.34/  0.16
      100      30.5/  31.7/  36.2      42.2/  43.8/  47.8       6.0/   6.3/   7.7      10.4/  11.2/  12.5      10.8/  11.0/  12.4     0.36/ -0.24/ -0.07
     1000      31.1/  32.4/  43.8      49.3/  50.7/  56.4       6.4/   6.5/   7.7      10.8/  11.5/  12.2      11.2/  11.3/  12.3     0.34/ -0.23/  0.08
    10000      32.5/  34.1/  42.1      55.9/  57.6/  71.9       7.7/   9.0/  10.4      12.1/  13.8/  14.9      12.4/  13.3/  14.8     0.33/ -0.41/ -0.10
   100000      34.1/  35.5/  70.6      62.2/  64.1/  70.8       7.9/   8.4/   9.1      12.3/  13.0/  13.9      12.6/  12.8/  13.3     0.29/ -0.20/ -0.64
  1000000      34.3/  35.7/  74.6      65.6/  67.6/  97.2       8.7/   8.9/  10.2      13.3/  14.0/  15.2      13.5/  14.2/  15.0     0.25/  0.24/ -0.11
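
Each field in the table above packs three values as "10th percentile/ median/ 90th percentile". The "IndexLookup overhead" column appears to be the IndexLookup latency minus the lookup+read latency; that is an inference from the numbers, not a statement from the clusterperf source. A minimal sketch of parsing the fields and recomputing the difference for the n = 1 row (the helper name `parse_triple` is hypothetical):

```python
def parse_triple(field):
    """Parse a 'p10/ median/ p90' field into a tuple of three floats."""
    parts = [float(p) for p in field.split("/")]
    if len(parts) != 3:
        raise ValueError("expected three '/'-separated values: %r" % field)
    return tuple(parts)

# Fields copied from the n = 1 row of the table above
lookup_read = parse_triple("9.3/   9.4/  10.5")
index_lookup = parse_triple("9.6/   9.8/  10.9")

# Assumed definition: overhead = IndexLookup - (lookup + read)
overhead = tuple(round(b - a, 2) for a, b in zip(lookup_read, index_lookup))
print(overhead)
```

The table reports 0.34/0.40/0.39 for this row; recomputing from the one-decimal latencies loses the second digit.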
 
 
# RAMCloud write/overwrite performance for 1000th object insertion with varying number of index keys.
# The size of the table is 999 objects and is constant for this experiment. The latency measurements
# are printed as 10th percentile/ median/ 90th percentile.
# Generated by 'clusterperf.py indexMultiple'
#
# Num secondary keys/obj        write latency (us)        overwrite latency (us)
#---------------------------------------------------------------------------------
                       0        11.6/  12.4/  35.1           12.5/  13.1/  14.4
                       1        31.0/  32.4/  39.8           31.1/  32.2/  34.9
                       2        33.1/  35.4/  71.9           32.8/  34.5/  39.7
                       3        35.4/  38.0/  68.8           35.3/  37.3/  66.9
                       4        35.8/  37.6/  83.0           35.6/  37.0/  62.8
                       5        38.5/  41.1/  98.4           38.5/  40.2/  53.6
                       6        39.3/  41.7/  80.4           38.6/  40.6/  64.0
                       7        39.8/  42.5/  98.4           39.5/  41.5/  74.1
                       8        43.9/  47.7/  98.2           43.4/  46.8/  83.9
                       9        44.2/  46.5/  74.7           43.4/  45.4/  59.3
                      10        47.0/  49.3/  83.0           47.5/  49.2/  73.8