clusterperf September 29, 2014

clusterperf output as measured on September 29, 2014 (rc1-rc20)

Recent changes that may have affected performance:

  • Cache prefetching improved latency for small RPCs
  • Multi-op scheduler rewritten: better throughput with multiple servers
  • InfRc optimized to retrieve multiple completions at once and use zeroCopy for results as well as requests
  • Most clusterperf tests now use random reads (e.g., "basic" used to read a single object repeatedly; it now reads randomly chosen objects)

basic.read100          4.6 us     read random 100B object (30B key) median
basic.read100.min      4.4 us     read random 100B object (30B key) minimum
basic.read100.9        5.4 us     read random 100B object (30B key) 90%
basic.read100.99       7.2 us     read random 100B object (30B key) 99%
basic.read100.999     72.2 us     read random 100B object (30B key) 99.9%
basic.readBw100       18.8 MB/s   bandwidth reading 100B objects (30B key)
basic.read1K           7.0 us     read random 1KB object (30B key) median
basic.read1K.min       6.1 us     read random 1KB object (30B key) minimum
basic.read1K.9         7.7 us     read random 1KB object (30B key) 90%
basic.read1K.99        9.4 us     read random 1KB object (30B key) 99%
basic.read1K.999      72.5 us     read random 1KB object (30B key) 99.9%
basic.readBw1K       130.1 MB/s   bandwidth reading 1KB objects (30B key)
basic.read10K         10.1 us     read random 10KB object (30B key) median
basic.read10K.min      9.0 us     read random 10KB object (30B key) minimum
basic.read10K.9       11.1 us     read random 10KB object (30B key) 90%
basic.read10K.99      12.1 us     read random 10KB object (30B key) 99%
basic.read10K.999     14.6 us     read random 10KB object (30B key) 99.9%
basic.readBw10K      925.1 MB/s   bandwidth reading 10KB objects (30B key)
basic.read100K        42.8 us     read random 100KB object (30B key) median
basic.read100K.min    36.6 us     read random 100KB object (30B key) minimum
basic.read100K.9      44.0 us     read random 100KB object (30B key) 90%
basic.read100K.99     45.3 us     read random 100KB object (30B key) 99%
basic.read100K.999   108.5 us     read random 100KB object (30B key) 99.9%
basic.readBw100K       2.2 GB/s   bandwidth reading 100KB objects (30B key)
basic.read1M         357.7 us     read random 1MB object (30B key) median
basic.read1M.min     320.4 us     read random 1MB object (30B key) minimum
basic.read1M.9       363.8 us     read random 1MB object (30B key) 90%
basic.read1M.99      381.2 us     read random 1MB object (30B key) 99%
basic.read1M.999     454.1 us     read random 1MB object (30B key) 99.9%
basic.readBw1M         2.6 GB/s   bandwidth reading 1MB objects (30B key)
basic.write100        15.0 us     write random 100B object (30B key) median
basic.write100.min    13.6 us     write random 100B object (30B key) minimum
basic.write100.9      16.2 us     write random 100B object (30B key) 90%
basic.write100.99     41.5 us     write random 100B object (30B key) 99%
basic.write100.999   153.9 us     write random 100B object (30B key) 99.9%
basic.writeBw100       5.9 MB/s   bandwidth writing 100B objects (30B key)
basic.write1K         19.4 us     write random 1KB object (30B key) median
basic.write1K.min     17.9 us     write random 1KB object (30B key) minimum
basic.write1K.9       20.8 us     write random 1KB object (30B key) 90%
basic.write1K.99     103.5 us     write random 1KB object (30B key) 99%
basic.write1K.999    175.8 us     write random 1KB object (30B key) 99.9%
basic.writeBw1K       44.1 MB/s   bandwidth writing 1KB objects (30B key)
basic.write10K        35.3 us     write random 10KB object (30B key) median
basic.write10K.min    32.5 us     write random 10KB object (30B key) minimum
basic.write10K.9      37.7 us     write random 10KB object (30B key) 90%
basic.write10K.99    208.5 us     write random 10KB object (30B key) 99%
basic.write10K.999   287.3 us     write random 10KB object (30B key) 99.9%
basic.writeBw10K     233.2 MB/s   bandwidth writing 10KB objects (30B key)
basic.write100K      228.2 us     write random 100KB object (30B key) median
basic.write100K.min  209.3 us     write random 100KB object (30B key) minimum
basic.write100K.9    311.3 us     write random 100KB object (30B key) 90%
basic.write100K.99   425.7 us     write random 100KB object (30B key) 99%
basic.write100K.999  489.2 us     write random 100KB object (30B key) 99.9%
basic.writeBw100K    383.1 MB/s   bandwidth writing 100KB objects (30B key)
basic.write1M          2.2 ms     write random 1MB object (30B key) median
basic.write1M.min      2.1 ms     write random 1MB object (30B key) minimum
basic.write1M.9        2.3 ms     write random 1MB object (30B key) 90%
basic.write1M.99       2.4 ms     write random 1MB object (30B key) 99%
basic.write1M.999      2.7 ms     write random 1MB object (30B key) 99.9%
basic.writeBw1M      431.2 MB/s   bandwidth writing 1MB objects (30B key)
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiRead_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.5               5.46
         2              1              2            6.7               3.34
         3              1              3            7.8               2.59
         4              1              4            8.5               2.11
         5              1              5            9.4               1.89
         6              1              6           10.6               1.77
         7              1              7           11.1               1.59
         8              1              8           11.7               1.47
         9              1              9           12.6               1.40
        10              1             10           13.7               1.37
        20              1             20           20.2               1.01
        30              1             30           27.1               0.90
        40              1             40           33.7               0.84
        50              1             50           40.3               0.81
        60              1             60           42.9               0.71
        70              1             70           49.1               0.70
        80              1             80           55.3               0.69
        90              1             90           54.9               0.61
       100              1            100           56.1               0.56
       200              1            200           75.1               0.38
       300              1            300           94.7               0.32
       400              1            400          112.2               0.28
       500              1            500          129.5               0.26
       600              1            600          149.5               0.25
       700              1            700          168.4               0.24
       800              1            800          188.7               0.24
       900              1            900          217.4               0.24
      1000              1           1000          218.6               0.22
      2000              1           2000          431.2               0.22
      3000              1           3000          648.7               0.22
      4000              1           4000          869.4               0.22
      5000              1           5000         1095.2               0.22
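
The Latency/Obj column in the multiRead tables is the total multiRead latency divided by the number of objects. As a minimal sketch (not part of clusterperf; the helper name `parse_table` and the choice of sample rows are illustrative assumptions), the rows can be parsed and the per-object cost recomputed as follows. The Latency column is rounded to 0.1 us, so recomputed values can differ slightly from the printed ones.

```python
def parse_table(text):
    """Parse whitespace-separated numeric rows, skipping '#' comment lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(field) for field in line.split()])
    return rows

# Three rows copied from the multiRead_oneMaster table above.
sample = """\
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
         1              1              1            5.5               5.46
        10              1             10           13.7               1.37
      1000              1           1000          218.6               0.22
"""

for num_objs, _masters, _per_master, latency_us, printed in parse_table(sample):
    recomputed = latency_us / num_objs
    print(f"{int(num_objs):5d} objs: {recomputed:.2f} us/obj (table: {printed:.2f})")
```

The same parser works on the other '#'-commented tables on this page, since they share the same layout of comment lines followed by numeric rows.
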
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# with one object located on each master.
# Generated by 'clusterperf.py multiRead_oneObjectPerMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.3               5.30
         2              2              1            6.8               3.40
         3              3              1            7.6               2.54
         4              4              1            9.0               2.24
         5              5              1           10.2               2.04
         6              6              1           11.7               1.96
         7              7              1           13.1               1.86
         8              8              1           14.9               1.86
         9              9              1           15.1               1.68
        10             10              1           17.0               1.70
        11             11              1           20.7               1.88
        12             12              1           24.3               2.02
        13             13              1           24.2               1.86
        14             14              1           26.0               1.86
        15             15              1           27.6               1.84
        16             16              1           31.8               1.99
        17             17              1           31.1               1.83
        18             18              1           34.5               1.92
        19             19              1           34.9               1.84
        20             20              1           37.4               1.87
# RAMCloud multi-read throughput of a single server with a
# varying number of clients issuing 70-object multi-reads on
# randomly-chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py multiReadThroughput'
#
# numClients   throughput     worker utiliz.
#              (kreads/sec)
#-------------------------------------------
    1             1475           0.424
    2             2516           0.980
    3             3386           1.531
    4             4270           1.959
    5             4968           2.597
    6             5518           2.834
    7             6189           2.947
    8             6203           2.944
    9             5064           2.802
   10             4914           2.710
   11             5536           2.826
   12             5616           2.825
   13             6040           2.914
   14             5095           2.770
   15             4885           2.708
# RAMCloud multiWrite performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiWrite_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1           16.7              16.69
         2              1              2           19.6               9.80
         3              1              3           22.0               7.32
         4              1              4           23.2               5.81
         5              1              5           24.3               4.86
         6              1              6           26.6               4.43
         7              1              7           28.2               4.03
         8              1              8           30.2               3.78
         9              1              9           32.0               3.56
        10              1             10           32.3               3.23
        20              1             20           51.2               2.56
        30              1             30           62.5               2.08
        40              1             40           79.2               1.98
        50              1             50           94.2               1.88
        60              1             60          219.1               3.65
        70              1             70          122.6               1.75
        80              1             80          136.9               1.71
        90              1             90          143.8               1.60
       100              1            100          141.8               1.42
       200              1            200         3983.5              19.92
       300              1            300         1072.0               3.57
       400              1            400          375.1               0.94
       500              1            500         5551.4              11.10
       600              1            600          558.6               0.93
       700              1            700         4498.6               6.43
       800              1            800          802.7               1.00
       900              1            900         6116.5               6.80
      1000              1           1000          898.4               0.90
      2000              1           2000        28052.8              14.03
      3000              1           3000         6901.3               2.30
      4000              1           4000         9208.7               2.30
      5000              1           5000        12261.8               2.45
# Cumulative distribution of time for a single client to read a
# random 100-byte object from a single server.  Each line indicates
# that a given fraction of all reads took at most a given time
# to complete.
# Generated by 'clusterperf.py readDist'
#
# Time (usec)  Cum. Fraction
#---------------------------
    0.00       0.000
    4.41       0.000
    4.52       0.010
    4.54       0.020
    4.55       0.030
    4.56       0.040
    4.57       0.050
    4.58       0.060
    4.59       0.070
    4.59       0.080
    4.60       0.090
    4.60       0.100
    4.61       0.110
    4.61       0.120
    4.62       0.130
    4.62       0.140
    4.62       0.150
    4.63       0.160
    4.63       0.170
    4.63       0.180
    4.64       0.190
    4.64       0.200
    4.64       0.210
    4.65       0.220
    4.65       0.230
    4.65       0.240
    4.66       0.250
    4.66       0.260
    4.66       0.270
    4.67       0.280
    4.67       0.290
    4.67       0.300
    4.68       0.310
    4.68       0.320
    4.68       0.330
    4.69       0.340
    4.69       0.350
    4.69       0.360
    4.69       0.370
    4.70       0.380
    4.70       0.390
    4.70       0.400
    4.70       0.410
    4.71       0.420
    4.71       0.430
    4.71       0.440
    4.71       0.450
    4.72       0.460
    4.72       0.470
    4.72       0.480
    4.73       0.490
    4.73       0.500
    4.73       0.510
    4.73       0.520
    4.74       0.530
    4.74       0.540
    4.74       0.550
    4.75       0.560
    4.75       0.570
    4.75       0.580
    4.76       0.590
    4.76       0.600
    4.76       0.610
    4.77       0.620
    4.77       0.630
    4.77       0.640
    4.78       0.650
    4.78       0.660
    4.79       0.670
    4.79       0.680
    4.80       0.690
    4.80       0.700
    4.81       0.710
    4.81       0.720
    4.82       0.730
    4.82       0.740
    4.83       0.750
    4.84       0.760
    4.85       0.770
    4.86       0.780
    4.88       0.790
    4.90       0.800
    4.92       0.810
    4.97       0.820
    5.12       0.830
    5.24       0.840
    5.28       0.850
    5.32       0.860
    5.35       0.870
    5.37       0.880
    5.40       0.890
    5.43       0.900
    5.47       0.910
    5.54       0.920
    5.70       0.930
    5.79       0.940
    5.84       0.950
    5.89       0.960
    5.94       0.970
    6.00       0.980
    6.22       0.990
   10.28       0.999
   97.88       0.9999
  112.01       1.000
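
Each row of the distribution gives a time and the fraction of reads that completed within that time, so a percentile can be read off by finding the first row whose cumulative fraction reaches the target. A minimal sketch, assuming the rows have already been parsed into (time, fraction) pairs; `latency_at` is a hypothetical helper, and the sample rows are copied from the table above.

```python
def latency_at(cdf, fraction):
    """Return the smallest listed time whose cumulative fraction >= fraction."""
    for time_us, cum_fraction in cdf:
        if cum_fraction >= fraction:
            return time_us
    raise ValueError("fraction exceeds the largest value in the distribution")

# A few rows copied from the readDist output (time in usec, cumulative fraction).
cdf = [
    (4.41, 0.000),
    (4.73, 0.500),
    (5.43, 0.900),
    (6.22, 0.990),
    (10.28, 0.999),
    (97.88, 0.9999),
    (112.01, 1.000),
]

print(latency_at(cdf, 0.5))    # median
print(latency_at(cdf, 0.99))   # 99th percentile
print(latency_at(cdf, 0.999))  # 99.9th percentile
```

With only a subset of rows loaded, the lookup returns the next listed point at or above the target fraction, so loading the full table gives the finest resolution.
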
# RAMCloud read throughput of a single server with a varying
# number of clients issuing individual reads on randomly
# chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py readThroughput'
#
# numClients   throughput     worker utiliz.
#              (kreads/sec)
#-------------------------------------------
    1              202           0.145
    2              386           0.289
    3              556           0.455
    4              698           0.597
    5              705           0.589
    6              774           0.646
    7              836           0.716
    8              866           0.735
    9              926           0.797
   10              895           0.807
   11              943           0.804
   12              955           0.815
   13              914           0.786
   14              915           0.776
   15              941           0.798
# RAMCloud read performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py readVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1              4.4                21.9
           5              4.4                22.8
          10              4.4                23.7
          15              4.5                24.6
          20              4.4                25.8
          25              4.5                26.7
          30              4.4                28.1
          35              4.5                28.4
          40              4.5                29.4
          45              4.5                30.4
          50              4.6                31.4
          55              4.6                32.3
          60              4.6                33.4
          65              4.6                34.3
          70              4.6                35.4
          75              4.6                36.1
          80              4.6                37.1
          85              4.6                38.0
          90              4.6                39.1
          95              4.7                39.8
         100              4.7                40.8
         200              5.4                52.8
         300              5.6                68.1
         400              6.0                79.2
         500              6.2                92.5
         600              6.3               106.1
         700              6.4               118.4
         800              6.6               130.6
         900              6.7               142.5
        1000              6.8               153.3
        2000              7.5               266.4
        3000              8.2               362.0
        4000              8.8               444.3
        5000              9.6               507.0
        6000             10.3               567.5
        7000             10.9               619.4
        8000             11.6               667.5
        9000             12.4               701.5
       10000             13.1               732.7
       20000             21.4               897.4
       30000             29.4               977.8
       40000             39.7               964.5
       50000             47.9               996.7
       60000             55.9              1024.7
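
The output does not define the Bandwidth column. The printed values appear consistent with (key length + 100-byte value) divided by the latency, expressed in units of 2^20 bytes per second. This is an inference from the numbers above, not something confirmed from the clusterperf source. A rough check under that assumption (`estimated_bandwidth` is a hypothetical helper):

```python
VALUE_BYTES = 100        # object size stated in the table header
MIB = 2 ** 20            # assumed unit behind "MB/s"

def estimated_bandwidth(key_bytes, latency_us):
    """Bytes moved per read (key + value) over latency, in 2**20 bytes/s."""
    return (key_bytes + VALUE_BYTES) / (latency_us * 1e-6) / MIB

# (key length, latency in us, printed bandwidth) copied from the table above.
rows = [(1, 4.4, 21.9), (1000, 6.8, 153.3), (60000, 55.9, 1024.7)]
for key_bytes, latency_us, printed in rows:
    est = estimated_bandwidth(key_bytes, latency_us)
    print(f"key {key_bytes:6d} B: estimated {est:7.1f}, printed {printed:7.1f}")
```

Small differences between the estimated and printed values are expected, because the Latency column is rounded to 0.1 us.
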
# RAMCloud write performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py writeVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1             14.2                 6.8
           5             14.1                 7.1
          10             14.2                 7.4
          15             14.3                 7.7
          20             14.3                 8.0
          25             14.3                 8.3
          30             14.4                 8.6
          35             14.5                 8.9
          40             14.5                 9.2
          45             14.5                 9.5
          50             14.5                 9.9
          55             14.9                 9.9
          60             14.7                10.4
          65             14.6                10.8
          70             14.7                11.0
          75             14.7                11.3
          80             14.7                11.7
          85             15.2                11.6
          90             15.5                11.7
          95             15.6                11.9
         100             15.6                12.3
         200             16.1                17.8
         300             16.9                22.5
         400             17.5                27.2
         500             18.0                31.8
         600             18.4                36.4
         700             18.9                40.4
         800             19.9                43.2
         900             21.1                45.2
        1000             21.5                48.8
        2000             25.0                80.2
        3000             28.7               102.9
        4000             32.5               120.3
        5000             36.1               134.6
        6000             39.9               145.7
        7000             43.7               154.8
        8000             47.5               162.7
        9000             52.1               166.5
       10000             56.3               171.1
       20000            100.6               190.5
       30000            144.2               199.1
       40000            192.0               199.2
       50000            236.9               201.7
       60000            280.4               204.4