clusterperf June 19, 2014

clusterperf output as measured on June 19, 2014 (rc1-rc20)

Recent changes that may have affected performance:

  • Complete rewrite of Buffer; should be significantly faster.
  • ObjectFinder now caches sessions, eliminating calls to TransportManager and an additional hash table lookup there.
  • Object representation has changed to support multiple keys; this probably introduces additional overhead.
  • Basic index operations are now supported, and there are some new tests for those.

basic.read100          5.0 us     read single 100B object with 30B key
basic.readBw100       19.0 MB/s   bandwidth reading 100B object with 30B key
basic.read1K           7.1 us     read single 1KB object with 30B key
basic.readBw1K       135.1 MB/s   bandwidth reading 1KB object with 30B key
basic.read10K         10.5 us     read single 10KB object with 30B key
basic.readBw10K      912.0 MB/s   bandwidth reading 10KB object with 30B key
basic.read100K        47.8 us     read single 100KB object with 30B key
basic.readBw100K       1.9 GB/s   bandwidth reading 100KB object with 30B key
basic.read1M         421.2 us     read single 1MB object with 30B key
basic.readBw1M         2.2 GB/s   bandwidth reading 1MB object with 30B key
basic.write100        15.9 us     write single 100B object with 30B key
basic.writeBw100       6.0 MB/s   bandwidth writing 100B object with 30B key
basic.write1K         20.3 us     write single 1KB object with 30B key
basic.writeBw1K       47.0 MB/s   bandwidth writing 1KB object with 30B key
basic.write10K        37.4 us     write single 10KB object with 30B key
basic.writeBw10K     254.8 MB/s   bandwidth writing 10KB object with 30B key
basic.write100K      229.2 us     write single 100KB object with 30B key
basic.writeBw100K    416.1 MB/s   bandwidth writing 100KB object with 30B key
basic.write1M          2.2 ms     write single 1MB object with 30B key
basic.writeBw1M      431.7 MB/s   bandwidth writing 1MB object with 30B key
broadcast            142.0 us     broadcast message to 9 slaves
readNotFound          13.6 us     read object that doesn't exist
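# The basic.readBw*/writeBw* figures above appear consistent with
# object size divided by the matching latency. The sketch below
# cross-checks three read rows. It assumes "MB" means 2**20 bytes and
# that object sizes are decimal (1KB = 1000 B); neither is stated in
# the output, and bandwidth_mb_per_s is a hypothetical helper, not
# part of clusterperf.

```python
# Assumption: "MB" in the table is 2**20 bytes; sizes are decimal bytes.
MIB = 2 ** 20

def bandwidth_mb_per_s(object_bytes, latency_us):
    """Bandwidth implied by moving object_bytes once every latency_us."""
    bytes_per_second = object_bytes / (latency_us * 1e-6)
    return bytes_per_second / MIB

# (object size in bytes, reported latency in us, reported bandwidth in MB/s)
reported = [
    (100, 5.0, 19.0),      # basic.read100 / basic.readBw100
    (1000, 7.1, 135.1),    # basic.read1K  / basic.readBw1K
    (10000, 10.5, 912.0),  # basic.read10K / basic.readBw10K
]

for size, latency, bw in reported:
    implied = bandwidth_mb_per_s(size, latency)
    print(f"{size:>6} B: implied {implied:7.1f} MB/s, reported {bw:7.1f} MB/s")
```

# Under these assumptions the implied values agree with the reported
# ones to within about one percent; the residual is plausibly due to
# the latencies being rounded to one decimal place.
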
# RAMCloud index write, overwrite, lookup and read performance with varying number of objects.
# All keys are 30 bytes and the value of the object is fixed to be 100 bytes.
# Write and overwrite latencies are measured for the 'nth' object insertion where the size of the
# table is 'n-1'. Lookup and indexedRead latencies are measured when the size of the index is 'n'.
# All latency measurements are printed as 10th percentile / median / 90th percentile.
# Generated by 'clusterperf.py indexBasic'
#
#       n       write latency(us)        overwrite latency(us)       lookup latency(us)       lookup+read latency(us)
#----------------------------------------------------------------------------------------------------------------------
        1      31.6/  32.9/  49.0          33.4/  34.6/  61.2        5.0/   5.1/   5.8         10.0/  10.1/  11.3
       10      33.7/  35.1/  61.2          34.2/  35.8/  72.1        5.6/   5.6/   6.3         10.5/  10.6/  11.8
      100      34.7/  36.2/  46.0          40.0/  41.8/  71.7        6.2/   6.3/   6.9         11.2/  11.3/  12.5
     1000      35.9/  37.0/  46.5          36.7/  37.9/  63.9        7.0/   7.0/   7.6         11.9/  12.1/  13.3
    10000      36.9/  38.4/  48.1          42.1/  43.7/  61.0        7.7/   7.7/   8.4         12.6/  12.7/  13.9
   100000      36.7/  38.0/  46.9          37.4/  38.9/  77.8        8.2/   8.3/   8.9         13.1/  13.2/  14.4
  1000000      37.8/  39.1/  52.9          38.3/  39.8/  65.0        9.2/   9.3/   9.6         14.2/  14.2/  15.4
# RAMCloud write/overwrite performance for 1000th object insertion with varying number of index keys.
# The size of the table is 999 objects and is constant for this experiment. The latency measurements
# are printed as 10th percentile / median / 90th percentile.
# Generated by 'clusterperf.py indexMultiple'
#
# Num secondary keys/obj        write latency (us)        overwrite latency (us)
#---------------------------------------------------------------------------------
                       0        13.7/  14.3/  16.3           14.4/  15.0/  16.9
                       1        36.5/  37.6/  44.6           36.4/  37.9/  47.2
                       2        38.4/  40.1/  77.4           38.4/  40.1/  67.9
                       3        42.0/  43.8/  84.6           42.0/  43.8/  89.8
                       4        44.8/  46.9/  89.1           44.7/  46.4/  79.4
                       5        47.5/  49.8/  98.7           46.9/  49.2/  95.3
                       6        48.2/  51.8/  89.9           47.9/  51.4/  93.0
                       7        49.9/  52.4/ 107.5           50.0/  52.2/  97.1
                       8        52.5/  55.1/  86.3           52.2/  54.8/  88.4
                       9        56.0/  61.2/ 101.3           55.4/  60.7/ 104.7
                      10        54.8/  58.7/ 116.0           53.9/  57.4/ 117.5
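# A minimal sketch for reading the "p10/ median/ p90" triples in the
# indexMultiple table above and estimating how median write latency
# grows with the number of secondary keys. parse_row is a hypothetical
# helper, not part of clusterperf; only three rows are copied here.

```python
def parse_row(line):
    """Return (num_keys, write_triple, overwrite_triple) for one data row.

    Each triple is (p10, median, p90) in microseconds.
    """
    fields = line.replace("/", " ").split()
    values = [float(f) for f in fields[1:]]
    return int(fields[0]), tuple(values[0:3]), tuple(values[3:6])

# Rows copied verbatim from the table (0, 1 and 10 secondary keys).
rows = """
                       0        13.7/  14.3/  16.3           14.4/  15.0/  16.9
                       1        36.5/  37.6/  44.6           36.4/  37.9/  47.2
                      10        54.8/  58.7/ 116.0           53.9/  57.4/ 117.5
"""

median_write = {}
for line in rows.strip().splitlines():
    num_keys, write, _overwrite = parse_row(line)
    median_write[num_keys] = write[1]

# Cost of the first secondary key, and the average cost of keys 2..10.
first_key_cost = median_write[1] - median_write[0]
per_extra_key = (median_write[10] - median_write[1]) / 9

print(f"first secondary key: +{first_key_cost:.1f} us (median)")
print(f"each additional key: +{per_extra_key:.2f} us on average")
```

# This is only an average between the 1-key and 10-key rows; the
# table itself is not strictly monotonic (the 9-key medians exceed
# the 10-key medians).
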
# RAMCloud index scalability when 1 or more clients lookup/read
# 100-byte objects with 30-byte keys chosen at random from
# 1 indexlet.
# Generated by 'clusterperf.py indexScalability'
#
# numClients  throughput(klookups/sec)
#-------------------------------------
  1                  144
  2                  271
  3                  373
  4                  373
  5                  366
  6                  364
  7                  363
  8                  360
  9                  360
 10                  298
# RAMCloud multiRead performance for an approximately fixed number
# of 100 B objects with 30 byte keys
# distributed evenly across varying number of masters.
# Generated by 'clusterperf.py multiRead_general'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
      5000              1           5000         2590.4               0.52
      5000              2           2500         4241.6               0.85
      4998              3           1666         3222.8               0.64
      5000              4           1250         2315.6               0.46
      5000              5           1000         2462.4               0.49
      4998              6            833         1989.2               0.40
      4998              7            714         3210.0               0.64
      5000              8            625         1755.2               0.35
      4995              9            555         2800.8               0.56
      5000             10            500         1825.0               0.37
      4994             11            454         1681.5               0.34
      4992             12            416         1878.5               0.38
      4992             13            384         1966.6               0.39
      4998             14            357         1623.9               0.32
      4995             15            333         2159.7               0.43
      4992             16            312         2451.7               0.49
      4998             17            294         2023.1               0.40
      4986             18            277         1918.3               0.38
      4997             19            263         1929.3               0.39
# RAMCloud multiRead performance for an approximately fixed number
# of 100 B objects with 30 byte keys
# distributed evenly across varying number of masters.
# Requests are issued in a random order.
# Generated by 'clusterperf.py multiRead_generalRandom'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
      5000              1           5000         3649.6               0.73
      5000              2           2500         1653.2               0.33
      4998              3           1666         1531.4               0.31
      5000              4           1250         1502.9               0.30
      5000              5           1000         1580.2               0.32
      4998              6            833         2137.4               0.43
      4998              7            714         1614.1               0.32
      5000              8            625         1757.6               0.35
      4995              9            555         1727.2               0.35
      5000             10            500         1780.2               0.36
      4994             11            454         1824.5               0.37
      4992             12            416         1835.7               0.37
      4992             13            384         2017.5               0.40
      4998             14            357         1854.5               0.37
      4995             15            333         2024.4               0.41
      4992             16            312         2206.4               0.44
      4998             17            294         3338.2               0.67
      4986             18            277         3575.3               0.72
      4997             19            263         3579.4               0.72
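# The "approximately fixed number" of objects in the two
# multiRead_general tables above is consistent with dividing a target
# of 5000 objects evenly across the masters using integer division,
# and Latency/Obj is the total latency divided by Num Objs. The sketch
# below reproduces a few rows under that assumption; the target of
# 5000 and both helper names are inferred from the table, not taken
# from clusterperf itself.

```python
def split_evenly(target_objs, num_masters):
    """Return (num_objs, objs_per_master) when every master gets the same count."""
    objs_per_master = target_objs // num_masters
    return objs_per_master * num_masters, objs_per_master

def latency_per_object(total_latency_us, num_objs):
    """Average latency attributed to each object in one multiRead."""
    return total_latency_us / num_objs

# (number of masters, reported total latency in us) from multiRead_general.
samples = [(1, 2590.4), (3, 3222.8), (19, 1929.3)]

for masters, latency_us in samples:
    num_objs, per_master = split_evenly(5000, masters)
    per_obj = latency_per_object(latency_us, num_objs)
    print(f"{num_objs:5d} objs, {masters:2d} masters, "
          f"{per_master:4d} objs/master, {per_obj:.2f} us/obj")
```
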
# RAMCloud multiWrite performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiWrite_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1           16.7              16.70
         2              1              2           20.0              10.00
         3              1              3           22.1               7.36
         4              1              4           22.7               5.66
         5              1              5           24.6               4.92
         6              1              6           26.5               4.41
         7              1              7           26.8               3.83
         8              1              8           28.6               3.57
         9              1              9           32.0               3.56
        10              1             10           33.9               3.39
        20              1             20           49.3               2.46
        30              1             30           63.0               2.10
        40              1             40           80.2               2.00
        50              1             50           90.5               1.81
        60              1             60          107.0               1.78
        70              1             70          121.7               1.74
        80              1             80          135.1               1.69
        90              1             90          136.6               1.52
       100              1            100          138.3               1.38
       200              1            200          217.8               1.09
       300              1            300          348.2               1.16
       400              1            400          504.5               1.26
       500              1            500          600.5               1.20
       600              1            600          680.6               1.13
       700              1            700          783.2               1.12
       800              1            800          922.0               1.15
       900              1            900         1026.7               1.14
      1000              1           1000         1098.5               1.10
      2000              1           2000         2188.2               1.09
      3000              1           3000         3326.1               1.11
      4000              1           4000         4280.9               1.07
      5000              1           5000         5343.8               1.07
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiRead_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.3               5.28
         2              1              2            6.9               3.43
         3              1              3            7.9               2.64
         4              1              4            8.7               2.18
         5              1              5            9.4               1.88
         6              1              6           10.6               1.77
         7              1              7           11.3               1.61
         8              1              8           11.8               1.48
         9              1              9           12.6               1.40
        10              1             10           13.2               1.32
        20              1             20           18.9               0.95
        30              1             30           25.9               0.86
        40              1             40           31.5               0.79
        50              1             50           37.7               0.75
        60              1             60           43.1               0.72
        70              1             70           49.1               0.70
        80              1             80           54.4               0.68
        90              1             90           55.8               0.62
       100              1            100           58.5               0.59
       200              1            200          101.3               0.51
       300              1            300          170.7               0.57
       400              1            400          245.2               0.61
       500              1            500          298.3               0.60
       600              1            600          346.4               0.58
       700              1            700          418.0               0.60
       800              1            800          478.0               0.60
       900              1            900          533.4               0.59
      1000              1           1000          616.9               0.62
      2000              1           2000         1291.8               0.65
      3000              1           3000         1771.6               0.59
      4000              1           4000         2363.9               0.59
      5000              1           5000         2943.4               0.59
# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# with one object located on each master.
# Generated by 'clusterperf.py multiRead_oneObjectPerMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            5.6               5.58
         2              2              1            6.3               3.15
         3              3              1            7.3               2.43
         4              4              1            8.3               2.08
         5              5              1           11.8               2.36
         6              6              1           10.6               1.77
         7              7              1           11.4               1.62
         8              8              1           12.7               1.59
         9              9              1           14.0               1.55
        10             10              1           15.0               1.50
        11             11              1           19.2               1.75
        12             12              1           23.3               1.94
        13             13              1           20.1               1.54
        14             14              1           20.9               1.49
        15             15              1           25.1               1.67
        16             16              1           30.2               1.89
        17             17              1           30.0               1.77
        18             18              1           35.4               1.96
        19             19              1           29.6               1.56
# Cumulative distribution of time for a single client to read a
# single 100-byte object from a single server.  Each line indicates
# that a given fraction of all reads took at most a given time
# to complete.
# Generated by 'clusterperf.py readDist'
#
# Time (usec)  Cum. Fraction
#---------------------------
    0.00       0.000
    4.56       0.000
    4.60       0.010
    4.60       0.020
    4.61       0.030
    4.61       0.040
    4.62       0.050
    4.62       0.060
    4.62       0.070
    4.63       0.080
    4.63       0.090
    4.64       0.100
    4.64       0.110
    4.64       0.120
    4.65       0.130
    4.65       0.140
    4.65       0.150
    4.66       0.160
    4.67       0.170
    4.67       0.180
    4.68       0.190
    4.68       0.200
    4.68       0.210
    4.68       0.220
    4.68       0.230
    4.69       0.240
    4.69       0.250
    4.69       0.260
    4.69       0.270
    4.69       0.280
    4.69       0.290
    4.69       0.300
    4.69       0.310
    4.69       0.320
    4.69       0.330
    4.69       0.340
    4.69       0.350
    4.70       0.360
    4.70       0.370
    4.70       0.380
    4.70       0.390
    4.70       0.400
    4.70       0.410
    4.70       0.420
    4.70       0.430
    4.70       0.440
    4.70       0.450
    4.70       0.460
    4.70       0.470
    4.70       0.480
    4.70       0.490
    4.70       0.500
    4.71       0.510
    4.71       0.520
    4.71       0.530
    4.71       0.540
    4.71       0.550
    4.71       0.560
    4.71       0.570
    4.71       0.580
    4.71       0.590
    4.71       0.600
    4.71       0.610
    4.71       0.620
    4.71       0.630
    4.71       0.640
    4.72       0.650
    4.72       0.660
    4.72       0.670
    4.72       0.680
    4.72       0.690
    4.72       0.700
    4.72       0.710
    4.73       0.720
    4.73       0.730
    4.73       0.740
    4.74       0.750
    4.75       0.760
    4.76       0.770
    4.77       0.780
    4.78       0.790
    4.79       0.800
    4.81       0.810
    4.83       0.820
    4.85       0.830
    4.87       0.840
    4.91       0.850
    5.16       0.860
    5.35       0.870
    5.37       0.880
    5.38       0.890
    5.39       0.900
    5.40       0.910
    5.50       0.920
    5.85       0.930
    5.90       0.940
    5.91       0.950
    5.92       0.960
    5.94       0.970
    5.99       0.980
    6.41       0.990
   66.84       0.999
  102.11       0.9999
  112.25       1.000
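# A minimal sketch for looking up a latency percentile in the readDist
# table above: find the first row whose cumulative fraction reaches
# the requested value. parse_cdf and latency_at are hypothetical
# helpers; the excerpt holds only a few rows, so only fractions that
# appear in it should be queried here.

```python
def parse_cdf(text):
    """Parse 'time fraction' rows, skipping blank and '#' comment lines."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        time_us, fraction = line.split()
        points.append((float(time_us), float(fraction)))
    return points

def latency_at(points, fraction):
    """Smallest listed time whose cumulative fraction is >= fraction."""
    for time_us, cum_fraction in points:
        if cum_fraction >= fraction:
            return time_us
    raise ValueError("fraction exceeds the largest value in the table")

# Excerpt copied from the table; the full output has one row per 0.01 step.
excerpt = """
# Time (usec)  Cum. Fraction
    4.70       0.500
    5.39       0.900
    6.41       0.990
   66.84       0.999
  102.11       0.9999
  112.25       1.000
"""

points = parse_cdf(excerpt)
for fraction in (0.5, 0.99, 0.999):
    print(f"fraction {fraction}: {latency_at(points, fraction):.2f} us")
```

# Feeding the whole table to parse_cdf allows any listed fraction to
# be queried; the jump from 6.41 us at 0.990 to 66.84 us at 0.999
# shows how much the tail differs from the median.
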
# RAMCloud read performance as a function of load (1 or more
# clients all reading a single 100-byte object with 30-byte key
# repeatedly).
# Generated by 'clusterperf.py readLoaded'
#
# numClients  readLatency(us)  throughput(total kreads/sec)
#----------------------------------------------------------
    1            5.1               195
    2            5.3               377
    3            5.4               553
    4            7.7               523
    5            9.0               557
    6            8.4               711
    7            9.8               715
    8           11.2               714
    9           12.8               702
   10           14.2               704
   11           15.7               699
   12           16.8               714
   13           18.6               700
   14           19.8               707
   15           21.3               703
   16           22.6               706
   17           53.6               317
   18           26.3               684
   19           27.8               683
   20           29.2               685
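# The readLoaded rows above are consistent with a closed-loop model in
# which each client issues its next read only after the previous one
# completes, so total throughput is roughly numClients / readLatency.
# This is an interpretation of the numbers, not a statement about how
# clusterperf computes them; closed_loop_kreads_per_sec is a
# hypothetical helper.

```python
def closed_loop_kreads_per_sec(num_clients, read_latency_us):
    """Throughput if every client always has exactly one read outstanding."""
    reads_per_us = num_clients / read_latency_us
    return reads_per_us * 1000.0  # reads/us -> thousands of reads/sec

# (numClients, readLatency in us, reported kreads/sec) from the table.
samples = [
    (1, 5.1, 195),
    (3, 5.4, 553),
    (10, 14.2, 704),
    (17, 53.6, 317),  # the outlier row still fits the same relation
]

for clients, latency_us, reported in samples:
    predicted = closed_loop_kreads_per_sec(clients, latency_us)
    print(f"{clients:2d} clients: predicted {predicted:6.1f}, reported {reported}")
```

# Under this model the plateau near 700 kreads/sec means that adding
# clients past about 6 mostly adds latency rather than throughput.
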
# RAMCloud read performance when 1 or more clients read
# 100-byte objects with 30-byte keys chosen at random from
# 1 server.
# Generated by 'clusterperf.py readRandom'
#
# numClients  throughput(total kreads/sec)  slowest(ms)  reads > 10us
#--------------------------------------------------------------------
  1                  175                      6.45          0.2%
  2                  378                      0.03          0.1%
  3                  506                      0.05          0.4%
  4                  635                      0.09          0.2%
  5                  679                      0.09          0.6%
  6                  696                      0.09          3.3%
  7                  698                      0.31          69.2%
  8                  713                      0.30          87.0%
  9                  716                      0.11          88.5%
 10                  722                      0.11          91.8%
 11                  824                      0.10          90.4%
 12                  987                      0.12          83.9%
 13                  947                      0.11          88.6%
 14                  929                      0.11          92.6%
 15                  928                      0.11          93.1%
 16                  936                      0.30          93.3%
# RAMCloud read performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py readVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1              4.9                 0.2
           5              4.8                 1.0
          10              4.8                 2.0
          15              4.9                 2.9
          20              4.9                 3.9
          25              4.9                 4.8
          30              5.0                 5.7
          35              5.1                 6.5
          40              5.1                 7.5
          45              5.0                 8.5
          50              5.1                 9.3
          55              4.9                10.7
          60              4.9                11.7
          65              4.9                12.6
          70              5.0                13.3
          75              5.2                13.8
          80              5.1                14.9
          85              5.2                15.6
          90              4.9                17.3
          95              5.1                17.7
         100              5.2                18.4
         200              6.0                31.7
         300              6.3                45.7
         400              6.5                58.6
         500              6.8                69.7
         600              7.0                81.2
         700              7.1                93.8
         800              7.2               106.0
         900              7.1               121.7
        1000              7.2               132.4
        2000              8.0               238.2
        3000              8.8               326.6
        4000              9.3               410.8
        5000             10.0               475.4
        6000             10.6               540.0
        7000             11.3               589.1
        8000             11.9               640.2
        9000             12.8               672.0
       10000             13.5               706.3
       20000             21.4               890.5
       30000             29.2               981.3
       40000             39.7               961.8
       50000             49.5               962.9
       60000             57.4               996.5
# Gauges impact of asynchronous writes on synchronous writes.
# Write two values. The size of the first varies over trials
# (its size is given as 'firstObjectSize'). The first write is
# either synchronous (if firstWriteIsSync is 1) or asynchronous
# (if firstWriteIsSync is 0). The response time of the first
# write is given by 'firstWriteLatency'. The second write is
# a 100 B object which is always written synchronously (its
# response time is given by 'syncWriteLatency').
# Both writes use a 30 B key.
# Generated by 'clusterperf.py writeAsyncSync'
#
# firstWriteIsSync firstObjectSize firstWriteLatency(us) syncWriteLatency(us)
#----------------------------------------------------------------------------
                 0             100                  16.3                 16.4
                 0            1000                  20.3                 16.3
                 0           10000                  38.9                 15.5
                 0          100000                 230.9                 18.4
                 0         1000000                2133.7                 23.7
                 1             100                  15.9                 15.1
                 1            1000                  19.1                 15.1
                 1           10000                  36.6                 15.4
                 1          100000                 224.9                 19.0
                 1         1000000                2149.8                 24.0
# RAMCloud write performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py writeVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1             16.1                 0.1
           5             15.8                 0.3
          10             15.6                 0.6
          15             15.3                 0.9
          20             15.4                 1.2
          25             15.5                 1.5
          30             16.3                 1.8
          35             16.4                 2.0
          40             16.5                 2.3
          45             16.7                 2.6
          50             16.2                 2.9
          55             16.0                 3.3
          60             15.6                 3.7
          65             15.7                 4.0
          70             15.4                 4.3
          75             16.1                 4.4
          80             15.6                 4.9
          85             16.0                 5.1
          90             16.4                 5.2
          95             16.8                 5.4
         100             16.5                 5.8
         200             17.5                10.9
         300             18.4                15.6
         400             19.1                20.0
         500             19.8                24.1
         600             19.9                28.7
         700             21.3                31.3
         800             21.9                34.8
         900             22.6                37.9
        1000             24.5                38.9
        2000             29.3                65.2
        3000             33.3                85.9
        4000             37.8               101.0
        5000             42.9               111.1
        6000             46.4               123.4
        7000             50.0               133.5
        8000             52.3               145.9
        9000             56.8               151.2
       10000             61.3               155.6
       20000            107.7               177.1
       30000            155.9               183.5
       40000            202.4               188.5
       50000            252.4               188.9
       60000            294.2               194.5