clusterperf January 2016 infrc
clusterperf output as measured on January 6, 2016 (rc1-rc20, infrc transport)
Recent changes that may have affected performance:
- Added support for linearizability, which affects the critical path of some RPCs
- Numerous other small changes over the last year (e.g. new RPC level mechanism)
basic.read100            4.6 us  read random 100B object (30B key) median
basic.read100.min        4.2 us  read random 100B object (30B key) minimum
basic.read100.9          5.3 us  read random 100B object (30B key) 90%
basic.read100.99         5.9 us  read random 100B object (30B key) 99%
basic.read100.999        8.2 us  read random 100B object (30B key) 99.9%
basic.readBw100       19.8 MB/s  bandwidth reading 100B objects (30B key)
basic.read1K             6.3 us  read random 1KB object (30B key) median
basic.read1K.min         5.4 us  read random 1KB object (30B key) minimum
basic.read1K.9           7.0 us  read random 1KB object (30B key) 90%
basic.read1K.99          7.7 us  read random 1KB object (30B key) 99%
basic.read1K.999        10.0 us  read random 1KB object (30B key) 99.9%
basic.readBw1K       146.8 MB/s  bandwidth reading 1KB objects (30B key)
basic.read10K            9.4 us  read random 10KB object (30B key) median
basic.read10K.min        8.3 us  read random 10KB object (30B key) minimum
basic.read10K.9         10.4 us  read random 10KB object (30B key) 90%
basic.read10K.99        11.3 us  read random 10KB object (30B key) 99%
basic.read10K.999       13.1 us  read random 10KB object (30B key) 99.9%
basic.readBw10K      990.1 MB/s  bandwidth reading 10KB objects (30B key)
basic.read100K          42.1 us  read random 100KB object (30B key) median
basic.read100K.min      35.7 us  read random 100KB object (30B key) minimum
basic.read100K.9        43.3 us  read random 100KB object (30B key) 90%
basic.read100K.99       44.4 us  read random 100KB object (30B key) 99%
basic.read100K.999      47.3 us  read random 100KB object (30B key) 99.9%
basic.readBw100K       2.2 GB/s  bandwidth reading 100KB objects (30B key)
basic.read1M           357.4 us  read random 1MB object (30B key) median
basic.read1M.min       324.0 us  read random 1MB object (30B key) minimum
basic.read1M.9         363.1 us  read random 1MB object (30B key) 90%
basic.read1M.99        365.7 us  read random 1MB object (30B key) 99%
basic.read1M.999       419.5 us  read random 1MB object (30B key) 99.9%
basic.readBw1M         2.6 GB/s  bandwidth reading 1MB objects (30B key)
basic.write100          14.0 us  write random 100B object (30B key) median
basic.write100.min      12.8 us  write random 100B object (30B key) minimum
basic.write100.9        15.1 us  write random 100B object (30B key) 90%
basic.write100.99       17.7 us  write random 100B object (30B key) 99%
basic.write100.999     101.7 us  write random 100B object (30B key) 99.9%
basic.writeBw100       6.6 MB/s  bandwidth writing 100B objects (30B key)
basic.write1K           16.8 us  write random 1KB object (30B key) median
basic.write1K.min       15.4 us  write random 1KB object (30B key) minimum
basic.write1K.9         17.9 us  write random 1KB object (30B key) 90%
basic.write1K.99        21.0 us  write random 1KB object (30B key) 99%
basic.write1K.999      105.2 us  write random 1KB object (30B key) 99.9%
basic.writeBw1K       55.4 MB/s  bandwidth writing 1KB objects (30B key)
basic.write10K          34.1 us  write random 10KB object (30B key) median
basic.write10K.min      31.7 us  write random 10KB object (30B key) minimum
basic.write10K.9        35.9 us  write random 10KB object (30B key) 90%
basic.write10K.99       40.5 us  write random 10KB object (30B key) 99%
basic.write10K.999     130.9 us  write random 10KB object (30B key) 99.9%
basic.writeBw10K     216.2 MB/s  bandwidth writing 10KB objects (30B key)
basic.write100K        222.9 us  write random 100KB object (30B key) median
basic.write100K.min    204.7 us  write random 100KB object (30B key) minimum
basic.write100K.9      230.0 us  write random 100KB object (30B key) 90%
basic.write100K.99     323.1 us  write random 100KB object (30B key) 99%
basic.write100K.999     25.4 ms  write random 100KB object (30B key) 99.9%
basic.writeBw100K    220.4 MB/s  bandwidth writing 100KB objects (30B key)
basic.write1M            2.1 ms  write random 1MB object (30B key) median
basic.write1M.min        2.0 ms  write random 1MB object (30B key) minimum
basic.write1M.9          2.4 ms  write random 1MB object (30B key) 90%
basic.write1M.99        27.8 ms  write random 1MB object (30B key) 99%
basic.writeBw1M      215.9 MB/s  bandwidth writing 1MB objects (30B key)

# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiRead_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1             5.2                5.17
         2              1              2             6.5                3.25
         3              1              3             7.0                2.33
         4              1              4             8.2                2.06
         5              1              5             8.9                1.77
         6              1              6             9.8                1.64
         7              1              7            10.5                1.50
         8              1              8            10.9                1.36
         9              1              9            11.4                1.26
        10              1             10            11.9                1.19
        20              1             20            17.9                0.90
        30              1             30            19.3                0.64
        40              1             40            22.7                0.57
        50              1             50            23.2                0.46
        60              1             60            26.1                0.43
        70              1             70            30.4                0.43
        80              1             80            32.7                0.41
        90              1             90            32.7                0.36
       100              1            100            38.3                0.38
       200              1            200            55.1                0.28
       300              1            300            78.6                0.26
       400              1            400           102.3                0.26
       500              1            500           125.9                0.25
       600              1            600           151.3                0.25
       700              1            700           174.8                0.25
       800              1            800           199.7                0.25
       900              1            900           226.6                0.25
      1000              1           1000           249.4                0.25
      2000              1           2000           489.2                0.24
      3000              1           3000           734.8                0.24
      4000              1           4000           969.7                0.24
      5000              1           5000          1206.1                0.24

# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# with one object located on each master.
# Generated by 'clusterperf.py multiRead_oneObjectPerMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1             5.2                5.18
         2              2              1             6.2                3.10
         3              3              1             7.2                2.39
         4              4              1             8.0                2.01
         5              5              1             9.1                1.82
         6              6              1            10.0                1.66
         7              7              1            10.7                1.53
         8              8              1            12.0                1.49
         9              9              1            13.1                1.46
        10             10              1            14.0                1.40
        11             11              1            17.2                1.56
        12             12              1            18.2                1.51
        13             13              1            19.0                1.46
        14             14              1            19.6                1.40
        15             15              1            21.8                1.45
        16             16              1            23.1                1.45
        17             17              1            24.5                1.44
        18             18              1            26.9                1.49
        19             19              1            27.4                1.44

# RAMCloud multi-read throughput of a single server with a
# varying number of clients issuing 80-object multi-reads on
# randomly-chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py multiReadThroughput'
#
# numClients  throughput     worker   dispatch
#             (kreads/sec)   utiliz.  utiliz.
#------------------------------------------------
           1        2453      1.413      0.051
           2        4375      2.606      0.176
           3        4668      2.902      0.225
           4        5042      2.929      0.227
           5        5045      2.938      0.215
           6        5099      2.929      0.221
           7        4864      2.929      0.221
           8        5248      2.928      0.261
           9        5126      2.913      0.266
          10        4333      2.907      0.232
          11        4478      2.910      0.241
          12        4787      2.915      0.260
          13        4490      2.916      0.243
          14        4576      2.905      0.272

# RAMCloud multiWrite performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiWrite_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            13.6               13.59
         2              1              2            15.8                7.92
         3              1              3            17.1                5.71
         4              1              4            18.7                4.68
         5              1              5            20.4                4.08
         6              1              6            22.0                3.66
         7              1              7            23.2                3.31
         8              1              8            24.9                3.11
         9              1              9            26.1                2.90
        10              1             10            27.8                2.78
        20              1             20            42.8                2.14
        30              1             30            51.8                1.73
        40              1             40            68.9                1.72
        50              1             50           306.5                6.13
        60              1             60           234.0                3.90
        70              1             70            94.1                1.34
        80              1             80           362.4                4.53
        90              1             90           120.6                1.34
       100              1            100           355.8                3.56
       200              1            200           214.4                1.07
       300              1            300          1744.1                5.81
       400              1            400           435.4                1.09
       500              1            500          3480.8                6.96
       600              1            600           590.7                0.98
       700              1            700          8075.3               11.54
       800              1            800           813.7                1.02
       900              1            900          7550.7                8.39
      1000              1           1000          1075.9                1.08
      2000              1           2000         11156.6                5.58
      3000              1           3000          3167.5                1.06
      4000              1           4000         34111.5                8.53
      5000              1           5000          5310.6                1.06

# Cumulative distribution of time for a single client to read a
# random 100-byte object from a single server. Each line indicates
# that a given fraction of all reads took at most a given time
# to complete.
# Generated by 'clusterperf.py readDist'
#
# Time (usec)  Cum. Fraction
#---------------------------
         0.00          0.000
         4.26          0.000
         4.41          0.010
         4.43          0.020
         4.43          0.030
         4.44          0.040
         4.44          0.050
         4.45          0.060
         4.45          0.070
         4.45          0.080
         4.46          0.090
         4.46          0.100
         4.46          0.110
         4.46          0.120
         4.47          0.130
         4.47          0.140
         4.47          0.150
         4.47          0.160
         4.47          0.170
         4.47          0.180
         4.48          0.190
         4.48          0.200
         4.48          0.210
         4.48          0.220
         4.48          0.230
         4.48          0.240
         4.49          0.250
         4.49          0.260
         4.49          0.270
         4.49          0.280
         4.50          0.290
         4.50          0.300
         4.50          0.310
         4.50          0.320
         4.50          0.330
         4.51          0.340
         4.51          0.350
         4.51          0.360
         4.51          0.370
         4.51          0.380
         4.52          0.390
         4.52          0.400
         4.52          0.410
         4.52          0.420
         4.52          0.430
         4.52          0.440
         4.53          0.450
         4.53          0.460
         4.53          0.470
         4.53          0.480
         4.53          0.490
         4.54          0.500
         4.54          0.510
         4.54          0.520
         4.54          0.530
         4.55          0.540
         4.55          0.550
         4.55          0.560
         4.55          0.570
         4.56          0.580
         4.56          0.590
         4.56          0.600
         4.56          0.610
         4.57          0.620
         4.57          0.630
         4.57          0.640
         4.58          0.650
         4.58          0.660
         4.59          0.670
         4.59          0.680
         4.60          0.690
         4.60          0.700
         4.60          0.710
         4.61          0.720
         4.62          0.730
         4.62          0.740
         4.63          0.750
         4.64          0.760
         4.64          0.770
         4.65          0.780
         4.66          0.790
         4.68          0.800
         4.70          0.810
         4.72          0.820
         4.77          0.830
         5.01          0.840
         5.06          0.850
         5.09          0.860
         5.12          0.870
         5.15          0.880
         5.17          0.890
         5.20          0.900
         5.22          0.910
         5.26          0.920
         5.36          0.930
         5.64          0.940
         5.70          0.950
         5.73          0.960
         5.76          0.970
         5.80          0.980
         5.87          0.990
         8.34          0.999
        99.88         0.9999
       115.64          1.000

# RAMCloud read throughput of a single server with a varying
# number of clients issuing individual reads on randomly
# chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py readThroughput'
#
# numClients  throughput     worker   dispatch
#             (kreads/sec)   utiliz.  utiliz.
#------------------------------------------------
           1         213      0.154      0.032
           2         404      0.325      0.141
           3         595      0.480      0.330
           4         757      0.640      0.602
           5         873      0.762      0.809
           6         952      0.838      0.928
           7        1005      0.892      0.976
           8        1033      0.917      0.999
           9        1046      0.923      1.000
          10        1061      0.934      1.000
          11        1072      0.944      1.000
          12        1068      0.940      1.000
          13        1065      0.937      1.000
          14        1056      0.929      1.000

# RAMCloud read performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py readVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1               4.3                 22.4
           5               4.3                 23.3
          10               4.3                 24.4
          15               4.3                 25.3
          20               4.3                 26.4
          25               4.4                 27.4
          30               4.3                 29.0
          35               4.4                 29.1
          40               4.4                 30.1
          45               4.4                 31.2
          50               4.4                 32.2
          55               4.5                 33.2
          60               4.5                 34.3
          65               4.4                 35.4
          70               4.5                 36.2
          75               4.5                 36.8
          80               4.5                 38.0
          85               4.5                 38.8
          90               4.5                 39.9
          95               4.5                 40.9
         100               4.6                 41.4
         200               5.4                 53.5
         300               5.5                 69.1
         400               5.5                 87.4
         500               5.6                101.5
         600               5.8                115.6
         700               5.9                129.6
         800               6.0                142.5
         900               6.2                154.1
        1000               6.3                166.0
        2000               7.0                285.2
        3000               7.8                381.3
        4000               8.5                460.3
        5000               9.4                518.1
        6000              10.1                574.0
        7000              10.8                625.9
        8000              11.5                669.6
        9000              12.5                696.5
       10000              14.1                684.0
       20000              22.2                862.1
       30000              30.5                942.6
       40000              41.2                927.6
       50000              49.9                957.2
       60000              58.4                981.0

# RAMCloud write performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py writeVaryingKeyLength'
#
# Key Length      Latency (us)     Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1              14.2                  6.8
           5              14.1                  7.1
          10              14.2                  7.4
          15              14.2                  7.7
          20              14.3                  8.0
          25              14.3                  8.3
          30              14.3                  8.7
          35              14.4                  9.0
          40              14.4                  9.3
          45              14.9                  9.3
          50              15.0                  9.6
          55              15.0                  9.9
          60              15.1                 10.1
          65              14.9                 10.6
          70              14.8                 10.9
          75              14.9                 11.2
          80              14.9                 11.6
          85              14.9                 11.8
          90              15.1                 12.0
          95              15.1                 12.3
         100              15.1                 12.6
         200              15.7                 18.2
         300              16.1                 23.6
         400              16.8                 28.4
         500              17.1                 33.4
         600              17.6                 37.9
         700              18.7                 40.9
         800              19.7                 43.6
         900              20.5                 46.6
        1000              20.7                 50.6
        2000              24.8                 80.7
        3000              29.3                100.8
        4000              33.9                115.3
        5000              38.6                126.0
        6000              43.2                134.6
        7000              47.0                144.2
        8000              48.0                161.1
        9000              52.9                164.0
       10000              57.3                168.0
       20000             109.6                174.8
       30000             158.3                181.4
       40000             211.1                181.2
       50000             262.2                182.3
       60000             312.0                183.7

# RAMCloud index write, overwrite, lookup+readHashes, and IndexLookup class performance with a varying number of objects
# and a B+ tree fanout of 16. All keys are 30 bytes and the value of the object is fixed to be 100 bytes. Read and Overwrite
# latencies are measured by randomly reading and overwriting 1000 objects already inserted into the index at size n.
# In a similar fashion, write latencies are measured by deleting an existing object and then immediately re-writing.
# All latency measurements are printed as 10th percentile/ median/ 90th percentile.
#
# Generated by 'clusterperf.py indexBasic'
#
#       n       write latency(us)   overwrite latency(us)         hash lookup(us)         lookup+read(us)         IndexLookup(us)    IndexLookup overhead
#--------------------------------------------------------------------------------------------------------------------------------------------------------
        1      27.1/  28.3/  30.5      29.0/  30.0/  31.5       4.8/   4.9/   6.0       9.5/   9.6/  10.8       9.6/   9.7/  10.9      0.14/  0.15/  0.15
       10      29.4/  30.6/  32.3      29.7/  30.7/  32.4       4.8/   4.9/   5.9       9.5/   9.6/  10.8       9.6/   9.7/  10.9      0.13/  0.14/  0.13
      100      29.8/  30.9/  33.0      30.5/  31.6/  37.5       5.2/   5.3/   6.4       9.9/  10.1/  11.3      10.0/  10.2/  11.3      0.08/  0.09/  0.05
     1000      30.3/  31.5/  33.5      30.9/  32.0/  41.4       5.6/   5.8/   6.8      10.4/  10.6/  11.7      10.4/  10.6/  11.7      0.03/ -0.04/  0.00
    10000      30.6/  31.7/  33.6      31.1/  32.1/  40.5       6.0/   6.1/   7.2      10.7/  11.0/  12.1      10.8/  10.9/  12.1      0.04/ -0.04/ -0.02
   100000      31.2/  32.3/  34.0      31.6/  32.7/  37.3       6.4/   6.6/   7.6      11.3/  11.6/  12.7      11.2/  11.4/  12.5     -0.09/ -0.19/ -0.14
  1000000      31.7/  32.8/  34.8      32.1/  33.1/  37.1       6.8/   6.9/   8.0      11.7/  12.0/  13.1      11.6/  11.8/  12.8     -0.16/ -0.29/ -0.29

# RAMCloud write/overwrite performance for random object
# insertion with a varying number of secondary keys. The
# IndexBtree fanout is 16 and the size of the table is fixed
# at 1000 objects, each with a 100 byte value, a 30 byte primary
# key, and an additional 30 bytes per secondary key. After
# filling the table with 1000 objects in a random order, this test
# will overwrite, erase, and re-write 1000 pre-existing objects to
# measure latency. The latency measurements are printed as
# 10th percentile/ median/ 90th percentile
#
# Generated by 'clusterperf.py indexMultiple'
#
# Sec. keys/obj      write latency (us)  overwrite latency (us)
#----------------------------------------------------------------------
              0      12.6/  13.2/  14.4      13.2/  13.7/  14.7
              1      29.7/  30.8/  32.5      30.5/  31.4/  32.8
              2      32.3/  33.5/  35.4      32.8/  33.8/  35.2
              3      32.6/  33.6/  35.4      33.1/  34.1/  35.7
              4      33.5/  34.6/  36.7      34.0/  35.0/  36.4
              5      34.6/  35.9/  38.3      35.0/  36.2/  38.2
              6      37.0/  38.4/  41.5      37.5/  38.9/  41.3
              7      37.0/  38.1/  40.8      37.2/  38.3/  40.5
              8      39.5/  40.9/  44.2      40.1/  41.4/  44.3
              9      39.7/  41.2/  45.0      40.1/  41.5/  44.6
             10      40.8/  42.1/  45.1      40.9/  41.9/  44.5

# RAMCloud transaction performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py transaction_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    WriteObjs/Master    Latency (us)    Latency/Obj (us)
#------------------------------------------------------------------------------------------
         1              1              1                   1            19.2               19.19
         2              1              2                   2            20.8               10.39
         3              1              3                   3            24.1                8.04
         4              1              4                   4            27.9                6.97
         5              1              5                   5            31.3                6.25
         6              1              6                   6            34.6                5.77
         7              1              7                   7            37.6                5.38
         8              1              8                   8            42.0                5.25
         9              1              9                   9            47.0                5.22
        10              1             10                  10            51.8                5.18
        20              1             20                  20            85.3                4.27
        30              1             30                  30           122.3                4.08