clusterperf September 29, 2014
clusterperf output as measured on September 29, 2014 (rc1-rc20)
Recent changes that may have affected performance:
- Cache prefetching improved latency for small RPCs
- Multi-op scheduler rewritten: better throughput with multiple servers
- InfRc optimized to retrieve multiple completions at once and use zeroCopy for results as well as requests
- Most clusterperf tests now use random reads (e.g., "basic" used to read a single object; it now reads random objects)
basic.read100         4.6 us       read random 100B object (30B key) median
basic.read100.min     4.4 us       read random 100B object (30B key) minimum
basic.read100.9       5.4 us       read random 100B object (30B key) 90%
basic.read100.99      7.2 us       read random 100B object (30B key) 99%
basic.read100.999     72.2 us      read random 100B object (30B key) 99.9%
basic.readBw100       18.8 MB/s    bandwidth reading 100B objects (30B key)
basic.read1K          7.0 us       read random 1KB object (30B key) median
basic.read1K.min      6.1 us       read random 1KB object (30B key) minimum
basic.read1K.9        7.7 us       read random 1KB object (30B key) 90%
basic.read1K.99       9.4 us       read random 1KB object (30B key) 99%
basic.read1K.999      72.5 us      read random 1KB object (30B key) 99.9%
basic.readBw1K        130.1 MB/s   bandwidth reading 1KB objects (30B key)
basic.read10K         10.1 us      read random 10KB object (30B key) median
basic.read10K.min     9.0 us       read random 10KB object (30B key) minimum
basic.read10K.9       11.1 us      read random 10KB object (30B key) 90%
basic.read10K.99      12.1 us      read random 10KB object (30B key) 99%
basic.read10K.999     14.6 us      read random 10KB object (30B key) 99.9%
basic.readBw10K       925.1 MB/s   bandwidth reading 10KB objects (30B key)
basic.read100K        42.8 us      read random 100KB object (30B key) median
basic.read100K.min    36.6 us      read random 100KB object (30B key) minimum
basic.read100K.9      44.0 us      read random 100KB object (30B key) 90%
basic.read100K.99     45.3 us      read random 100KB object (30B key) 99%
basic.read100K.999    108.5 us     read random 100KB object (30B key) 99.9%
basic.readBw100K      2.2 GB/s     bandwidth reading 100KB objects (30B key)
basic.read1M          357.7 us     read random 1MB object (30B key) median
basic.read1M.min      320.4 us     read random 1MB object (30B key) minimum
basic.read1M.9        363.8 us     read random 1MB object (30B key) 90%
basic.read1M.99       381.2 us     read random 1MB object (30B key) 99%
basic.read1M.999      454.1 us     read random 1MB object (30B key) 99.9%
basic.readBw1M        2.6 GB/s     bandwidth reading 1MB objects (30B key)
basic.write100        15.0 us      write random 100B object (30B key) median
basic.write100.min    13.6 us      write random 100B object (30B key) minimum
basic.write100.9      16.2 us      write random 100B object (30B key) 90%
basic.write100.99     41.5 us      write random 100B object (30B key) 99%
basic.write100.999    153.9 us     write random 100B object (30B key) 99.9%
basic.writeBw100      5.9 MB/s     bandwidth writing 100B objects (30B key)
basic.write1K         19.4 us      write random 1KB object (30B key) median
basic.write1K.min     17.9 us      write random 1KB object (30B key) minimum
basic.write1K.9       20.8 us      write random 1KB object (30B key) 90%
basic.write1K.99      103.5 us     write random 1KB object (30B key) 99%
basic.write1K.999     175.8 us     write random 1KB object (30B key) 99.9%
basic.writeBw1K       44.1 MB/s    bandwidth writing 1KB objects (30B key)
basic.write10K        35.3 us      write random 10KB object (30B key) median
basic.write10K.min    32.5 us      write random 10KB object (30B key) minimum
basic.write10K.9      37.7 us      write random 10KB object (30B key) 90%
basic.write10K.99     208.5 us     write random 10KB object (30B key) 99%
basic.write10K.999    287.3 us     write random 10KB object (30B key) 99.9%
basic.writeBw10K      233.2 MB/s   bandwidth writing 10KB objects (30B key)
basic.write100K       228.2 us     write random 100KB object (30B key) median
basic.write100K.min   209.3 us     write random 100KB object (30B key) minimum
basic.write100K.9     311.3 us     write random 100KB object (30B key) 90%
basic.write100K.99    425.7 us     write random 100KB object (30B key) 99%
basic.write100K.999   489.2 us     write random 100KB object (30B key) 99.9%
basic.writeBw100K     383.1 MB/s   bandwidth writing 100KB objects (30B key)
basic.write1M         2.2 ms       write random 1MB object (30B key) median
basic.write1M.min     2.1 ms       write random 1MB object (30B key) minimum
basic.write1M.9       2.3 ms       write random 1MB object (30B key) 90%
basic.write1M.99      2.4 ms       write random 1MB object (30B key) 99%
basic.write1M.999     2.7 ms       write random 1MB object (30B key) 99.9%
basic.writeBw1M       431.2 MB/s   bandwidth writing 1MB objects (30B key)

# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiRead_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1             5.5                5.46
         2              1              2             6.7                3.34
         3              1              3             7.8                2.59
         4              1              4             8.5                2.11
         5              1              5             9.4                1.89
         6              1              6            10.6                1.77
         7              1              7            11.1                1.59
         8              1              8            11.7                1.47
         9              1              9            12.6                1.40
        10              1             10            13.7                1.37
        20              1             20            20.2                1.01
        30              1             30            27.1                0.90
        40              1             40            33.7                0.84
        50              1             50            40.3                0.81
        60              1             60            42.9                0.71
        70              1             70            49.1                0.70
        80              1             80            55.3                0.69
        90              1             90            54.9                0.61
       100              1            100            56.1                0.56
       200              1            200            75.1                0.38
       300              1            300            94.7                0.32
       400              1            400           112.2                0.28
       500              1            500           129.5                0.26
       600              1            600           149.5                0.25
       700              1            700           168.4                0.24
       800              1            800           188.7                0.24
       900              1            900           217.4                0.24
      1000              1           1000           218.6                0.22
      2000              1           2000           431.2                0.22
      3000              1           3000           648.7                0.22
      4000              1           4000           869.4                0.22
      5000              1           5000          1095.2                0.22

# RAMCloud multiRead performance for 100 B objects with 30 byte keys
# with one object located on each master.
# Generated by 'clusterperf.py multiRead_oneObjectPerMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1             5.3                5.30
         2              2              1             6.8                3.40
         3              3              1             7.6                2.54
         4              4              1             9.0                2.24
         5              5              1            10.2                2.04
         6              6              1            11.7                1.96
         7              7              1            13.1                1.86
         8              8              1            14.9                1.86
         9              9              1            15.1                1.68
        10             10              1            17.0                1.70
        11             11              1            20.7                1.88
        12             12              1            24.3                2.02
        13             13              1            24.2                1.86
        14             14              1            26.0                1.86
        15             15              1            27.6                1.84
        16             16              1            31.8                1.99
        17             17              1            31.1                1.83
        18             18              1            34.5                1.92
        19             19              1            34.9                1.84
        20             20              1            37.4                1.87

# RAMCloud multi-read throughput of a single server with a
# varying number of clients issuing 70-object multi-reads on
# randomly-chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py multiReadThroughput'
#
# numClients    throughput    worker utiliz.
#              (kreads/sec)
#-------------------------------------------
           1          1475             0.424
           2          2516             0.980
           3          3386             1.531
           4          4270             1.959
           5          4968             2.597
           6          5518             2.834
           7          6189             2.947
           8          6203             2.944
           9          5064             2.802
          10          4914             2.710
          11          5536             2.826
          12          5616             2.825
          13          6040             2.914
          14          5095             2.770
          15          4885             2.708

# RAMCloud multiWrite performance for 100 B objects with 30 byte keys
# located on a single master.
# Generated by 'clusterperf.py multiWrite_oneMaster'
#
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1            16.7               16.69
         2              1              2            19.6                9.80
         3              1              3            22.0                7.32
         4              1              4            23.2                5.81
         5              1              5            24.3                4.86
         6              1              6            26.6                4.43
         7              1              7            28.2                4.03
         8              1              8            30.2                3.78
         9              1              9            32.0                3.56
        10              1             10            32.3                3.23
        20              1             20            51.2                2.56
        30              1             30            62.5                2.08
        40              1             40            79.2                1.98
        50              1             50            94.2                1.88
        60              1             60           219.1                3.65
        70              1             70           122.6                1.75
        80              1             80           136.9                1.71
        90              1             90           143.8                1.60
       100              1            100           141.8                1.42
       200              1            200          3983.5               19.92
       300              1            300          1072.0                3.57
       400              1            400           375.1                0.94
       500              1            500          5551.4               11.10
       600              1            600           558.6                0.93
       700              1            700          4498.6                6.43
       800              1            800           802.7                1.00
       900              1            900          6116.5                6.80
      1000              1           1000           898.4                0.90
      2000              1           2000         28052.8               14.03
      3000              1           3000          6901.3                2.30
      4000              1           4000          9208.7                2.30
      5000              1           5000         12261.8                2.45

# Cumulative distribution of time for a single client to read a
# random 100-byte object from a single server. Each line indicates
# that a given fraction of all reads took at most a given time
# to complete.
# Generated by 'clusterperf.py readDist'
#
# Time (usec)  Cum. Fraction
#---------------------------
      0.00     0.000
      4.41     0.000
      4.52     0.010
      4.54     0.020
      4.55     0.030
      4.56     0.040
      4.57     0.050
      4.58     0.060
      4.59     0.070
      4.59     0.080
      4.60     0.090
      4.60     0.100
      4.61     0.110
      4.61     0.120
      4.62     0.130
      4.62     0.140
      4.62     0.150
      4.63     0.160
      4.63     0.170
      4.63     0.180
      4.64     0.190
      4.64     0.200
      4.64     0.210
      4.65     0.220
      4.65     0.230
      4.65     0.240
      4.66     0.250
      4.66     0.260
      4.66     0.270
      4.67     0.280
      4.67     0.290
      4.67     0.300
      4.68     0.310
      4.68     0.320
      4.68     0.330
      4.69     0.340
      4.69     0.350
      4.69     0.360
      4.69     0.370
      4.70     0.380
      4.70     0.390
      4.70     0.400
      4.70     0.410
      4.71     0.420
      4.71     0.430
      4.71     0.440
      4.71     0.450
      4.72     0.460
      4.72     0.470
      4.72     0.480
      4.73     0.490
      4.73     0.500
      4.73     0.510
      4.73     0.520
      4.74     0.530
      4.74     0.540
      4.74     0.550
      4.75     0.560
      4.75     0.570
      4.75     0.580
      4.76     0.590
      4.76     0.600
      4.76     0.610
      4.77     0.620
      4.77     0.630
      4.77     0.640
      4.78     0.650
      4.78     0.660
      4.79     0.670
      4.79     0.680
      4.80     0.690
      4.80     0.700
      4.81     0.710
      4.81     0.720
      4.82     0.730
      4.82     0.740
      4.83     0.750
      4.84     0.760
      4.85     0.770
      4.86     0.780
      4.88     0.790
      4.90     0.800
      4.92     0.810
      4.97     0.820
      5.12     0.830
      5.24     0.840
      5.28     0.850
      5.32     0.860
      5.35     0.870
      5.37     0.880
      5.40     0.890
      5.43     0.900
      5.47     0.910
      5.54     0.920
      5.70     0.930
      5.79     0.940
      5.84     0.950
      5.89     0.960
      5.94     0.970
      6.00     0.980
      6.22     0.990
     10.28     0.999
     97.88     0.9999
    112.01     1.000

# RAMCloud read throughput of a single server with a varying
# number of clients issuing individual reads on randomly
# chosen 100-byte objects with 30-byte keys
# Generated by 'clusterperf.py readThroughput'
#
# numClients    throughput    worker utiliz.
#              (kreads/sec)
#-------------------------------------------
           1           202             0.145
           2           386             0.289
           3           556             0.455
           4           698             0.597
           5           705             0.589
           6           774             0.646
           7           836             0.716
           8           866             0.735
           9           926             0.797
          10           895             0.807
          11           943             0.804
          12           955             0.815
          13           914             0.786
          14           915             0.776
          15           941             0.798

# RAMCloud read performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py readVaryingKeyLength'
#
# Key Length    Latency (us)    Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1             4.4                21.9
           5             4.4                22.8
          10             4.4                23.7
          15             4.5                24.6
          20             4.4                25.8
          25             4.5                26.7
          30             4.4                28.1
          35             4.5                28.4
          40             4.5                29.4
          45             4.5                30.4
          50             4.6                31.4
          55             4.6                32.3
          60             4.6                33.4
          65             4.6                34.3
          70             4.6                35.4
          75             4.6                36.1
          80             4.6                37.1
          85             4.6                38.0
          90             4.6                39.1
          95             4.7                39.8
         100             4.7                40.8
         200             5.4                52.8
         300             5.6                68.1
         400             6.0                79.2
         500             6.2                92.5
         600             6.3               106.1
         700             6.4               118.4
         800             6.6               130.6
         900             6.7               142.5
        1000             6.8               153.3
        2000             7.5               266.4
        3000             8.2               362.0
        4000             8.8               444.3
        5000             9.6               507.0
        6000            10.3               567.5
        7000            10.9               619.4
        8000            11.6               667.5
        9000            12.4               701.5
       10000            13.1               732.7
       20000            21.4               897.4
       30000            29.4               977.8
       40000            39.7               964.5
       50000            47.9               996.7
       60000            55.9              1024.7

# RAMCloud write performance for 100 B objects
# with keys of various lengths.
# Generated by 'clusterperf.py writeVaryingKeyLength'
#
# Key Length    Latency (us)    Bandwidth (MB/s)
#----------------------------------------------------------------------------
           1            14.2                 6.8
           5            14.1                 7.1
          10            14.2                 7.4
          15            14.3                 7.7
          20            14.3                 8.0
          25            14.3                 8.3
          30            14.4                 8.6
          35            14.5                 8.9
          40            14.5                 9.2
          45            14.5                 9.5
          50            14.5                 9.9
          55            14.9                 9.9
          60            14.7                10.4
          65            14.6                10.8
          70            14.7                11.0
          75            14.7                11.3
          80            14.7                11.7
          85            15.2                11.6
          90            15.5                11.7
          95            15.6                11.9
         100            15.6                12.3
         200            16.1                17.8
         300            16.9                22.5
         400            17.5                27.2
         500            18.0                31.8
         600            18.4                36.4
         700            18.9                40.4
         800            19.9                43.2
         900            21.1                45.2
        1000            21.5                48.8
        2000            25.0                80.2
        3000            28.7               102.9
        4000            32.5               120.3
        5000            36.1               134.6
        6000            39.9               145.7
        7000            43.7               154.8
        8000            47.5               162.7
        9000            52.1               166.5
       10000            56.3               171.1
       20000           100.6               190.5
       30000           144.2               199.1
       40000           192.0               199.2
       50000           236.9               201.7
       60000           280.4               204.4
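
In the tabular clusterperf output, header and comment text is prefixed with '#' and each data row is a run of whitespace-separated numbers. For post-processing, a minimal Python sketch along these lines may help. It assumes one row per line; parse_table and per_object_latency are hypothetical helper names, not part of clusterperf.py, and the sample rows are copied from the multiRead_oneMaster results.

```python
def parse_table(text):
    """Return data rows as lists of floats, skipping blank and '#' lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comment, header, or separator line
        rows.append([float(field) for field in line.split()])
    return rows


def per_object_latency(rows):
    """Recompute Latency/Obj as Latency (column 3) divided by Num Objs (column 0)."""
    return [(int(row[0]), row[3] / row[0]) for row in rows]


# Sample rows taken from the multiRead_oneMaster table.
SAMPLE = """\
# Num Objs    Num Masters    Objs/Master    Latency (us)    Latency/Obj (us)
#----------------------------------------------------------------------------
         1              1              1             5.5                5.46
        10              1             10            13.7                1.37
      1000              1           1000           218.6                0.22
"""

if __name__ == "__main__":
    for num_objs, latency in per_object_latency(parse_table(SAMPLE)):
        print(f"{num_objs:>6} objects: {latency:.2f} us/object")
```

Values recomputed this way can differ slightly from the printed Latency/Obj column, because the Latency column is already rounded to one decimal place.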