Metrics Files
Here are some representative metrics files to consult and compare against as the system evolves.
SOSP '11 Paper Experiments
Note that all data files (full logs, not just metrics) are in the paper repo.
Figure 8 (Proportional Scaling, 1 to 20 masters, 6 backup processes per master)
Note that the paper results were averages of 5 runs.
1 Master, 6 backups:
=== Summary ===
Recovery time: 0.985 s
Masters: 1
Backups: 6
Replicas: 3
Objects per master: 592415
Object size: 1024 bytes
Total live objects: 592415
Total recovery segment entries: 592415
Total live object space: 578.53 MB
Total recovery segment space w/ overhead: 600.00 MB
Storage type: disk
Log directory: /home/rumble/ramcloud/recovery/20110826170432
=== Coordinator Time ===
Total: 982.4 ms / 99.75% of total recovery
Starting recovery on backups: 0.7 ms / 0.07% of total recovery
Starting recovery on masters: 1.8 ms / 0.18% of total recovery
Tablets recovered: 0.3 ms / 0.03% of total recovery
Completing recovery on backups: 0.1 ms / 0.01% of total recovery
Set will: 0.0 ms / 0.00% of total recovery
Get tablet map: 19.9 ms / 2.02% of total recovery
Other: 959.8 ms / 97.45% of total recovery
Receiving in transport: 13.2 ms / 1.34% of total recovery
=== Master Time ===
Total: 0.0 ms / 0.00% of total recovery
Waiting for incoming segments: 457.4 ms / 46.44% of total recovery
Inside recoverSegment: 504.7 ms / 51.24% of total recovery
Backup.proceed: 8.5 ms / 0.86% of total recovery
Verify checksum: 0.0 ms / 0.00% of total recovery
Segment append: 322.2 ms / 32.72% of total recovery
Segment append copy: 51.4 ms / 5.21% of total recovery
Segment append checksum: 147.3 ms / 14.96% of total recovery
Other (HT, etc.): 174.0 ms / 17.66% of total recovery (other)
Final log sync: 9.7 ms / 0.98% of total recovery
Removing tombstones: 0.0 ms / 0.00% of total recovery
Other: -971.8 ms / -98.67% of total recovery
Receiving in transport: 3.7 ms / 0.38% of total recovery
Transmitting in transport: 29.2 ms / 2.97% of total recovery
Replicating one segment: 1.2 ms
During replay: 1.2 ms
During log sync: 0.8 ms
RPC latency replicating one segment: 2.1 ms (for R-th replica)
During replay: 2.1 ms (for R-th replica)
During log sync: 1.2 ms (for R-th replica)
Replication: 847.2 ms / 86.01% of total recovery
Client RPCs Active: 873.7 ms / 88.71% of total recovery
Average GRD completion time: 42.0 ms
Task iterations: 2789253
=== Backup Time ===
RPC service time: 87.7 ms avg / max 111.3 ms / 8.90% avg of total recovery
startReadingData: 0.1 ms avg / max 0.1 ms / 0.01% avg of total recovery
Open/write segment: 82.5 ms avg / max 105.9 ms / 8.38% avg of total recovery
Open segment memset: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Copy: 82.1 ms avg / max 105.3 ms / 8.34% avg of total recovery
Other: 0.4 ms avg / max 0.5 ms / 0.04% avg of total recovery
getRecoveryData: 2.3 ms avg / max 4.4 ms / 0.23% avg of total recovery
Other: 2.8 ms avg / max 3.2 ms / 0.29% avg of total recovery
Transmitting in transport: 24.1 ms avg / max 29.2 ms / 2.45% avg of total recovery
Filtering segments: 54.0 ms avg / max 60.7 ms / 5.48% avg of total recovery
Reading segments: 866.8 ms avg / max 957.4 ms / 88.01% avg of total recovery
Using disk: 864.0 ms avg / max 954.7 ms / 87.72% avg of total recovery
getRecoveryData completions: 13 avg / max 14
getRecoveryData retry fraction: 0.798 avg / max 0.959
=== Efficiency ===
recoverSegment CPU: 6.64 ms avg (per filtered segment)
Writing a segment: 0.49 ms avg (backup RPC thread)
Filtering a segment: 4.32 ms avg
=== Network Utilization ===
Aggregate: 34.15 Gb/s / 17.79% of network capacity (overall)
Master in: 7.42 Gb/s (overall)
Master out: 15.10 Gb/s (overall)
Master out during replication: 17.56 Gb/s (overall)
Master out during log sync: 19.49 Gb/s (overall)
Backup in: 3.17 Gb/s avg / min 2.03 Gb/s / 19.04 Gb/s total (overall)
Backup out: 3.17 Gb/s avg / min 0.70 Gb/s / 19.04 Gb/s total (overall)
=== Disk Utilization ===
Effective bandwidth: 116.42 MB/s avg / min 105.59 MB/s / 698.54 MB/s total
Active bandwidth: 122.46 MB/s avg / min 113.82 MB/s / 734.74 MB/s total
Reading: 115.51 MB/s avg / min 107.07 MB/s / 693.07 MB/s total
Writing: 169.53 MB/s avg / min 141.75 MB/s / 678.11 MB/s total
Disk active: 95.13% avg / max 98.87% (of total recovery)
Reading: 87.72% avg / max 96.93% (of total recovery)
Writing: 7.41% avg / max 17.19% (of total recovery)
=== Backup Events ===
Segments read: 12.5 avg / max 14.0
Primary segments loaded: 12.7 avg / max 14.0
Secondary segments loaded: 0.0 avg / max 0.0
=== Slowest Servers ===
Backup opens, writes: server.rc01 (284.5 ms)
Stalled reading segs from backups: server.rc01 (457.4 ms)
Reading from disk: backup.rc03 (107.1 MB/s)
Writing to disk: backup.rc01 (141.7 MB/s)
20 Masters, 120 Backups:
=== Summary ===
Recovery time: 1.118 s
Masters: 20
Backups: 120
Replicas: 3
Objects per master: 592415.0 avg / max 592415.0
Object size: 1024 bytes avg / max 1024 bytes
Total live objects: 11848300
Total recovery segment entries: 11848300
Total live object space: 11570.61 MB
Total recovery segment space w/ overhead: 11999.98 MB
Storage type: disk
Log directory: /home/rumble/ramcloud/recovery/20110826174510
=== Coordinator Time ===
Total: 1115.9 ms / 99.80% of total recovery
Starting recovery on backups: 11.7 ms / 1.05% of total recovery
Starting recovery on masters: 7.0 ms / 0.63% of total recovery
Tablets recovered: 2.3 ms / 0.21% of total recovery
Completing recovery on backups: 0.8 ms / 0.07% of total recovery
Set will: 0.0 ms / 0.00% of total recovery
Get tablet map: 145.3 ms / 13.00% of total recovery
Other: 949.6 ms / 84.92% of total recovery
Receiving in transport: 6.4 ms / 0.57% of total recovery
=== Master Time ===
Total: 1041.1 ms avg / max 1101.7 ms / 93.11% avg of total recovery
Waiting for incoming segments: 531.6 ms avg / max 552.4 ms / 47.54% avg of total recovery
Inside recoverSegment: 522.1 ms avg / max 540.8 ms / 46.69% avg of total recovery
Backup.proceed: 7.9 ms avg / max 8.9 ms / 0.70% avg of total recovery
Verify checksum: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Segment append: 333.8 ms avg / max 349.2 ms / 29.85% avg of total recovery
Segment append copy: 51.3 ms avg / max 56.9 ms / 4.59% avg of total recovery
Segment append checksum: 153.6 ms avg / max 166.0 ms / 13.74% avg of total recovery
Other (HT, etc.): 180.4 ms avg / max 185.7 ms / 16.13% avg of total recovery (other)
Final log sync: 12.6 ms avg / max 16.7 ms / 1.12% avg of total recovery
Removing tombstones: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Other: -25.2 ms avg / max 39.4 ms / -2.25% avg of total recovery
Receiving in transport: 6.5 ms avg / max 10.9 ms / 0.58% avg of total recovery
Transmitting in transport: 38.2 ms avg / max 42.2 ms / 3.41% avg of total recovery
Replicating one segment: 1.3 ms avg / max 1.3 ms
During replay: 1.3 ms avg / max 1.3 ms
During log sync: 1.1 ms avg / max 1.4 ms
RPC latency replicating one segment: 2.7 ms avg / max 2.8 ms (for R-th replica)
During replay: 2.7 ms avg / max 2.8 ms (for R-th replica)
During log sync: 1.5 ms avg / max 2.1 ms (for R-th replica)
Replication: 942.2 ms avg / max 955.5 ms / 84.26% avg of total recovery
Client RPCs Active: 985.7 ms avg / max 998.0 ms / 88.15% avg of total recovery
Average GRD completion time: 2.4 ms avg / max 2.4 ms
Task iterations: 3193584 avg / max 3519448
=== Backup Time ===
RPC service time: 80.0 ms avg / max 263.2 ms / 7.15% avg of total recovery
startReadingData: 0.1 ms avg / max 0.2 ms / 0.01% avg of total recovery
Open/write segment: 75.1 ms avg / max 131.8 ms / 6.72% avg of total recovery
Open segment memset: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Copy: 74.7 ms avg / max 131.1 ms / 6.68% avg of total recovery
Other: 0.4 ms avg / max 0.7 ms / 0.04% avg of total recovery
getRecoveryData: 2.2 ms avg / max 158.2 ms / 0.20% avg of total recovery
Other: 2.6 ms avg / max 43.2 ms / 0.23% avg of total recovery
Transmitting in transport: 27.7 ms avg / max 42.2 ms / 2.48% avg of total recovery
Filtering segments: 68.5 ms avg / max 91.0 ms / 6.13% avg of total recovery
Reading segments: 845.2 ms avg / max 1087.4 ms / 75.59% avg of total recovery
Using disk: 842.1 ms avg / max 1083.0 ms / 75.31% avg of total recovery
getRecoveryData completions: 250 avg / max 300
getRecoveryData retry fraction: 0.063 avg / max 0.992
=== Efficiency ===
recoverSegment CPU: 0.35 ms avg (per filtered segment)
Writing a segment: 0.44 ms avg (backup RPC thread)
Filtering a segment: 5.48 ms avg
=== Network Utilization ===
Aggregate: 601.92 Gb/s / 15.67% of network capacity (overall)
Master in: 6.47 Gb/s avg / min 5.20 Gb/s / 129.34 Gb/s total (overall)
Master out: 13.30 Gb/s avg / min 13.08 Gb/s / 265.98 Gb/s total (overall)
Master out during replication: 15.78 Gb/s avg / min 15.57 Gb/s / 315.70 Gb/s total (overall)
Master out during log sync: 15.83 Gb/s avg / min 11.34 Gb/s / 316.68 Gb/s total (overall)
Backup in: 2.80 Gb/s avg / min 1.29 Gb/s / 335.45 Gb/s total (overall)
Backup out: 2.80 Gb/s avg / min 0.56 Gb/s / 335.42 Gb/s total (overall)
=== Disk Utilization ===
Effective bandwidth: 116.92 MB/s avg / min 78.70 MB/s / 14030.17 MB/s total
Active bandwidth: 122.00 MB/s avg / min 81.26 MB/s / 14640.46 MB/s total
Reading: 118.93 MB/s avg / min 73.87 MB/s / 14271.83 MB/s total
Writing: 135.00 MB/s avg / min 83.26 MB/s / 16064.48 MB/s total
Disk active: 95.84% avg / max 99.50% (of total recovery)
Reading: 75.31% avg / max 96.85% (of total recovery)
Writing: 20.53% avg / max 30.30% (of total recovery)
=== Backup Events ===
Segments read: 12.5 avg / max 15.0
Primary segments loaded: 250.3 avg / max 300.0
Secondary segments loaded: 0.0 avg / max 0.0
=== Slowest Servers ===
Backup opens, writes: server.rc09 (351.9 ms)
Stalled reading segs from backups: server.rc09 (552.4 ms)
Reading from disk: backup.rc22 (73.9 MB/s)
Writing to disk: backup.rc14 (83.3 MB/s)
Figure 9 (Proportional Scaling, 2 to 60 masters, 2 backup processes per master)
After rerunning a few of the data points from this graph it looks like we (unfortunately) scaled the cluster by adding processes on two machines at a time before starting to double up master/backup processes with backup processes on the same machine. Doubling up processes right away doesn't yield the same results as the paper even with the code from SOSP'11.
2 Masters, 4 Backups:
=== Summary ===
Recovery time: 1.290 s
Masters: 2
Backups: 4
Replicas: 3
Objects per master: 592415.0 avg / max 592415.0
Object size: 1024 bytes avg / max 1024 bytes
Total live objects: 1184830
Total recovery segment entries: 1184830
Total live object space: 1157.06 MB
Total recovery segment space w/ overhead: 1200.00 MB
Storage type: disk
Log directory: /home/rumble/ramcloud/recovery/20110830182651
=== Coordinator Time ===
Total: 1287.5 ms / 99.78% of total recovery
Starting recovery on backups: 3.5 ms / 0.27% of total recovery
Starting recovery on masters: 0.5 ms / 0.04% of total recovery
Tablets recovered: 0.3 ms / 0.03% of total recovery
Completing recovery on backups: 0.1 ms / 0.01% of total recovery
Set will: 0.0 ms / 0.00% of total recovery
Get tablet map: 26.1 ms / 2.02% of total recovery
Other: 1257.0 ms / 97.42% of total recovery
Receiving in transport: 14.7 ms / 1.14% of total recovery
=== Master Time ===
Total: 639.0 ms avg / max 1278.0 ms / 49.52% avg of total recovery
Waiting for incoming segments: 558.8 ms avg / max 572.9 ms / 43.31% avg of total recovery
Inside recoverSegment: 633.2 ms avg / max 647.1 ms / 49.07% avg of total recovery
Backup.proceed: 13.2 ms avg / max 13.6 ms / 1.02% avg of total recovery
Verify checksum: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Segment append: 412.3 ms avg / max 419.6 ms / 31.95% avg of total recovery
Segment append copy: 56.9 ms avg / max 59.4 ms / 4.41% avg of total recovery
Segment append checksum: 189.3 ms avg / max 194.3 ms / 14.67% avg of total recovery
Other (HT, etc.): 207.7 ms avg / max 213.8 ms / 16.10% avg of total recovery (other)
Final log sync: 77.3 ms avg / max 80.8 ms / 5.99% avg of total recovery
Removing tombstones: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Other: -630.3 ms avg / max 12.1 ms / -48.85% avg of total recovery
Receiving in transport: 8.3 ms avg / max 8.5 ms / 0.64% avg of total recovery
Transmitting in transport: 123.8 ms avg / max 126.3 ms / 9.59% avg of total recovery
Replicating one segment: 1.6 ms avg / max 1.6 ms
During replay: 1.6 ms avg / max 1.6 ms
During log sync: 1.2 ms avg / max 1.2 ms
RPC latency replicating one segment: 3.4 ms avg / max 3.4 ms (for R-th replica)
During replay: 3.5 ms avg / max 3.5 ms (for R-th replica)
During log sync: 2.4 ms avg / max 2.5 ms (for R-th replica)
Replication: 1227.0 ms avg / max 1230.7 ms / 95.09% avg of total recovery
Client RPCs Active: 1249.5 ms avg / max 1251.7 ms / 96.84% avg of total recovery
Average GRD completion time: 26.7 ms avg / max 26.8 ms
Task iterations: 1950173 avg / max 1971082
=== Backup Time ===
RPC service time: 407.2 ms avg / max 440.4 ms / 31.56% avg of total recovery
startReadingData: 0.1 ms avg / max 0.1 ms / 0.01% avg of total recovery
Open/write segment: 397.9 ms avg / max 433.3 ms / 30.84% avg of total recovery
Open segment memset: 0.1 ms avg / max 0.1 ms / 0.00% avg of total recovery
Copy: 395.8 ms avg / max 431.2 ms / 30.68% avg of total recovery
Other: 2.0 ms avg / max 2.1 ms / 0.16% avg of total recovery
getRecoveryData: 6.0 ms avg / max 7.8 ms / 0.46% avg of total recovery
Other: 3.3 ms avg / max 3.5 ms / 0.26% avg of total recovery
Transmitting in transport: 119.2 ms avg / max 126.3 ms / 9.24% avg of total recovery
Filtering segments: 345.2 ms avg / max 370.2 ms / 26.75% avg of total recovery
Reading segments: 1131.8 ms avg / max 1182.3 ms / 87.72% avg of total recovery
Using disk: 1116.9 ms avg / max 1132.1 ms / 86.56% avg of total recovery
getRecoveryData completions: 76 avg / max 78
getRecoveryData retry fraction: 0.774 avg / max 0.823
=== Efficiency ===
recoverSegment CPU: 4.19 ms avg (per filtered segment)
Writing a segment: 0.78 ms avg (backup RPC thread)
Filtering a segment: 9.21 ms avg
=== Network Utilization ===
Aggregate: 54.51 Gb/s / 42.58% of network capacity (overall)
Master in: 9.03 Gb/s avg / min 8.82 Gb/s / 18.07 Gb/s total (overall)
Master out: 12.72 Gb/s avg / min 12.69 Gb/s / 25.43 Gb/s total (overall)
Master out during replication: 13.32 Gb/s avg / min 13.31 Gb/s / 26.64 Gb/s total (overall)
Master out during log sync: 16.86 Gb/s avg / min 16.54 Gb/s / 33.72 Gb/s total (overall)
Backup in: 7.27 Gb/s avg / min 5.38 Gb/s / 29.07 Gb/s total (overall)
Backup out: 7.27 Gb/s avg / min 1.79 Gb/s / 29.07 Gb/s total (overall)
Master Infiniband TX Active: 1.2 s avg / min 1.2 s / 2.4 s total
=== Disk Utilization ===
Effective bandwidth: 257.30 MB/s avg / min 254.20 MB/s / 1029.20 MB/s total
Active bandwidth: 261.67 MB/s avg / min 259.48 MB/s / 1046.66 MB/s total
Reading: 268.59 MB/s avg / min 268.36 MB/s / 1074.38 MB/s total
Writing: 210.99 MB/s avg / min 198.68 MB/s / 843.94 MB/s total
Disk active: 98.33% avg / max 99.47% (of total recovery)
Reading: 86.56% avg / max 87.73% (of total recovery)
Writing: 11.77% avg / max 12.48% (of total recovery)
=== Backup Events ===
Segments read: 37.5 avg / max 38.0
Primary segments loaded: 75.5 avg / max 78.0
Secondary segments loaded: 0.0 avg / max 0.0
=== Slowest Servers ===
Backup opens, writes: server.rc02 (362.5 ms)
Stalled reading segs from backups: server.rc02 (572.9 ms)
Reading from disk: backup.rc02 (268.4 MB/s)
Writing to disk: backup.rc02 (198.7 MB/s)
10 Masters, 20 Backups:
=== Summary ===
Recovery time: 1.410 s
Masters: 10
Backups: 20
Total nodes: 20
Replicas: 3
Objects per master: 592415.0 avg / max 592415.0
Object size: 1024 bytes avg / max 1024 bytes
Total live objects: 5924150
Total recovery segment entries: 5924150
Total live object space: 5785.30 MB
Total recovery segment space w/ overhead: 5999.99 MB
Storage type: disk
Log directory: /home/stutsman/src/ramcloud-sosp/recovery/20121018083553
=== Coordinator Time ===
Total: 1407.7 ms / 99.85% of total recovery
Starting recovery on backups: 4.5 ms / 0.32% of total recovery
Starting recovery on masters: 2.2 ms / 0.16% of total recovery
Tablets recovered: 1.0 ms / 0.07% of total recovery
Completing recovery on backups: 0.2 ms / 0.01% of total recovery
Set will: 0.0 ms / 0.00% of total recovery
Get tablet map: 75.4 ms / 5.35% of total recovery
Other: 1324.7 ms / 93.96% of total recovery
Receiving in transport: 8.9 ms / 0.63% of total recovery
=== Master Time ===
Total: 1398.7 ms avg / max 1404.1 ms / 99.21% avg of total recovery
Waiting for incoming segments: 562.7 ms avg / max 585.6 ms / 39.91% avg of total recovery
Inside recoverSegment: 784.0 ms avg / max 827.1 ms / 55.61% avg of total recovery
Backup.proceed: 21.1 ms avg / max 28.3 ms / 1.50% avg of total recovery
Verify checksum: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Segment append: 464.6 ms avg / max 487.4 ms / 32.96% avg of total recovery
Segment append copy: 49.6 ms avg / max 55.2 ms / 3.52% avg of total recovery
Segment append checksum: 224.4 ms avg / max 235.0 ms / 15.92% avg of total recovery
Other (HT, etc.): 298.3 ms avg / max 317.9 ms / 21.16% avg of total recovery (other)
Final log sync: 13.0 ms avg / max 15.4 ms / 0.92% avg of total recovery
Removing tombstones: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Other: 38.9 ms avg / max 53.2 ms / 2.76% avg of total recovery
Receiving in transport: 14.0 ms avg / max 24.8 ms / 0.99% avg of total recovery
Transmitting in transport: 152.7 ms avg / max 164.8 ms / 10.83% avg of total recovery
Replicating one segment: 1.6 ms avg / max 1.7 ms
During replay: 1.6 ms avg / max 1.7 ms
During log sync: 1.1 ms avg / max 1.3 ms
RPC latency replicating one segment: 2.7 ms avg / max 2.9 ms (for R-th replica)
During replay: 2.7 ms avg / max 2.9 ms (for R-th replica)
During log sync: 1.5 ms avg / max 1.9 ms (for R-th replica)
Replication: 1291.3 ms avg / max 1313.0 ms / 91.59% avg of total recovery
Client RPCs Active: 1363.4 ms avg / max 1370.1 ms / 96.71% avg of total recovery
Average GRD completion time: 6.3 ms avg / max 6.3 ms
Task iterations: 2984825 avg / max 3355883
=== Backup Time ===
RPC service time: 379.0 ms avg / max 545.4 ms / 26.88% avg of total recovery
startReadingData: 0.2 ms avg / max 0.2 ms / 0.01% avg of total recovery
Open/write segment: 369.5 ms avg / max 535.5 ms / 26.21% avg of total recovery
Open segment memset: 0.1 ms avg / max 0.1 ms / 0.00% avg of total recovery
Copy: 367.8 ms avg / max 532.5 ms / 26.09% avg of total recovery
Other: 1.7 ms avg / max 2.8 ms / 0.12% avg of total recovery
getRecoveryData: 6.1 ms avg / max 27.4 ms / 0.43% avg of total recovery
Other: 3.2 ms avg / max 7.9 ms / 0.23% avg of total recovery
Transmitting in transport: 130.5 ms avg / max 164.8 ms / 9.25% avg of total recovery
Filtering segments: 248.6 ms avg / max 316.2 ms / 17.64% avg of total recovery
Reading segments: 1310.0 ms avg / max 1372.7 ms / 92.92% avg of total recovery
Using disk: 1304.0 ms avg / max 1370.3 ms / 92.49% avg of total recovery
getRecoveryData completions: 376 avg / max 390
getRecoveryData retry fraction: 0.365 avg / max 0.898
=== Efficiency ===
recoverSegment CPU: 1.04 ms avg (per filtered segment)
Writing a segment: 0.73 ms avg (backup RPC thread)
Filtering a segment: 6.63 ms avg
Memory bandwidth (backup copies): 2.51 GB/s avg / min 1.85 GB/s
=== Network Utilization ===
Aggregate: 249.47 Gb/s / 49.89% of network capacity (overall)
Master in: 8.50 Gb/s avg / min 7.98 Gb/s / 84.95 Gb/s total (overall)
Master out: 11.63 Gb/s avg / min 11.62 Gb/s / 116.29 Gb/s total (overall)
Master out during replication: 12.70 Gb/s avg / min 12.47 Gb/s / 127.00 Gb/s total (overall)
Master out during log sync: 15.15 Gb/s avg / min 12.29 Gb/s / 151.52 Gb/s total (overall)
Backup in: 6.65 Gb/s avg / min 4.39 Gb/s / 133.01 Gb/s total (overall)
Backup out: 6.65 Gb/s avg / min 1.64 Gb/s / 133.01 Gb/s total (overall)
=== Disk Utilization ===
Effective bandwidth: 226.41 MB/s avg / min 221.30 MB/s / 4528.14 MB/s total
Active bandwidth: 230.77 MB/s avg / min 222.62 MB/s / 4615.49 MB/s total
Reading: 230.25 MB/s avg / min 219.16 MB/s / 4604.97 MB/s total
Writing: 257.32 MB/s avg / min 181.80 MB/s / 4889.07 MB/s total
Disk active: 98.12% avg / max 99.54% (of total recovery)
Reading: 92.49% avg / max 97.20% (of total recovery)
Writing: 5.63% avg / max 10.77% (of total recovery)
=== Backup Events ===
Segments read: 37.5 avg / max 39.0
Primary segments loaded: 375.5 avg / max 390.0
Secondary segments loaded: 0.0 avg / max 0.0
=== Slowest Servers ===
Backup opens, writes: server.rc03 (366.2 ms)
Stalled reading segs from backups: server.rc03 (585.6 ms)
Reading from disk: server.rc02 (219.2 MB/s)
Writing to disk: server.rc11 (181.8 MB/s)
60 Masters, 120 Backups:
=== Summary ===
Recovery time: 1.624 s
Masters: 60
Backups: 120
Replicas: 3
Objects per master: 592415.0 avg / max 592415.0
Object size: 1024 bytes avg / max 1024 bytes
Total live objects: 35544900
Total recovery segment entries: 35544900
Total live object space: 34711.82 MB
Total recovery segment space w/ overhead: 35999.95 MB
Storage type: disk
Log directory: /home/rumble/ramcloud/recovery/20110830204624
=== Coordinator Time ===
Total: 1619.4 ms / 99.74% of total recovery
Starting recovery on backups: 35.4 ms / 2.18% of total recovery
Starting recovery on masters: 57.0 ms / 3.51% of total recovery
Tablets recovered: 9.8 ms / 0.60% of total recovery
Completing recovery on backups: 2.2 ms / 0.14% of total recovery
Set will: 0.0 ms / 0.00% of total recovery
Get tablet map: 442.9 ms / 27.28% of total recovery
Other: 1074.4 ms / 66.17% of total recovery
Receiving in transport: 4.5 ms / 0.28% of total recovery
=== Master Time ===
Total: 1494.6 ms avg / max 1557.2 ms / 92.05% avg of total recovery
Waiting for incoming segments: 850.4 ms avg / max 893.9 ms / 52.38% avg of total recovery
Inside recoverSegment: 596.6 ms avg / max 655.7 ms / 36.75% avg of total recovery
Backup.proceed: 6.8 ms avg / max 10.4 ms / 0.42% avg of total recovery
Verify checksum: 0.0 ms avg / max 0.0 ms / 0.00% avg of total recovery
Segment append: 389.0 ms avg / max 436.7 ms / 23.96% avg of total recovery
Segment append copy: 57.6 ms avg / max 68.5 ms / 3.54% avg of total recovery
Segment append checksum: 183.0 ms avg / max 221.6 ms / 11.27% avg of total recovery
Other (HT, etc.): 200.8 ms avg / max 218.6 ms / 12.36% avg of total recovery (other)
Final log sync: 22.1 ms avg / max 37.1 ms / 1.36% avg of total recovery
Removing tombstones: 0.0 ms avg / max 0.1 ms / 0.00% avg of total recovery
Other: 25.4 ms avg / max 65.2 ms / 1.57% avg of total recovery
Receiving in transport: 16.4 ms avg / max 21.7 ms / 1.01% avg of total recovery
Transmitting in transport: 115.4 ms avg / max 138.6 ms / 7.11% avg of total recovery
Replicating one segment: 1.9 ms avg / max 2.0 ms
During replay: 1.9 ms avg / max 2.0 ms
During log sync: 1.4 ms avg / max 2.1 ms
RPC latency replicating one segment: 3.9 ms avg / max 4.4 ms (for R-th replica)
During replay: 3.9 ms avg / max 4.5 ms (for R-th replica)
During log sync: 2.5 ms avg / max 4.6 ms (for R-th replica)
Replication: 1494.0 ms avg / max 1536.1 ms / 92.02% avg of total recovery
Client RPCs Active: 1508.1 ms avg / max 1545.9 ms / 92.88% avg of total recovery
Average GRD completion time: 1.2 ms avg / max 1.2 ms
Task iterations: 2965835 avg / max 3258522
=== Backup Time ===
RPC service time: 309.5 ms avg / max 398.2 ms / 19.06% avg of total recovery
startReadingData: 0.6 ms avg / max 0.9 ms / 0.04% avg of total recovery
Open/write segment: 300.9 ms avg / max 389.7 ms / 18.54% avg of total recovery
Open segment memset: 0.1 ms avg / max 1.1 ms / 0.00% avg of total recovery
Copy: 299.1 ms avg / max 387.4 ms / 18.42% avg of total recovery
Other: 1.8 ms avg / max 8.1 ms / 0.11% avg of total recovery
getRecoveryData: 5.3 ms avg / max 10.5 ms / 0.33% avg of total recovery
Other: 2.6 ms avg / max 6.4 ms / 0.16% avg of total recovery
Transmitting in transport: 112.2 ms avg / max 138.6 ms / 6.91% avg of total recovery
Filtering segments: 325.9 ms avg / max 369.2 ms / 20.07% avg of total recovery
Reading segments: 1121.6 ms avg / max 1151.5 ms / 69.08% avg of total recovery
Using disk: 1118.3 ms avg / max 1133.8 ms / 68.88% avg of total recovery
getRecoveryData completions: 2256 avg / max 2280
getRecoveryData retry fraction: 0.001 avg / max 0.079
=== Efficiency ===
recoverSegment CPU: 0.13 ms avg (per filtered segment)
Writing a segment: 0.59 ms avg (backup RPC thread)
Filtering a segment: 8.67 ms avg
=== Network Utilization ===
Aggregate: 1300.67 Gb/s / 33.87% of network capacity (overall)
Master in: 7.24 Gb/s avg / min 6.43 Gb/s / 434.55 Gb/s total (overall)
Master out: 10.11 Gb/s avg / min 10.08 Gb/s / 606.39 Gb/s total (overall)
Master out during replication: 10.96 Gb/s avg / min 10.66 Gb/s / 657.46 Gb/s total (overall)
Master out during log sync: 14.22 Gb/s avg / min 9.31 Gb/s / 852.95 Gb/s total (overall)
Backup in: 5.78 Gb/s avg / min 3.54 Gb/s / 693.24 Gb/s total (overall)
Backup out: 5.78 Gb/s avg / min 1.42 Gb/s / 693.01 Gb/s total (overall)
Master Infiniband TX Active: 1.5 s avg / min 1.4 s / 89.0 s total
=== Disk Utilization ===
Effective bandwidth: 237.33 MB/s avg / min 187.23 MB/s / 28479.37 MB/s total
Active bandwidth: 246.69 MB/s avg / min 237.17 MB/s / 29602.49 MB/s total
Reading: 268.85 MB/s avg / min 263.45 MB/s / 32262.50 MB/s total
Writing: 191.45 MB/s avg / min 165.05 MB/s / 22400.18 MB/s total
Disk active: 96.35% avg / max 99.30% (of total recovery)
Reading: 68.88% avg / max 69.83% (of total recovery)
Writing: 27.48% avg / max 31.45% (of total recovery)
=== Backup Events ===
Segments read: 37.6 avg / max 38.0
Primary segments loaded: 2255.5 avg / max 2280.0
Secondary segments loaded: 0.0 avg / max 0.0
=== Slowest Servers ===
Backup opens, writes: server.rc48 (537.5 ms)
Stalled reading segs from backups: server.rc59 (893.9 ms)
Reading from disk: server.rc21 (263.5 MB/s)
Writing to disk: backup.rc38 (165.1 MB/s)