Metrics Files

Here are some representative metrics files to consult and compare against as the system evolves.
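
When comparing a fresh run against these files, it can help to pull a few values out of a report programmatically. The following is a minimal sketch, not the project's own tooling: `parse_report` and `leading_number` are hypothetical helpers, and the sketch assumes only the layout visible in the reports on this page (`=== Section ===` banners followed by `label: value` lines).

```python
import re

# Matches section banners such as "=== Coordinator Time ===".
SECTION_RE = re.compile(r"^=== (.+) ===$")


def parse_report(text):
    """Split a metrics report into {section: {label: raw value string}}.

    Indentation of nested labels is dropped, and a label repeated within a
    section (for example "Other") keeps only its last value, so this is only
    suitable for quick top-level comparisons.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        banner = SECTION_RE.match(line.strip())
        if banner:
            current = sections.setdefault(banner.group(1), {})
            continue
        if current is None or ":" not in line:
            continue
        # Split on the first colon only; values such as paths stay intact.
        label, _, value = line.partition(":")
        current[label.strip()] = value.strip()
    return sections


def leading_number(value):
    """Return the first number in a value string such as '0.985 s'."""
    match = re.search(r"-?\d+(?:\.\d+)?", value)
    return float(match.group()) if match else None


sample = """\
=== Summary ===
Recovery time:                             0.985 s
Masters:                                       1
Backups:                                       6
"""

report = parse_report(sample)
print(leading_number(report["Summary"]["Recovery time"]))
```

Values are kept as raw strings because many lines carry several numbers (avg, max, percentage); `leading_number` picks out only the first one.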

SOSP '11 Paper Experiments

Note that all data files (full logs, not just metrics) are in the paper repo.

Figure 8 (Proportional Scaling, 1 to 20 masters, 6 backup processes per master)

Note that the paper results were averages of 5 runs.

1 Master, 6 Backups:

=== Summary ===
Recovery time:                             0.985 s
Masters:                                       1
Backups:                                       6
Replicas:                                      3
Objects per master:                       592415
Object size:                                1024 bytes
Total live objects:                       592415
Total recovery segment entries:           592415
Total live object space:                  578.53 MB
Total recovery segment space w/ overhead: 600.00 MB
Storage type:                             disk
Log directory:                            /home/rumble/ramcloud/recovery/20110826170432

=== Coordinator Time ===
Total:                               982.4 ms / 99.75% of total recovery
  Starting recovery on backups:        0.7 ms /  0.07% of total recovery
  Starting recovery on masters:        1.8 ms /  0.18% of total recovery
  Tablets recovered:                   0.3 ms /  0.03% of total recovery
    Completing recovery on backups:    0.1 ms /  0.01% of total recovery
  Set will:                            0.0 ms /  0.00% of total recovery
  Get tablet map:                     19.9 ms /  2.02% of total recovery
  Other:                             959.8 ms / 97.45% of total recovery
Receiving in transport:               13.2 ms /  1.34% of total recovery

=== Master Time ===
Total:                                  0.0 ms /  0.00% of total recovery
Waiting for incoming segments:        457.4 ms / 46.44% of total recovery
Inside recoverSegment:                504.7 ms / 51.24% of total recovery
  Backup.proceed:                       8.5 ms /  0.86% of total recovery
  Verify checksum:                      0.0 ms /  0.00% of total recovery
  Segment append:                     322.2 ms / 32.72% of total recovery
    Segment append copy:               51.4 ms /  5.21% of total recovery
    Segment append checksum:          147.3 ms / 14.96% of total recovery
  Other (HT, etc.):                   174.0 ms / 17.66% of total recovery  (other)
Final log sync:                         9.7 ms /  0.98% of total recovery
Removing tombstones:                    0.0 ms /  0.00% of total recovery
Other:                               -971.8 ms / -98.67% of total recovery
Receiving in transport:                 3.7 ms /  0.38% of total recovery
Transmitting in transport:             29.2 ms /  2.97% of total recovery
Replicating one segment:                1.2 ms
  During replay:                        1.2 ms
  During log sync:                      0.8 ms
RPC latency replicating one segment:    2.1 ms  (for R-th replica)
  During replay:                        2.1 ms  (for R-th replica)
  During log sync:                      1.2 ms  (for R-th replica)
Replication:                          847.2 ms / 86.01% of total recovery
Client RPCs Active:                   873.7 ms / 88.71% of total recovery
Average GRD completion time:           42.0 ms
Task iterations:                     2789253

=== Backup Time ===
RPC service time:                 87.7 ms avg / max  111.3 ms /  8.90% avg of total recovery
  startReadingData:                0.1 ms avg / max    0.1 ms /  0.01% avg of total recovery
  Open/write segment:             82.5 ms avg / max  105.9 ms /  8.38% avg of total recovery
    Open segment memset:           0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
    Copy:                         82.1 ms avg / max  105.3 ms /  8.34% avg of total recovery
    Other:                         0.4 ms avg / max    0.5 ms /  0.04% avg of total recovery
  getRecoveryData:                 2.3 ms avg / max    4.4 ms /  0.23% avg of total recovery
  Other:                           2.8 ms avg / max    3.2 ms /  0.29% avg of total recovery
Transmitting in transport:        24.1 ms avg / max   29.2 ms /  2.45% avg of total recovery
Filtering segments:               54.0 ms avg / max   60.7 ms /  5.48% avg of total recovery
Reading segments:                866.8 ms avg / max  957.4 ms / 88.01% avg of total recovery
  Using disk:                    864.0 ms avg / max  954.7 ms / 87.72% avg of total recovery
getRecoveryData completions:    13 avg / max 14
getRecoveryData retry fraction: 0.798 avg / max 0.959

=== Efficiency ===
recoverSegment CPU:    6.64 ms avg  (per filtered segment)
Writing a segment:     0.49 ms avg  (backup RPC thread)
Filtering a segment:   4.32 ms avg

=== Network Utilization ===
Aggregate:                       34.15 Gb/s / 17.79% of network capacity  (overall)
Master in:                       7.42 Gb/s  (overall)
Master out:                      15.10 Gb/s  (overall)
  Master out during replication: 17.56 Gb/s  (overall)
  Master out during log sync:    19.49 Gb/s  (overall)
Backup in:                       3.17 Gb/s avg / min 2.03 Gb/s / 19.04 Gb/s total  (overall)
Backup out:                      3.17 Gb/s avg / min 0.70 Gb/s / 19.04 Gb/s total  (overall)

=== Disk Utilization ===
Effective bandwidth: 116.42 MB/s avg / min 105.59 MB/s / 698.54 MB/s total
Active bandwidth:    122.46 MB/s avg / min 113.82 MB/s / 734.74 MB/s total
  Reading:           115.51 MB/s avg / min 107.07 MB/s / 693.07 MB/s total
  Writing:           169.53 MB/s avg / min 141.75 MB/s / 678.11 MB/s total
Disk active:          95.13% avg / max  98.87%  (of total recovery)
  Reading:            87.72% avg / max  96.93%  (of total recovery)
  Writing:             7.41% avg / max  17.19%  (of total recovery)

=== Backup Events ===
Segments read:               12.5 avg / max   14.0
Primary segments loaded:     12.7 avg / max   14.0
Secondary segments loaded:    0.0 avg / max    0.0

=== Slowest Servers ===
Backup opens, writes:              server.rc01 (284.5 ms)
Stalled reading segs from backups: server.rc01 (457.4 ms)
Reading from disk:                 backup.rc03 (107.1 MB/s)
Writing to disk:                   backup.rc01 (141.7 MB/s)
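
The Summary figures in the report above can be cross-checked against each other. The sketch below makes two assumptions that are inferred from the numbers rather than stated on this page: that "MB" means 2^20 bytes, and that total live object space is simply object count times object size. The function name is hypothetical.

```python
MIB = 1 << 20  # assumption: the report's "MB" is 2**20 bytes


def live_object_space_mb(total_objects, object_size_bytes):
    """Live object space in MB, assuming count * size with no overhead."""
    return total_objects * object_size_bytes / MIB


# Values copied from the 1-master Summary above.
space = live_object_space_mb(592415, 1024)
print(f"{space:.2f} MB")  # prints "578.53 MB", matching the report

# Share of the 600.00 MB of recovery segment space that is not live object data.
overhead_fraction = (600.00 - space) / 600.00
print(f"{overhead_fraction:.2%} overhead")
```

Under these assumptions the computed live object space agrees with the reported 578.53 MB, and the difference from the 600.00 MB of recovery segment space is the per-entry overhead.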

20 Masters, 120 Backups:

=== Summary ===
Recovery time:                             1.118 s
Masters:                                      20
Backups:                                     120
Replicas:                                      3
Objects per master:                       592415.0 avg / max 592415.0
Object size:                                1024 bytes avg / max   1024 bytes
Total live objects:                       11848300
Total recovery segment entries:           11848300
Total live object space:                  11570.61 MB
Total recovery segment space w/ overhead: 11999.98 MB
Storage type:                             disk
Log directory:                            /home/rumble/ramcloud/recovery/20110826174510

=== Coordinator Time ===
Total:                              1115.9 ms / 99.80% of total recovery
  Starting recovery on backups:       11.7 ms /  1.05% of total recovery
  Starting recovery on masters:        7.0 ms /  0.63% of total recovery
  Tablets recovered:                   2.3 ms /  0.21% of total recovery
    Completing recovery on backups:    0.8 ms /  0.07% of total recovery
  Set will:                            0.0 ms /  0.00% of total recovery
  Get tablet map:                    145.3 ms / 13.00% of total recovery
  Other:                             949.6 ms / 84.92% of total recovery
Receiving in transport:                6.4 ms /  0.57% of total recovery

=== Master Time ===
Total:                               1041.1 ms avg / max 1101.7 ms / 93.11% avg of total recovery
Waiting for incoming segments:        531.6 ms avg / max  552.4 ms / 47.54% avg of total recovery
Inside recoverSegment:                522.1 ms avg / max  540.8 ms / 46.69% avg of total recovery
  Backup.proceed:                       7.9 ms avg / max    8.9 ms /  0.70% avg of total recovery
  Verify checksum:                      0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
  Segment append:                     333.8 ms avg / max  349.2 ms / 29.85% avg of total recovery
    Segment append copy:               51.3 ms avg / max   56.9 ms /  4.59% avg of total recovery
    Segment append checksum:          153.6 ms avg / max  166.0 ms / 13.74% avg of total recovery
  Other (HT, etc.):                   180.4 ms avg / max  185.7 ms / 16.13% avg of total recovery  (other)
Final log sync:                        12.6 ms avg / max   16.7 ms /  1.12% avg of total recovery
Removing tombstones:                    0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
Other:                                -25.2 ms avg / max   39.4 ms / -2.25% avg of total recovery
Receiving in transport:                 6.5 ms avg / max   10.9 ms /  0.58% avg of total recovery
Transmitting in transport:             38.2 ms avg / max   42.2 ms /  3.41% avg of total recovery
Replicating one segment:                1.3 ms avg / max    1.3 ms
  During replay:                        1.3 ms avg / max    1.3 ms
  During log sync:                      1.1 ms avg / max    1.4 ms
RPC latency replicating one segment:    2.7 ms avg / max    2.8 ms  (for R-th replica)
  During replay:                        2.7 ms avg / max    2.8 ms  (for R-th replica)
  During log sync:                      1.5 ms avg / max    2.1 ms  (for R-th replica)
Replication:                          942.2 ms avg / max  955.5 ms / 84.26% avg of total recovery
Client RPCs Active:                   985.7 ms avg / max  998.0 ms / 88.15% avg of total recovery
Average GRD completion time:            2.4 ms avg / max    2.4 ms
Task iterations:                     3193584 avg / max 3519448

=== Backup Time ===
RPC service time:                 80.0 ms avg / max  263.2 ms /  7.15% avg of total recovery
  startReadingData:                0.1 ms avg / max    0.2 ms /  0.01% avg of total recovery
  Open/write segment:             75.1 ms avg / max  131.8 ms /  6.72% avg of total recovery
    Open segment memset:           0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
    Copy:                         74.7 ms avg / max  131.1 ms /  6.68% avg of total recovery
    Other:                         0.4 ms avg / max    0.7 ms /  0.04% avg of total recovery
  getRecoveryData:                 2.2 ms avg / max  158.2 ms /  0.20% avg of total recovery
  Other:                           2.6 ms avg / max   43.2 ms /  0.23% avg of total recovery
Transmitting in transport:        27.7 ms avg / max   42.2 ms /  2.48% avg of total recovery
Filtering segments:               68.5 ms avg / max   91.0 ms /  6.13% avg of total recovery
Reading segments:                845.2 ms avg / max 1087.4 ms / 75.59% avg of total recovery
  Using disk:                    842.1 ms avg / max 1083.0 ms / 75.31% avg of total recovery
getRecoveryData completions:    250 avg / max 300
getRecoveryData retry fraction: 0.063 avg / max 0.992

=== Efficiency ===
recoverSegment CPU:    0.35 ms avg  (per filtered segment)
Writing a segment:     0.44 ms avg  (backup RPC thread)
Filtering a segment:   5.48 ms avg

=== Network Utilization ===
Aggregate:                       601.92 Gb/s / 15.67% of network capacity  (overall)
Master in:                       6.47 Gb/s avg / min 5.20 Gb/s / 129.34 Gb/s total  (overall)
Master out:                      13.30 Gb/s avg / min 13.08 Gb/s / 265.98 Gb/s total  (overall)
  Master out during replication: 15.78 Gb/s avg / min 15.57 Gb/s / 315.70 Gb/s total  (overall)
  Master out during log sync:    15.83 Gb/s avg / min 11.34 Gb/s / 316.68 Gb/s total  (overall)
Backup in:                       2.80 Gb/s avg / min 1.29 Gb/s / 335.45 Gb/s total  (overall)
Backup out:                      2.80 Gb/s avg / min 0.56 Gb/s / 335.42 Gb/s total  (overall)

=== Disk Utilization ===
Effective bandwidth: 116.92 MB/s avg / min  78.70 MB/s / 14030.17 MB/s total
Active bandwidth:    122.00 MB/s avg / min  81.26 MB/s / 14640.46 MB/s total
  Reading:           118.93 MB/s avg / min  73.87 MB/s / 14271.83 MB/s total
  Writing:           135.00 MB/s avg / min  83.26 MB/s / 16064.48 MB/s total
Disk active:          95.84% avg / max  99.50%  (of total recovery)
  Reading:            75.31% avg / max  96.85%  (of total recovery)
  Writing:            20.53% avg / max  30.30%  (of total recovery)

=== Backup Events ===
Segments read:               12.5 avg / max   15.0
Primary segments loaded:    250.3 avg / max  300.0
Secondary segments loaded:    0.0 avg / max    0.0

=== Slowest Servers ===
Backup opens, writes:              server.rc09 (351.9 ms)
Stalled reading segs from backups: server.rc09 (552.4 ms)
Reading from disk:                 backup.rc22 (73.9 MB/s)
Writing to disk:                   backup.rc14 (83.3 MB/s)
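
One rough way to read the two Figure 8 endpoints above is to divide the total recovery segment space by the recovery time, giving an aggregate recovery throughput. This is a derived figure, not one the reports print, and each report here appears to be a single run, whereas the paper averaged 5 runs. The function name is hypothetical.

```python
# (masters, recovery segment space in MB, recovery time in s), copied from
# the two Figure 8 Summary sections above.
runs = [
    (1, 600.00, 0.985),
    (20, 11999.98, 1.118),
]


def throughput_mb_per_s(space_mb, seconds):
    """Aggregate recovery throughput: data recovered per second of recovery."""
    return space_mb / seconds


base = throughput_mb_per_s(runs[0][1], runs[0][2])
for masters, space_mb, seconds in runs:
    rate = throughput_mb_per_s(space_mb, seconds)
    print(f"{masters:2d} masters: {rate:8.1f} MB/s, {rate / base:5.2f}x the 1-master rate")
```

With these inputs the 20-master run recovers about 17.6 times as much data per second as the 1-master run, for 20 times the data and servers.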

Figure 9 (Proportional Scaling, 2 to 60 masters, 2 backup processes per master)

After rerunning a few of the data points from this graph, it looks like we (unfortunately) scaled the cluster by adding processes two machines at a time, and only afterwards started to double up master/backup processes with additional backup processes on the same machine. Doubling up processes right away does not reproduce the paper's results, even with the code from SOSP '11.

2 Masters, 4 Backups:

=== Summary ===
Recovery time:                             1.290 s
Masters:                                       2
Backups:                                       4
Replicas:                                      3
Objects per master:                       592415.0 avg / max 592415.0
Object size:                                1024 bytes avg / max   1024 bytes
Total live objects:                       1184830
Total recovery segment entries:           1184830
Total live object space:                  1157.06 MB
Total recovery segment space w/ overhead: 1200.00 MB
Storage type:                             disk
Log directory:                            /home/rumble/ramcloud/recovery/20110830182651

=== Coordinator Time ===
Total:                              1287.5 ms / 99.78% of total recovery
  Starting recovery on backups:        3.5 ms /  0.27% of total recovery
  Starting recovery on masters:        0.5 ms /  0.04% of total recovery
  Tablets recovered:                   0.3 ms /  0.03% of total recovery
    Completing recovery on backups:    0.1 ms /  0.01% of total recovery
  Set will:                            0.0 ms /  0.00% of total recovery
  Get tablet map:                     26.1 ms /  2.02% of total recovery
  Other:                            1257.0 ms / 97.42% of total recovery
Receiving in transport:               14.7 ms /  1.14% of total recovery

=== Master Time ===
Total:                                639.0 ms avg / max 1278.0 ms / 49.52% avg of total recovery
Waiting for incoming segments:        558.8 ms avg / max  572.9 ms / 43.31% avg of total recovery
Inside recoverSegment:                633.2 ms avg / max  647.1 ms / 49.07% avg of total recovery
  Backup.proceed:                      13.2 ms avg / max   13.6 ms /  1.02% avg of total recovery
  Verify checksum:                      0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
  Segment append:                     412.3 ms avg / max  419.6 ms / 31.95% avg of total recovery
    Segment append copy:               56.9 ms avg / max   59.4 ms /  4.41% avg of total recovery
    Segment append checksum:          189.3 ms avg / max  194.3 ms / 14.67% avg of total recovery
  Other (HT, etc.):                   207.7 ms avg / max  213.8 ms / 16.10% avg of total recovery  (other)
Final log sync:                        77.3 ms avg / max   80.8 ms /  5.99% avg of total recovery
Removing tombstones:                    0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
Other:                               -630.3 ms avg / max   12.1 ms / -48.85% avg of total recovery
Receiving in transport:                 8.3 ms avg / max    8.5 ms /  0.64% avg of total recovery
Transmitting in transport:            123.8 ms avg / max  126.3 ms /  9.59% avg of total recovery
Replicating one segment:                1.6 ms avg / max    1.6 ms
  During replay:                        1.6 ms avg / max    1.6 ms
  During log sync:                      1.2 ms avg / max    1.2 ms
RPC latency replicating one segment:    3.4 ms avg / max    3.4 ms  (for R-th replica)
  During replay:                        3.5 ms avg / max    3.5 ms  (for R-th replica)
  During log sync:                      2.4 ms avg / max    2.5 ms  (for R-th replica)
Replication:                         1227.0 ms avg / max 1230.7 ms / 95.09% avg of total recovery
Client RPCs Active:                  1249.5 ms avg / max 1251.7 ms / 96.84% avg of total recovery
Average GRD completion time:           26.7 ms avg / max   26.8 ms
Task iterations:                     1950173 avg / max 1971082

=== Backup Time ===
RPC service time:                407.2 ms avg / max  440.4 ms / 31.56% avg of total recovery
  startReadingData:                0.1 ms avg / max    0.1 ms /  0.01% avg of total recovery
  Open/write segment:            397.9 ms avg / max  433.3 ms / 30.84% avg of total recovery
    Open segment memset:           0.1 ms avg / max    0.1 ms /  0.00% avg of total recovery
    Copy:                        395.8 ms avg / max  431.2 ms / 30.68% avg of total recovery
    Other:                         2.0 ms avg / max    2.1 ms /  0.16% avg of total recovery
  getRecoveryData:                 6.0 ms avg / max    7.8 ms /  0.46% avg of total recovery
  Other:                           3.3 ms avg / max    3.5 ms /  0.26% avg of total recovery
Transmitting in transport:       119.2 ms avg / max  126.3 ms /  9.24% avg of total recovery
Filtering segments:              345.2 ms avg / max  370.2 ms / 26.75% avg of total recovery
Reading segments:               1131.8 ms avg / max 1182.3 ms / 87.72% avg of total recovery
  Using disk:                   1116.9 ms avg / max 1132.1 ms / 86.56% avg of total recovery
getRecoveryData completions:    76 avg / max 78
getRecoveryData retry fraction: 0.774 avg / max 0.823

=== Efficiency ===
recoverSegment CPU:    4.19 ms avg  (per filtered segment)
Writing a segment:     0.78 ms avg  (backup RPC thread)
Filtering a segment:   9.21 ms avg

=== Network Utilization ===
Aggregate:                       54.51 Gb/s / 42.58% of network capacity  (overall)
Master in:                       9.03 Gb/s avg / min 8.82 Gb/s / 18.07 Gb/s total  (overall)
Master out:                      12.72 Gb/s avg / min 12.69 Gb/s / 25.43 Gb/s total  (overall)
  Master out during replication: 13.32 Gb/s avg / min 13.31 Gb/s / 26.64 Gb/s total  (overall)
  Master out during log sync:    16.86 Gb/s avg / min 16.54 Gb/s / 33.72 Gb/s total  (overall)
Backup in:                       7.27 Gb/s avg / min 5.38 Gb/s / 29.07 Gb/s total  (overall)
Backup out:                      7.27 Gb/s avg / min 1.79 Gb/s / 29.07 Gb/s total  (overall)
Master Infiniband TX Active:      1.2 s avg / min  1.2 s /  2.4 s total

=== Disk Utilization ===
Effective bandwidth: 257.30 MB/s avg / min 254.20 MB/s / 1029.20 MB/s total
Active bandwidth:    261.67 MB/s avg / min 259.48 MB/s / 1046.66 MB/s total
  Reading:           268.59 MB/s avg / min 268.36 MB/s / 1074.38 MB/s total
  Writing:           210.99 MB/s avg / min 198.68 MB/s / 843.94 MB/s total
Disk active:          98.33% avg / max  99.47%  (of total recovery)
  Reading:            86.56% avg / max  87.73%  (of total recovery)
  Writing:            11.77% avg / max  12.48%  (of total recovery)

=== Backup Events ===
Segments read:               37.5 avg / max   38.0
Primary segments loaded:     75.5 avg / max   78.0
Secondary segments loaded:    0.0 avg / max    0.0

=== Slowest Servers ===
Backup opens, writes:              server.rc02 (362.5 ms)
Stalled reading segs from backups: server.rc02 (572.9 ms)
Reading from disk:                 backup.rc02 (268.4 MB/s)
Writing to disk:                   backup.rc02 (198.7 MB/s)

10 Masters, 20 Backups:

=== Summary ===
Recovery time:                             1.410 s
Masters:                                      10
Backups:                                      20
Total nodes:                                  20
Replicas:                                      3
Objects per master:                       592415.0 avg / max 592415.0
Object size:                                1024 bytes avg / max   1024 bytes
Total live objects:                       5924150
Total recovery segment entries:           5924150
Total live object space:                  5785.30 MB
Total recovery segment space w/ overhead: 5999.99 MB
Storage type:                             disk
Log directory:                            /home/stutsman/src/ramcloud-sosp/recovery/20121018083553

=== Coordinator Time ===
Total:                              1407.7 ms / 99.85% of total recovery
  Starting recovery on backups:        4.5 ms /  0.32% of total recovery
  Starting recovery on masters:        2.2 ms /  0.16% of total recovery
  Tablets recovered:                   1.0 ms /  0.07% of total recovery
    Completing recovery on backups:    0.2 ms /  0.01% of total recovery
  Set will:                            0.0 ms /  0.00% of total recovery
  Get tablet map:                     75.4 ms /  5.35% of total recovery
  Other:                            1324.7 ms / 93.96% of total recovery
Receiving in transport:                8.9 ms /  0.63% of total recovery

=== Master Time ===
Total:                               1398.7 ms avg / max 1404.1 ms / 99.21% avg of total recovery
Waiting for incoming segments:        562.7 ms avg / max  585.6 ms / 39.91% avg of total recovery
Inside recoverSegment:                784.0 ms avg / max  827.1 ms / 55.61% avg of total recovery
  Backup.proceed:                      21.1 ms avg / max   28.3 ms /  1.50% avg of total recovery
  Verify checksum:                      0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
  Segment append:                     464.6 ms avg / max  487.4 ms / 32.96% avg of total recovery
    Segment append copy:               49.6 ms avg / max   55.2 ms /  3.52% avg of total recovery
    Segment append checksum:          224.4 ms avg / max  235.0 ms / 15.92% avg of total recovery
  Other (HT, etc.):                   298.3 ms avg / max  317.9 ms / 21.16% avg of total recovery  (other)
Final log sync:                        13.0 ms avg / max   15.4 ms /  0.92% avg of total recovery
Removing tombstones:                    0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
Other:                                 38.9 ms avg / max   53.2 ms /  2.76% avg of total recovery
Receiving in transport:                14.0 ms avg / max   24.8 ms /  0.99% avg of total recovery
Transmitting in transport:            152.7 ms avg / max  164.8 ms / 10.83% avg of total recovery
Replicating one segment:                1.6 ms avg / max    1.7 ms
  During replay:                        1.6 ms avg / max    1.7 ms
  During log sync:                      1.1 ms avg / max    1.3 ms
RPC latency replicating one segment:    2.7 ms avg / max    2.9 ms  (for R-th replica)
  During replay:                        2.7 ms avg / max    2.9 ms  (for R-th replica)
  During log sync:                      1.5 ms avg / max    1.9 ms  (for R-th replica)
Replication:                         1291.3 ms avg / max 1313.0 ms / 91.59% avg of total recovery
Client RPCs Active:                  1363.4 ms avg / max 1370.1 ms / 96.71% avg of total recovery
Average GRD completion time:            6.3 ms avg / max    6.3 ms
Task iterations:                     2984825 avg / max 3355883

=== Backup Time ===
RPC service time:                379.0 ms avg / max  545.4 ms / 26.88% avg of total recovery
  startReadingData:                0.2 ms avg / max    0.2 ms /  0.01% avg of total recovery
  Open/write segment:            369.5 ms avg / max  535.5 ms / 26.21% avg of total recovery
    Open segment memset:           0.1 ms avg / max    0.1 ms /  0.00% avg of total recovery
    Copy:                        367.8 ms avg / max  532.5 ms / 26.09% avg of total recovery
    Other:                         1.7 ms avg / max    2.8 ms /  0.12% avg of total recovery
  getRecoveryData:                 6.1 ms avg / max   27.4 ms /  0.43% avg of total recovery
  Other:                           3.2 ms avg / max    7.9 ms /  0.23% avg of total recovery
Transmitting in transport:       130.5 ms avg / max  164.8 ms /  9.25% avg of total recovery
Filtering segments:              248.6 ms avg / max  316.2 ms / 17.64% avg of total recovery
Reading segments:               1310.0 ms avg / max 1372.7 ms / 92.92% avg of total recovery
  Using disk:                   1304.0 ms avg / max 1370.3 ms / 92.49% avg of total recovery
getRecoveryData completions:    376 avg / max 390
getRecoveryData retry fraction: 0.365 avg / max 0.898

=== Efficiency ===
recoverSegment CPU:                 1.04 ms avg  (per filtered segment)
Writing a segment:                  0.73 ms avg  (backup RPC thread)
Filtering a segment:                6.63 ms avg
Memory bandwidth (backup copies):   2.51 GB/s avg / min   1.85 GB/s

=== Network Utilization ===
Aggregate:                       249.47 Gb/s / 49.89% of network capacity  (overall)
Master in:                       8.50 Gb/s avg / min 7.98 Gb/s / 84.95 Gb/s total  (overall)
Master out:                      11.63 Gb/s avg / min 11.62 Gb/s / 116.29 Gb/s total  (overall)
  Master out during replication: 12.70 Gb/s avg / min 12.47 Gb/s / 127.00 Gb/s total  (overall)
  Master out during log sync:    15.15 Gb/s avg / min 12.29 Gb/s / 151.52 Gb/s total  (overall)
Backup in:                       6.65 Gb/s avg / min 4.39 Gb/s / 133.01 Gb/s total  (overall)
Backup out:                      6.65 Gb/s avg / min 1.64 Gb/s / 133.01 Gb/s total  (overall)

=== Disk Utilization ===
Effective bandwidth: 226.41 MB/s avg / min 221.30 MB/s / 4528.14 MB/s total
Active bandwidth:    230.77 MB/s avg / min 222.62 MB/s / 4615.49 MB/s total
  Reading:           230.25 MB/s avg / min 219.16 MB/s / 4604.97 MB/s total
  Writing:           257.32 MB/s avg / min 181.80 MB/s / 4889.07 MB/s total
Disk active:          98.12% avg / max  99.54%  (of total recovery)
  Reading:            92.49% avg / max  97.20%  (of total recovery)
  Writing:             5.63% avg / max  10.77%  (of total recovery)

=== Backup Events ===
Segments read:               37.5 avg / max   39.0
Primary segments loaded:    375.5 avg / max  390.0
Secondary segments loaded:    0.0 avg / max    0.0

=== Slowest Servers ===
Backup opens, writes:              server.rc03 (366.2 ms)
Stalled reading segs from backups: server.rc03 (585.6 ms)
Reading from disk:                 server.rc02 (219.2 MB/s)
Writing to disk:                   server.rc11 (181.8 MB/s)

60 Masters, 120 Backups:

=== Summary ===
Recovery time:                             1.624 s
Masters:                                      60
Backups:                                     120
Replicas:                                      3
Objects per master:                       592415.0 avg / max 592415.0
Object size:                                1024 bytes avg / max   1024 bytes
Total live objects:                       35544900
Total recovery segment entries:           35544900
Total live object space:                  34711.82 MB
Total recovery segment space w/ overhead: 35999.95 MB
Storage type:                             disk
Log directory:                            /home/rumble/ramcloud/recovery/20110830204624

=== Coordinator Time ===
Total:                              1619.4 ms / 99.74% of total recovery
  Starting recovery on backups:       35.4 ms /  2.18% of total recovery
  Starting recovery on masters:       57.0 ms /  3.51% of total recovery
  Tablets recovered:                   9.8 ms /  0.60% of total recovery
    Completing recovery on backups:    2.2 ms /  0.14% of total recovery
  Set will:                            0.0 ms /  0.00% of total recovery
  Get tablet map:                    442.9 ms / 27.28% of total recovery
  Other:                            1074.4 ms / 66.17% of total recovery
Receiving in transport:                4.5 ms /  0.28% of total recovery

=== Master Time ===
Total:                               1494.6 ms avg / max 1557.2 ms / 92.05% avg of total recovery
Waiting for incoming segments:        850.4 ms avg / max  893.9 ms / 52.38% avg of total recovery
Inside recoverSegment:                596.6 ms avg / max  655.7 ms / 36.75% avg of total recovery
  Backup.proceed:                       6.8 ms avg / max   10.4 ms /  0.42% avg of total recovery
  Verify checksum:                      0.0 ms avg / max    0.0 ms /  0.00% avg of total recovery
  Segment append:                     389.0 ms avg / max  436.7 ms / 23.96% avg of total recovery
    Segment append copy:               57.6 ms avg / max   68.5 ms /  3.54% avg of total recovery
    Segment append checksum:          183.0 ms avg / max  221.6 ms / 11.27% avg of total recovery
  Other (HT, etc.):                   200.8 ms avg / max  218.6 ms / 12.36% avg of total recovery  (other)
Final log sync:                        22.1 ms avg / max   37.1 ms /  1.36% avg of total recovery
Removing tombstones:                    0.0 ms avg / max    0.1 ms /  0.00% avg of total recovery
Other:                                 25.4 ms avg / max   65.2 ms /  1.57% avg of total recovery
Receiving in transport:                16.4 ms avg / max   21.7 ms /  1.01% avg of total recovery
Transmitting in transport:            115.4 ms avg / max  138.6 ms /  7.11% avg of total recovery
Replicating one segment:                1.9 ms avg / max    2.0 ms
  During replay:                        1.9 ms avg / max    2.0 ms
  During log sync:                      1.4 ms avg / max    2.1 ms
RPC latency replicating one segment:    3.9 ms avg / max    4.4 ms  (for R-th replica)
  During replay:                        3.9 ms avg / max    4.5 ms  (for R-th replica)
  During log sync:                      2.5 ms avg / max    4.6 ms  (for R-th replica)
Replication:                         1494.0 ms avg / max 1536.1 ms / 92.02% avg of total recovery
Client RPCs Active:                  1508.1 ms avg / max 1545.9 ms / 92.88% avg of total recovery
Average GRD completion time:            1.2 ms avg / max    1.2 ms
Task iterations:                     2965835 avg / max 3258522

=== Backup Time ===
RPC service time:                309.5 ms avg / max  398.2 ms / 19.06% avg of total recovery
  startReadingData:                0.6 ms avg / max    0.9 ms /  0.04% avg of total recovery
  Open/write segment:            300.9 ms avg / max  389.7 ms / 18.54% avg of total recovery
    Open segment memset:           0.1 ms avg / max    1.1 ms /  0.00% avg of total recovery
    Copy:                        299.1 ms avg / max  387.4 ms / 18.42% avg of total recovery
    Other:                         1.8 ms avg / max    8.1 ms /  0.11% avg of total recovery
  getRecoveryData:                 5.3 ms avg / max   10.5 ms /  0.33% avg of total recovery
  Other:                           2.6 ms avg / max    6.4 ms /  0.16% avg of total recovery
Transmitting in transport:       112.2 ms avg / max  138.6 ms /  6.91% avg of total recovery
Filtering segments:              325.9 ms avg / max  369.2 ms / 20.07% avg of total recovery
Reading segments:               1121.6 ms avg / max 1151.5 ms / 69.08% avg of total recovery
  Using disk:                   1118.3 ms avg / max 1133.8 ms / 68.88% avg of total recovery
getRecoveryData completions:    2256 avg / max 2280
getRecoveryData retry fraction: 0.001 avg / max 0.079

=== Efficiency ===
recoverSegment CPU:    0.13 ms avg  (per filtered segment)
Writing a segment:     0.59 ms avg  (backup RPC thread)
Filtering a segment:   8.67 ms avg

=== Network Utilization ===
Aggregate:                       1300.67 Gb/s / 33.87% of network capacity  (overall)
Master in:                       7.24 Gb/s avg / min 6.43 Gb/s / 434.55 Gb/s total  (overall)
Master out:                      10.11 Gb/s avg / min 10.08 Gb/s / 606.39 Gb/s total  (overall)
  Master out during replication: 10.96 Gb/s avg / min 10.66 Gb/s / 657.46 Gb/s total  (overall)
  Master out during log sync:    14.22 Gb/s avg / min 9.31 Gb/s / 852.95 Gb/s total  (overall)
Backup in:                       5.78 Gb/s avg / min 3.54 Gb/s / 693.24 Gb/s total  (overall)
Backup out:                      5.78 Gb/s avg / min 1.42 Gb/s / 693.01 Gb/s total  (overall)
Master Infiniband TX Active:      1.5 s avg / min  1.4 s / 89.0 s total

=== Disk Utilization ===
Effective bandwidth: 237.33 MB/s avg / min 187.23 MB/s / 28479.37 MB/s total
Active bandwidth:    246.69 MB/s avg / min 237.17 MB/s / 29602.49 MB/s total
  Reading:           268.85 MB/s avg / min 263.45 MB/s / 32262.50 MB/s total
  Writing:           191.45 MB/s avg / min 165.05 MB/s / 22400.18 MB/s total
Disk active:          96.35% avg / max  99.30%  (of total recovery)
  Reading:            68.88% avg / max  69.83%  (of total recovery)
  Writing:            27.48% avg / max  31.45%  (of total recovery)

=== Backup Events ===
Segments read:               37.6 avg / max   38.0
Primary segments loaded:   2255.5 avg / max 2280.0
Secondary segments loaded:    0.0 avg / max    0.0

=== Slowest Servers ===
Backup opens, writes:              server.rc48 (537.5 ms)
Stalled reading segs from backups: server.rc59 (893.9 ms)
Reading from disk:                 server.rc21 (263.5 MB/s)
Writing to disk:                   backup.rc38 (165.1 MB/s)
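
The negative "Other" values under Master Time in some of the reports above look like a residual: in all five reports, "Other" matches Total minus the four other top-level phases to within rounding. This relationship is inferred from the numbers, not taken from the metrics script, and `residual_other` is a hypothetical name. A minimal check:

```python
# Top-level Master Time averages in ms, copied from the reports above:
# (label, total, waiting, recover_segment, log_sync, tombstones, reported_other)
reports = [
    ("1 master", 0.0, 457.4, 504.7, 9.7, 0.0, -971.8),
    ("20 masters", 1041.1, 531.6, 522.1, 12.6, 0.0, -25.2),
    ("2 masters", 639.0, 558.8, 633.2, 77.3, 0.0, -630.3),
    ("10 masters", 1398.7, 562.7, 784.0, 13.0, 0.0, 38.9),
    ("60 masters", 1494.6, 850.4, 596.6, 22.1, 0.0, 25.4),
]


def residual_other(total, *phases):
    """Time not attributed to any listed phase (assumed definition)."""
    return total - sum(phases)


for label, total, waiting, recover, sync, tomb, reported in reports:
    derived = residual_other(total, waiting, recover, sync, tomb)
    # Inputs are rounded to 0.1 ms, so allow a small tolerance.
    ok = abs(derived - reported) <= 0.15
    print(f"{label:>10}: derived {derived:8.1f} ms, reported {reported:8.1f} ms, match={ok}")
```

If this reading is right, a large negative "Other" mostly signals a suspiciously small Total rather than a real phase. In the 2-master report the Total average (639.0 ms) is exactly half the maximum (1278.0 ms), which would be consistent with one master reporting a total of zero, though nothing on this page confirms that.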