Benchmarking Ideas

New idea: a TOCS-style timeline for indexed operations, like the one Henry made; one way to instrument it is sketched below.
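
A minimal sketch of such instrumentation: record a timestamp at each stage of one indexed operation and report per-stage deltas. The Timeline helper and the stage names are illustrative, not RAMCloud code; RAMCloud's Cycles utilities could replace std::chrono.

    // Sketch: per-stage timing for one indexed operation (hypothetical API).
    #include <chrono>
    #include <cstdio>
    #include <string>
    #include <utility>
    #include <vector>

    struct Timeline {
        using Clock = std::chrono::steady_clock;
        std::vector<std::pair<std::string, Clock::time_point>> marks;

        void mark(const std::string& stage) {
            marks.emplace_back(stage, Clock::now());
        }
        void report() const {
            // Print the time spent between consecutive marks.
            for (size_t i = 1; i < marks.size(); i++) {
                long us = (long) std::chrono::duration_cast<
                        std::chrono::microseconds>(
                        marks[i].second - marks[i - 1].second).count();
                std::printf("%-32s %6ld us\n", marks[i].first.c_str(), us);
            }
        }
    };

    int main() {
        Timeline t;
        t.mark("start");
        // ... client builds and sends the write RPC ...
        t.mark("request sent");
        // ... master writes the object, index servers insert entries ...
        t.mark("object + index entries written");
        // ... client receives the reply ...
        t.mark("reply received");
        t.report();
    }
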


  • End-to-end latency measurements
    • End-to-end write latency with x indexes, each index entry of size y.
      • Vary x from 0 to n: pre-fill the index with entries, then measure the time to write one object, delete it, and write it again. Repeat to get mean/variance/min/max, keeping y fixed (say 30 B). This gives write performance as a function of the number of indexes written (see the harness sketch after this subsection).
      • For a given x: latency as a function of index size (as more and more objects are written, the indexes fill up and lookups take longer); keep y fixed (say 30 B).
      • Vary y.
    • End-to-end remove latency: Details same as write.
    • End-to-end lookup+indexedRead latency
      • Details same as write.
      • Vary the lookup range so that different numbers of objects match.
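
A minimal harness for the write-latency experiment above, assuming placeholder client calls: writeIndexedObject() and removeObject() are stand-ins for the real SLIK client API, and only the measurement loop is the point.

    // Sketch: end-to-end write latency for one object carrying x index
    // entries of y bytes each; reports mean/stddev/min/max over many trials.
    #include <algorithm>
    #include <chrono>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    void writeIndexedObject(int numIndexes, int entrySize) { /* placeholder */ }
    void removeObject() { /* placeholder */ }

    int main() {
        const int numIndexes = 4;    // x: sweep 0..n in the real experiment
        const int entrySize = 30;    // y: bytes per index entry, held fixed
        const int trials = 1000;

        std::vector<double> us(trials);
        for (int i = 0; i < trials; i++) {
            auto start = std::chrono::steady_clock::now();
            writeIndexedObject(numIndexes, entrySize);
            auto stop = std::chrono::steady_clock::now();
            removeObject();          // delete it so the index size stays fixed
            us[i] = std::chrono::duration<double, std::micro>(stop - start).count();
        }

        double sum = 0, sq = 0;
        for (double v : us) { sum += v; sq += v * v; }
        double mean = sum / trials;
        double var = std::max(0.0, sq / trials - mean * mean);
        std::printf("mean %.2f us, stddev %.2f, min %.2f, max %.2f\n",
                    mean, std::sqrt(var),
                    *std::min_element(us.begin(), us.end()),
                    *std::max_element(us.begin(), us.end()));
    }
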
  • Throughput / bandwidth measurements: vary the number of clients c from 1 to n (harness sketch after this subsection).
    • Writes, reads, removes, or a mix
    • Same index server vs. different index servers
    • One indexlet vs. multiple indexlets
    • One table vs multiple tables
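
A sketch of the throughput sweep: spawn c client threads, run for a fixed window, and sum completed operations. doOneOp() is a placeholder for whatever operation mix the experiment uses.

    // Sketch: aggregate throughput with c concurrent clients.
    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <thread>
    #include <vector>

    void doOneOp() { /* placeholder: write, read, remove, or a mix */ }

    int main() {
        const int numClients = 8;                 // c: sweep from 1 to n
        const auto window = std::chrono::seconds(5);

        std::atomic<bool> stop{false};
        std::atomic<long> completed{0};

        std::vector<std::thread> clients;
        for (int i = 0; i < numClients; i++) {
            clients.emplace_back([&] {
                long local = 0;
                while (!stop.load(std::memory_order_relaxed)) {
                    doOneOp();
                    local++;
                }
                completed += local;               // publish once at the end
            });
        }

        std::this_thread::sleep_for(window);
        stop = true;
        for (auto& t : clients) t.join();

        double secs = std::chrono::duration<double>(window).count();
        std::printf("%d clients: %.0f ops/sec\n", numClients, completed / secs);
    }
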
  • Scalability
    • How performance varies as the index grows and gets spread across multiple nodes. For this check we can partition the index statically at creation time (split-point sketch below).
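
For the static-partitioning check, split points could be computed once at creation time. A sketch that divides a 64-bit key-hash space evenly across k indexlets (hypothetical; the real split would be over secondary-key ranges):

    // Sketch: evenly split a 64-bit hash space across k indexlets.
    #include <cstdint>
    #include <cstdio>

    int main() {
        const int k = 4;                          // number of indexlets
        const uint64_t span = UINT64_MAX / k;
        for (int i = 0; i < k; i++) {
            uint64_t first = (i == 0) ? 0 : i * span + 1;
            uint64_t last = (i == k - 1) ? UINT64_MAX : (i + 1) * span;
            std::printf("indexlet %d: [%016llx, %016llx]\n", i,
                        (unsigned long long) first, (unsigned long long) last);
        }
    }
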
  • Nanobenchmarks: (later)
    • Time to insert an entry for:
      • varying the current number of nodes in the indexlet tree (mostly a sanity check: the slope should match the end-to-end numbers, with a different intercept; sketch after this subsection).
      • varying entry size
      • repeated index entry vs different index entry
    • Time to lookup an entry: Details same as insert.
      • vary range
    • Time to remove an entry: Details same as insert.
    • Break down lookup+indexedRead from end-to-end into the two components
    • Relation between tree fanout and read/write perf
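
For the insert-time nanobenchmark, something like the following could serve as a starting point, with std::map standing in for the indexlet's tree. A real run would use the actual B-tree, pin the thread, and average many timed inserts per size.

    // Sketch: time one insert at several tree sizes.
    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <string>

    int main() {
        std::map<std::string, uint64_t> tree;     // stand-in for the indexlet tree
        const long sizes[] = {1000, 10000, 100000, 1000000};
        long next = 0;
        for (long target : sizes) {
            while ((long) tree.size() < target)   // grow to the target size
                tree.emplace("key" + std::to_string(next++), next);
            auto start = std::chrono::steady_clock::now();
            tree.emplace("key" + std::to_string(next++), next);   // timed insert
            auto stop = std::chrono::steady_clock::now();
            std::printf("size %8ld: insert took %ld ns\n", target,
                        (long) std::chrono::duration_cast<
                                std::chrono::nanoseconds>(stop - start).count());
        }
    }
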
  • Memory footprint
    • We don't want to measure deletion / cleaning; those numbers can be taken from the LSM paper. What we want is the per-entry overhead of an index: create a large index, measure the space it occupies, divide by the number of entries, and compute the overhead (arithmetic sketched below).
    • Can compare the malloc version with the RAMCloud object-allocator version.
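
The overhead computation is simple arithmetic once the space is measured. A sketch, where getHeapBytes() is a hypothetical sampler; it could be wired to mallinfo for the malloc version or to the RAMCloud allocator's statistics:

    // Sketch: per-entry index overhead beyond the raw key + object reference.
    #include <cstdint>
    #include <cstdio>

    uint64_t getHeapBytes() {
        return 0;   // placeholder: wire to mallinfo or allocator stats
    }

    int main() {
        const uint64_t numEntries = 10 * 1000 * 1000;
        const double rawEntryBytes = 30 + 8;      // key bytes + object reference

        uint64_t before = getHeapBytes();
        // ... build the index with numEntries entries here ...
        uint64_t after = getHeapBytes();

        double perEntry = double(after - before) / numEntries;
        std::printf("per-entry %.1f B, overhead %.1f B (%.0f%% of raw)\n",
                    perEntry, perEntry - rawEntryBytes,
                    100.0 * (perEntry - rawEntryBytes) / rawEntryBytes);
    }
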
  • Recovery times
    • Verify that it's the same as recovering a table of that size to one recovery master.
  • Compare with other systems
    • Systems:
      • MySQL
      • HBase?
      • Cassandra?
      • Come up with more
    • Benchmarks:
      • See if there are standard benchmarks (or workloads that one of the above systems uses) that measure indexing performance.
      • Create a table, write n million indexed objects with x indexes.
      • Do the above, then measure random indexed reads (workload sketch below).
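
For the cross-system runs the workload generator can stay identical and only the two client calls change per system. A sketch, with insertIndexedRow() and indexedLookup() as placeholders for each system's client API (SLIK, MySQL, HBase, Cassandra, ...):

    // Sketch: load n indexed rows, then measure random indexed reads.
    #include <chrono>
    #include <cstdio>
    #include <random>

    void insertIndexedRow(long key) { /* placeholder */ }
    void indexedLookup(long secondaryKey) { /* placeholder */ }

    int main() {
        const long numRows = 1000000;             // "n million" in the real run
        const long numReads = 100000;

        for (long i = 0; i < numRows; i++)        // load phase, x indexes per row
            insertIndexedRow(i);

        std::mt19937_64 rng(12345);
        std::uniform_int_distribution<long> pick(0, numRows - 1);

        auto start = std::chrono::steady_clock::now();
        for (long i = 0; i < numReads; i++)
            indexedLookup(pick(rng));             // random indexed read
        auto stop = std::chrono::steady_clock::now();

        double secs = std::chrono::duration<double>(stop - start).count();
        std::printf("%.0f indexed reads/sec\n", numReads / secs);
    }
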
