Number of objects	client-side aggregation	server-side aggregation via hash table lookup	server-side aggregation via hash table forEach	server-side aggregation via Log traversal
10.000	49 ms	1 ms	147 ms	34 ms
100.000	489 ms	27 ms	1330 ms	296 ms
1.000.000	4913 ms	274 ms	15698 ms	2901 ms

selectivity	client-side aggregation	server-side aggregation via hash table lookup	server-side aggregation via hash table forEach	server-side aggregation via Log traversal
100%	206 objects/ms	5036 objects/ms	16012 objects/ms	70323 objects/ms
10%	203 objects/ms	4030 objects/ms	1322 objects/ms	6825 objects/ms
0.5%	203 objects/ms	3650 objects/ms	63 objects/ms	345 objects/ms
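
The throughput figures in the table above can be turned into speedup factors relative to client-side aggregation. The following is a minimal sketch of that arithmetic. The dictionary keys and the `speedups` helper are names chosen for illustration and are not part of RAMCloud.

```python
# Throughput in objects/ms, copied from the selectivity table above.
# The short column names are illustrative labels, not RAMCloud identifiers.
THROUGHPUT = {
    "100%": {"client": 206, "lookup": 5036, "forEach": 16012, "log": 70323},
    "10%": {"client": 203, "lookup": 4030, "forEach": 1322, "log": 6825},
    "0.5%": {"client": 203, "lookup": 3650, "forEach": 63, "log": 345},
}


def speedups(row):
    """Return each server-side throughput divided by the client-side throughput."""
    baseline = row["client"]
    return {name: value / baseline for name, value in row.items() if name != "client"}


if __name__ == "__main__":
    for selectivity, row in THROUGHPUT.items():
        formatted = ", ".join(f"{name}: {factor:.1f}x" for name, factor in speedups(row).items())
        print(f"{selectivity:>5}  {formatted}")
```

A factor below 1 means the server-side variant is slower than client-side aggregation, which is the case for forEach at 0.5% selectivity (63 vs. 203 objects/ms).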

These benchmarks support the following conclusions:

When aggregating all objects stored in a MasterServer (100% of objects selected, i.e. low selectivity), server-side aggregation via hash table lookups is about 25x faster than client-side aggregation, and server-side aggregation via hash table forEach iteration is about 75x faster. Bypassing the hash table and traversing the Log directly is about 340x faster.
When aggregating over a 10% subset of all objects stored in a MasterServer (high selectivity), server-side aggregation via hash table lookups is about 20x faster than client-side aggregation, and server-side aggregation via hash table forEach iteration is about 6x faster. Bypassing the hash table and traversing the Log directly is about 33x faster when going over a total of 10.000.000 objects.
With high selectivity, hash table lookups appear preferable to a forEach iteration for server-side aggregation via the hash table.
When traversing a set of distinct objects, retrieving a single object takes about 7-8 µs (or a RAMCloud client can request about 130.000 objects/sec from a single RAMCloud server).
Invoking the hash table's forEach method traverses the entire memory allocated for the hash table. This is fine if the hash table is densely populated with objects, but a sparsely populated hash table incurs a penalty.
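
The last point can be illustrated with a toy model. The sketch below is not RAMCloud's hash table. `ToyHashTable` and the two helper functions are hypothetical and only count how many buckets each access pattern touches, assuming a fixed-size bucket array that forEach must scan in full.

```python
class ToyHashTable:
    """Hypothetical stand-in: a fixed array of buckets, each a list of (key, value)."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]
        self.buckets_visited = 0  # counts bucket accesses for the comparison

    def insert(self, key, value):
        self.buckets[hash(key) % len(self.buckets)].append((key, value))

    def lookup(self, key):
        # A lookup touches exactly one bucket.
        self.buckets_visited += 1
        for stored_key, value in self.buckets[hash(key) % len(self.buckets)]:
            if stored_key == key:
                return value
        return None

    def for_each(self, callback):
        # forEach touches every bucket, whether it holds objects or not.
        for bucket in self.buckets:
            self.buckets_visited += 1
            for key, value in bucket:
                callback(key, value)


def sum_via_lookups(table, keys):
    """Aggregate by looking up each key; returns (sum, buckets visited)."""
    table.buckets_visited = 0
    total = sum(table.lookup(key) for key in keys)
    return total, table.buckets_visited


def sum_via_for_each(table):
    """Aggregate by iterating the whole table; returns (sum, buckets visited)."""
    table.buckets_visited = 0
    total = [0]

    def add(key, value):
        total[0] += value

    table.for_each(add)
    return total[0], table.buckets_visited


if __name__ == "__main__":
    table = ToyHashTable(num_buckets=100_000)
    for key in range(100):  # sparse: 100 objects in 100,000 buckets
        table.insert(key, key)
    print(sum_via_lookups(table, range(100)))
    print(sum_via_for_each(table))
```

In this model both approaches compute the same sum, but the forEach variant visits every allocated bucket while the lookup variant visits one bucket per requested key, so the gap grows as the table becomes sparser.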

Disaggregation Operation

Number of objects	server-side aggregation via hash table lookup	server-side disaggregation via hash table lookup with 0 backups
10.000	1 ms	4 ms
100.000	11 ms	50 ms
1.000.000	124 ms	515 ms
10.000.000	1413 ms	5411 ms
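
The relative cost of disaggregation compared to aggregation can be read off the table above. The following is a minimal sketch of that arithmetic. `MEASUREMENTS` and `slowdown` are illustrative names, and the values are copied from the table.

```python
# (number of objects, aggregation time in ms, disaggregation time in ms),
# copied from the table above.
MEASUREMENTS = [
    (10_000, 1, 4),
    (100_000, 11, 50),
    (1_000_000, 124, 515),
    (10_000_000, 1413, 5411),
]


def slowdown(aggregation_ms, disaggregation_ms):
    """Return how many times longer disaggregation took than aggregation."""
    return disaggregation_ms / aggregation_ms


if __name__ == "__main__":
    for num_objects, aggregation_ms, disaggregation_ms in MEASUREMENTS:
        factor = slowdown(aggregation_ms, disaggregation_ms)
        print(f"{num_objects:>10} objects: {factor:.2f}x")
```

For these measurements the factor stays between roughly 3.8x and 4.6x across all four object counts.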
