...

The benchmarks below were executed on separate client and server machines (taken from the Stanford RAMCloud cluster) that are connected via InfiniBand. After each run, the sums calculated on the client side and on the server side were checked for equality. During all runs, the hash table size was set to 5 GB. This particular benchmark allows

Benchmarks over all stored objects (low selectivity)

In this set of benchmarks, all objects stored in a MasterServer are included in the aggregation operation. Consequently, if 1.000.000 objects are aggregated, exactly 1.000.000 objects are stored in the MasterServer.
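
The difference between the two approaches can be pictured with a small toy model. This is only a sketch: `ToyServer`, `read` and `aggregate_sum` are hypothetical names invented for illustration and are not part of the RAMCloud API. The model counts requests instead of measuring time, and it repeats the equality check on the two sums that the benchmarks perform after each run.

```python
class ToyServer:
    """Hypothetical in-memory stand-in for a storage server (not the RAMCloud API)."""

    def __init__(self, objects):
        self.objects = dict(objects)   # key -> integer value
        self.requests = 0              # number of client requests served

    def read(self, key):
        # One request per object: this is the cost of client-side aggregation.
        self.requests += 1
        return self.objects[key]

    def aggregate_sum(self, keys):
        # One request in total: the server sums locally and returns one number.
        self.requests += 1
        return sum(self.objects[k] for k in keys)


def client_side_sum(server, keys):
    """Fetch every object individually and add the values up on the client."""
    return sum(server.read(k) for k in keys)


def server_side_sum(server, keys):
    """Ask the server to compute the sum and return only the result."""
    return server.aggregate_sum(keys)


if __name__ == "__main__":
    keys = range(10_000)
    server = ToyServer((k, k % 7) for k in keys)

    client_total = client_side_sum(server, keys)
    client_requests = server.requests

    server.requests = 0
    server_total = server_side_sum(server, keys)

    # Both approaches must produce the same sum.
    assert client_total == server_total
    print(client_requests, server.requests)
```

In this model the client-side variant issues one request per object, while the server-side variant issues a single request regardless of the number of objects.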

#number of objects	client-side aggregation	server-side aggregation via hash table lookup	server-side aggregation via hash table forEach	server-side aggregation via Log traversal
10.000	63 ms	1 ms	74 ms	6 ms
100.000	648 ms	12 ms	84 ms	9 ms
1.000.000	6485 ms	127 ms	168 ms	21 ms
10.000.000	64258 ms	1378 ms	781 ms	142 ms
100.000.000	652201 ms	19854 ms	6245 ms	1422 ms
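
The speed-up of each server-side variant over client-side aggregation follows directly from the table above. The short script below is a sketch of that arithmetic; the dictionary keys (`client`, `lookup`, `forEach`, `log`) are shorthand chosen here for the four table columns.

```python
# Run times in milliseconds, copied from the table above.
RESULTS_MS = {
    10_000:      {"client": 63,     "lookup": 1,     "forEach": 74,   "log": 6},
    100_000:     {"client": 648,    "lookup": 12,    "forEach": 84,   "log": 9},
    1_000_000:   {"client": 6485,   "lookup": 127,   "forEach": 168,  "log": 21},
    10_000_000:  {"client": 64258,  "lookup": 1378,  "forEach": 781,  "log": 142},
    100_000_000: {"client": 652201, "lookup": 19854, "forEach": 6245, "log": 1422},
}


def speedups(row):
    """Return the client-side time divided by each server-side time."""
    return {name: row["client"] / ms for name, ms in row.items() if name != "client"}


for n, row in RESULTS_MS.items():
    factors = speedups(row)
    line = "  ".join(f"{name}: {factor:6.1f}x" for name, factor in factors.items())
    print(f"{n:>11,}  {line}")
```

For the largest run (100.000.000 objects) this gives a factor of roughly 33 for hash table lookup, 104 for hash table forEach and 459 for Log traversal.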


Benchmarks over a subset of stored objects (high selectivity)

In this set of benchmarks, only a 10% subset of the objects stored in a MasterServer is included in the aggregation operation. Consequently, if 1.000.000 objects are aggregated, 10.000.000 objects are stored in the MasterServer in total.
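
The effect of aggregating only a subset can be sketched by counting how many stored objects each strategy has to touch. The sketch rests on assumptions that the page does not state: that a lookup-based aggregation touches only the selected keys, that a scan-based aggregation visits every stored object and filters, and that the subset is chosen by the hypothetical rule `k % 10 == 0`. A plain Python dict stands in for the server's storage.

```python
def sum_by_lookup(store, selected_keys):
    """Sum only the selected objects; touches len(selected_keys) objects."""
    touched = 0
    total = 0
    for key in selected_keys:
        total += store[key]
        touched += 1
    return total, touched


def sum_by_scan(store, is_selected):
    """Visit every stored object and keep those matching the predicate."""
    touched = 0
    total = 0
    for key, value in store.items():
        touched += 1
        if is_selected(key):
            total += value
    return total, touched


store = {k: 1 for k in range(100_000)}            # all stored objects
selected = [k for k in store if k % 10 == 0]      # 10% subset (hypothetical rule)

print(sum_by_lookup(store, selected))             # (10000, 10000)
print(sum_by_scan(store, lambda k: k % 10 == 0))  # (10000, 100000)
```

Both functions return the same sum, but the scan touches ten times as many objects as the lookup.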

Conclusions

The previous benchmarks allow the following conclusions:

Executing the aggregation on the server side yields a performance improvement of up to a factor of 100x when the hash table is used for the iteration. Directly traversing the Log yields an improvement of up to a factor of 450x (although it is questionable whether a Log traversal would be appropriate for executing server-side data operations).
When traversing a set of distinct objects, retrieving a single object takes about 7-8 µs (in other words, a RAMCloud client can request about 130.000 objects/sec from a single RAMCloud server).
Invoking the hash table forEach method traverses the whole memory allocated for the hash table. This is fine if the hash table is densely packed with objects, but a sparsely populated hash table incurs a penalty: with 10.000 stored objects, forEach takes 74 ms while individual hash table lookups take 1 ms.
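
The forEach penalty for sparse tables can be illustrated with a toy hash table. This is a minimal sketch under stated assumptions: `ToyHashTable` is a hypothetical fixed-capacity table with linear probing and does not reproduce RAMCloud's actual hash table layout. Its `for_each` reports how many slots it had to inspect.

```python
class ToyHashTable:
    """Hypothetical fixed-capacity table with linear probing (not RAMCloud's hash table)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity

    def insert(self, key, value):
        i = hash(key) % len(self.slots)
        # Linear probing; assumes the table never becomes completely full.
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)
        self.slots[i] = (key, value)

    def for_each(self, callback):
        """Visit every slot, occupied or not; return the number of slots inspected."""
        inspected = 0
        for slot in self.slots:
            inspected += 1
            if slot is not None:
                callback(*slot)
        return inspected


def sum_via_for_each(table):
    """Aggregate all values with for_each; return (sum, slots inspected)."""
    total = 0

    def add(_key, value):
        nonlocal total
        total += value

    inspected = table.for_each(add)
    return total, inspected


if __name__ == "__main__":
    sparse = ToyHashTable(100_000)
    for k in range(100):
        sparse.insert(k, 1)

    dense = ToyHashTable(100_000)
    for k in range(90_000):
        dense.insert(k, 1)

    # Both tables inspect all 100,000 slots, regardless of how many are occupied.
    print(sum_via_for_each(sparse))
    print(sum_via_for_each(dense))
```

In this model the cost of `for_each` depends on the table capacity rather than on the number of stored objects, which is why the work per aggregated object grows as the table gets sparser.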
