Purpose

This page presents experiments with non-CRUD data operations.

Aggregation Operation

An aggregation operation adds up the values of a number of objects. When executing such an operation in RAMCloud, three questions, among others, are of interest:

  1. Where to execute the aggregation operation (client or server side)?
  2. How to describe the range of objects which should be included in the operation?
  3. How to interpret the objects themselves?

The experiments below focus on the first question: where to execute the operation. Three scenarios are implemented:

  • Client-side aggregation: The client requests a number of objects one by one, where each object contains a single integer value. Consequently, a "Read-RPC" is invoked for every object, and the client computes the sum locally.
  • Server-side aggregation via hash table lookup: A range of keys is passed to the server, and the server performs a lookup in its own hash table for every key in that range. Again, each object contains a single integer, and these integers are added up (as shown in Listing 1). Once the aggregation is done, the resulting sum is sent back to the client via RPC.
  • Server-side aggregation via hash table forEach: The hash table in the MasterServer offers a forEach method that iterates over all objects contained in the hash table. A callback can be registered with that method; the callback used here is shown in Listing 2.
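The client-side scenario has no listing of its own. A minimal sketch of the idea, assuming a hypothetical in-memory `FakeStore` whose `read()` stands in for one Read-RPC (it is not the RAMCloud client API, and all names in it are made up for illustration), might look like this:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a remote server; NOT the RAMCloud client API.
// Each call to read() models one "Read-RPC" round trip.
class FakeStore {
  public:
    void write(uint64_t key, int32_t value) { objects[key] = value; }

    // Returns the integer stored under key and counts the simulated RPC.
    // Throws std::out_of_range if the key does not exist.
    int32_t read(uint64_t key) {
        ++rpcCount;
        return objects.at(key);
    }

    uint64_t rpcCount = 0;

  private:
    std::unordered_map<uint64_t, int32_t> objects;
};

// Client-side aggregation: fetch keys 0..range-1 one by one, sum locally.
uint64_t
clientSideSum(FakeStore& store, uint64_t range)
{
    const uint64_t before = store.rpcCount;
    uint64_t sum = 0;
    for (uint64_t key = 0; key < range; ++key)
        sum += static_cast<uint64_t>(store.read(key));
    // Exactly one simulated round trip was issued per object.
    assert(store.rpcCount - before == range);
    return sum;
}
```

With five objects holding the values 1 to 5, `clientSideSum(store, 5)` returns 15 and `store.rpcCount` ends at 5. The number of round trips grows linearly with the number of objects, which is the cost the two server-side variants avoid.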
Listing 1: Aggregation via looking up a certain range of keys in MasterServer.cc
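for (uint64_t i = 0; i < range; ++i)
{
    // Look up key i of the table in the master's hash table.
    // Assumes every key in [0, range) exists.
    LogEntryHandle handle = objectMap.lookup(tableId, i);
    const Object* obj = handle->userData<Object>();
    // Interpret the start of the payload as a single int.
    const int* p = reinterpret_cast<const int*>(obj->data);
    sum += (uint64_t)*p;
}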
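Listing 1 also touches the third question, how to interpret the objects: it casts the payload pointer to `int*`, which presumes the payload holds at least `sizeof(int)` bytes and is suitably aligned. A more defensive decoding step could be sketched as follows. The helper `decodeInt32` is hypothetical (it is not part of RAMCloud), and host byte order is assumed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the first four payload bytes into *out.
// Returns false if the payload is missing or too short to hold an int32_t.
// Assumes the integer was written in host byte order.
bool
decodeInt32(const char* data, size_t length, int32_t* out)
{
    assert(out != nullptr);
    if (data == nullptr || length < sizeof(int32_t))
        return false;
    // memcpy avoids undefined behavior on unaligned payloads, which a
    // direct pointer cast to int* could trigger.
    std::memcpy(out, data, sizeof(int32_t));
    return true;
}
```

An aggregation loop using such a helper could skip or report objects whose payload is too short instead of reading past the end of the buffer.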
Listing 2: Aggregation using a callback in MasterServer.cc that gets invoked via objectMap.forEach()
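/**
 * Aggregation callback: invoked by objectMap.forEach() once per object.
 * Adds the object's integer payload to the sum kept in the MasterServer
 * that is passed in via cookie.
 */
void
aggregateCallback(LogEntryHandle handle, uint8_t type,
                  void *cookie)
{
    const Object* obj = handle->userData<Object>();
    MasterServer *server = reinterpret_cast<MasterServer*>(cookie);

    // Interpret the start of the payload as a single int.
    const int *p = reinterpret_cast<const int*>(obj->data);
    server->sum += (uint64_t)*p;
}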
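How a callback and its cookie interact with a forEach-style iteration can be tried out in isolation. The following is a minimal sketch with hypothetical stand-in types (`Entry`, `TinyTable`, `Aggregator`); the real MasterServer, hash table, and LogEntryHandle types are not reproduced here:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stored object with a single integer payload.
struct Entry {
    uint64_t key;
    int32_t value;
};

typedef void (*EntryCallback)(const Entry* entry, void* cookie);

// Hypothetical stand-in for the hash table; only iteration matters here.
class TinyTable {
  public:
    void insert(uint64_t key, int32_t value) {
        entries.push_back({key, value});
    }

    // Invokes callback once per stored entry, handing the opaque cookie
    // through unchanged. Returns the number of entries visited.
    uint64_t forEach(EntryCallback callback, void* cookie) const {
        assert(callback != nullptr);
        uint64_t visited = 0;
        for (const Entry& e : entries) {
            callback(&e, cookie);
            ++visited;
        }
        return visited;
    }

  private:
    std::vector<Entry> entries;
};

// Accumulator reached through the cookie pointer.
struct Aggregator {
    uint64_t sum = 0;
};

void
aggregateCallback(const Entry* entry, void* cookie)
{
    Aggregator* agg = static_cast<Aggregator*>(cookie);
    agg->sum += static_cast<uint64_t>(entry->value);
}

// Sums all entries of the table using a fresh accumulator per run.
uint64_t
sumViaForEach(const TinyTable& table)
{
    Aggregator agg;
    table.forEach(aggregateCallback, &agg);
    return agg.sum;
}
```

Two properties of this pattern are worth keeping in mind: the iteration visits every entry rather than a key range, and the accumulator has to start at zero for each run. The sketch gets the latter by creating a fresh `Aggregator` per call.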

Benchmarking

The benchmarks below were executed with the client and the server running on separate machines connected via InfiniBand. After each run, the sums calculated on the client side and on the server side were checked for equality. The benchmarks allow the following conclusions:

  • Executing the aggregation on the server side improves performance by a factor of up to 50.
  • When traversing a set of distinct objects, retrieving a single object takes about 7-8 µs (in other words, a RAMCloud client can request about 130,000 objects/sec from a single RAMCloud server).
  • Invoking the hash table forEach method comes at a penalty of around 200 ms (presumably due to the callback).
  • When traversing large numbers of objects, the forEach method is 1.6-1.8x faster than looking up each individual object.

Number of       Client-side     Server-side aggregation     Server-side aggregation
objects         aggregation     via hash table lookup       via hash table forEach

10,000                75 ms                        2 ms                      238 ms
100,000              766 ms                       23 ms                      251 ms
1,000,000           7604 ms                      233 ms                      369 ms
10,000,000         76515 ms                     2662 ms                     1444 ms
100,000,000       770761 ms                    30049 ms                    18752 ms
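The figures quoted in the conclusions can be re-derived from the measurements above. The sketch below hard-codes the table rows and computes the per-object latency, the client-side throughput, and the speedups. The struct and function names are illustrative only:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// One row of the benchmark table; all times are in milliseconds.
struct Row {
    uint64_t objects;
    double clientMs;
    double lookupMs;
    double forEachMs;
};

static const Row kRows[] = {
    {10000ULL,     75.0,     2.0,     238.0},
    {100000ULL,    766.0,    23.0,    251.0},
    {1000000ULL,   7604.0,   233.0,   369.0},
    {10000000ULL,  76515.0,  2662.0,  1444.0},
    {100000000ULL, 770761.0, 30049.0, 18752.0},
};

// Average client-side time per object, in microseconds.
double
microsPerObject(const Row& r)
{
    assert(r.objects > 0);
    return r.clientMs * 1000.0 / static_cast<double>(r.objects);
}

// Objects a single client retrieved per second in the client-side run.
double
objectsPerSecond(const Row& r)
{
    return static_cast<double>(r.objects) / (r.clientMs / 1000.0);
}

// Speedup of the faster server-side variant over the client-side run.
double
bestServerSpeedup(const Row& r)
{
    return r.clientMs / std::min(r.lookupMs, r.forEachMs);
}

// How many times faster forEach is than per-key lookup (values below 1
// mean that lookup is faster).
double
forEachVsLookup(const Row& r)
{
    return r.lookupMs / r.forEachMs;
}
```

For the row with 10,000,000 objects this gives roughly 7.65 µs per object, about 130,700 objects/sec, and a best server-side speedup of about 53x. The forEach-to-lookup ratio is about 1.84 for that row and about 1.60 for 100,000,000 objects, in line with the 1.6-1.8x range stated above. For the three smaller rows the ratio is below 1, reflecting the fixed forEach overhead.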
