...

  1. When a tombstone is found, for each log element regarding the object:
    1. If the most recent ack_id of its client is higher than the rpc_id of the log element, delete the whole log element.
    2. If it is lower than the rpc_id, delete only the object from the log element and compact. Keep the metadata <table_id, keyHash> and <client_id, rpc_id, ack_id>.
  2. For every iteration on an entry,
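
The tombstone rule in step 1 can be sketched as a small decision function. This is only an illustration under stated assumptions: the LogEntry class, the clean_on_tombstone name, and the latest_ack dictionary are hypothetical and do not come from the design. The rule above does not say what happens when ack_id equals rpc_id; the sketch assumes such an RPC counts as acknowledged, following the definition of the ack id in section 1.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log element: metadata plus an optional object payload."""
    table_id: int
    key_hash: int
    client_id: int
    rpc_id: int
    ack_id: int
    obj: Optional[bytes]  # None once the object payload has been dropped


def clean_on_tombstone(entry: LogEntry,
                       latest_ack: Dict[int, int]) -> Optional[LogEntry]:
    """Return the element to keep after cleaning, or None to delete it.

    latest_ack maps client_id to the most recent ack_id seen from that client.
    """
    client_ack = latest_ack.get(entry.client_id, 0)
    # Assumption: ack_id == rpc_id is treated as acknowledged.
    if client_ack >= entry.rpc_id:
        return None  # RPC acknowledged: delete the whole log element
    # RPC not yet acknowledged: drop the object, keep the metadata
    return replace(entry, obj=None)
```

Keeping the metadata in the second case means the master can still recognise a retry of that RPC even though the object itself is gone.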
 
 
 

1. Idea. How to avoid duplicate processing.
Duplicate processing of an RPC (usually caused by retried RPCs) is avoided by
assigning a unique id to each RPC from a client. A master service keeps each
RPC's id number and its accompanying result, and replies to duplicate RPCs
with the previously saved result.
To reduce the space required to keep such data, a client "acknowledges" its
receipt of RPC results and guarantees that it will not retry those RPCs.
This is done by attaching an "acknowledgement number" (a.k.a. ack id) to each
RPC request. The number indicates that all RPCs whose ids are smaller than or
equal to the ack id have been acknowledged by this client.
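
As an illustration of the client side, the sketch below assigns rpc ids and derives the ack id to attach to each request. The ClientRpcTracker class and its method names are hypothetical. It also assumes that ids are assigned sequentially starting at 1, and that the ack id is the largest id below which no RPC is still waiting for its result; the text above does not fix either choice.

```python
class ClientRpcTracker:
    """Hypothetical client-side bookkeeping for rpc ids and ack ids."""

    def __init__(self):
        self._next_id = 1          # assumption: ids start at 1
        self._outstanding = set()  # ids whose results have not arrived yet

    def ack_id(self):
        """Largest id such that every RPC with id <= ack_id has its result."""
        if self._outstanding:
            return min(self._outstanding) - 1
        return self._next_id - 1

    def new_rpc(self):
        """Allocate an id and return (rpc_id, ack_id) to send with the request."""
        rpc_id = self._next_id
        self._next_id += 1
        self._outstanding.add(rpc_id)
        return rpc_id, self.ack_id()

    def result_received(self, rpc_id):
        """Record that the result arrived; the client will not retry this RPC."""
        self._outstanding.discard(rpc_id)


tracker = ClientRpcTracker()
first = tracker.new_rpc()     # (1, 0): nothing acknowledged yet
second = tracker.new_rpc()    # (2, 0): RPC 1 is still outstanding
tracker.result_received(2)    # ack id stays 0 because RPC 1 is outstanding
tracker.result_received(1)    # now RPCs 1 and 2 are both acknowledged
third = tracker.new_rpc()     # (3, 2)
```

Because the ack id is a single number, one slow RPC holds back the acknowledgement of every later one, which is why the master may have to keep non-contiguous results for a client.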

Missing: Logging on disk?

2. Mechanisms on Master

Actions

<Check duplicate> Not a duplicate: <Process>, <Record Completion>. Duplicate: <Reply with saved result>. - etc. exceptions
<Write on log> needs atomicity to guarantee consistency.

<Cleaner> Does the log cleaner need atomicity?


2.1 Memory data structure.
Each master keeps an UnackedRpcResults object to store the results of
linearizable RPCs. When a master receives an RPC request, it calls
checkDuplicate() to check whether the same RPC is in progress or already
completed. When the processing of the RPC finishes, the master records its
completion in memory with recordCompletion(). On backup storage, it atomically
writes both the result of the RPC and the log record of the RPC's completion.
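
The in-memory part of this flow could look roughly like the sketch below. It is not the actual UnackedRpcResults interface: the class name UnackedRpcResultsSketch, the Python method signatures, the IN_PROGRESS marker, and the choice to raise an error for an RPC the client has already acknowledged are all assumptions made for illustration. The atomic write to backup storage is left out.

```python
class UnackedRpcResultsSketch:
    """Hypothetical in-memory table of results for not-yet-acknowledged RPCs."""

    IN_PROGRESS = object()  # marker stored while an RPC is being processed

    def __init__(self):
        self._results = {}  # client_id -> {rpc_id: saved result or IN_PROGRESS}
        self._max_ack = {}  # client_id -> highest ack id seen so far

    def check_duplicate(self, client_id, rpc_id, ack_id):
        """Return (is_duplicate, saved_result) and mark new RPCs in progress."""
        per_client = self._results.setdefault(client_id, {})
        # A higher ack id lets the master discard results it no longer needs.
        if ack_id > self._max_ack.get(client_id, 0):
            self._max_ack[client_id] = ack_id
            for old_id in [r for r in per_client if r <= ack_id]:
                del per_client[old_id]
        # Assumption: a request for an already acknowledged id is an error.
        if rpc_id <= self._max_ack.get(client_id, 0):
            raise ValueError("RPC %d was already acknowledged" % rpc_id)
        if rpc_id in per_client:
            return True, per_client[rpc_id]
        per_client[rpc_id] = self.IN_PROGRESS
        return False, None

    def record_completion(self, client_id, rpc_id, result):
        """Save the result so that a retry can be answered without re-running."""
        self._results.setdefault(client_id, {})[rpc_id] = result
```

A handler would call check_duplicate() first, process the request only when it is not a duplicate, and then call record_completion(); a retry that arrives while the first attempt is still running sees the IN_PROGRESS marker instead of a saved result.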

 

client_id   processed_list
1           <start of processed rpc_id sequence, end of the sequence>...
2           <11,15><17,17><20,26>
3           <1,1>

Figure 1. Processed table: each master keeps a processed_list for each client. The processed_list tracks the processed rpc_ids (only those not yet ACKed) by storing the start and end of each consecutive sequence.
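
A processed_list of this shape can be maintained by merging each newly processed rpc_id into the neighbouring sequences and trimming the list whenever a new ack id arrives. The sketch below is one possible way to do that; the ProcessedList class and its methods are hypothetical names, not part of the design.

```python
class ProcessedList:
    """Hypothetical per-client list of processed, not yet ACKed rpc_id ranges."""

    def __init__(self):
        self.ranges = []  # sorted, disjoint, non-adjacent (start, end) pairs

    def add(self, rpc_id):
        """Record rpc_id as processed, merging it with adjacent sequences."""
        start = end = rpc_id
        kept = []
        for s, e in self.ranges:
            if e < start - 1 or s > end + 1:
                kept.append((s, e))  # neither overlapping nor adjacent
            else:
                start, end = min(s, start), max(e, end)  # absorb this range
        kept.append((start, end))
        kept.sort()
        self.ranges = kept

    def contains(self, rpc_id):
        """True if rpc_id has been processed and not yet acknowledged."""
        return any(s <= rpc_id <= e for s, e in self.ranges)

    def ack(self, ack_id):
        """Drop every rpc_id that is smaller than or equal to ack_id."""
        self.ranges = [(max(s, ack_id + 1), e)
                       for s, e in self.ranges if e > ack_id]


# Rebuild the row for client 2 in Figure 1.
plist = ProcessedList()
for rpc_id in [11, 12, 13, 14, 15, 17, 20, 21, 22, 23, 24, 25, 26]:
    plist.add(rpc_id)
# plist.ranges is now [(11, 15), (17, 17), (20, 26)]
```

Storing only the sequence boundaries keeps the table small when a client's RPCs complete mostly in order, since a long run of consecutive ids collapses into a single pair.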

 

...

cond_write

<TableID, KeyHash>
New Object
<Client_id, Rpc_id, Ack_id>

...

Tombstone

<table_id, keyHash>

<Client_id, rpc_id, ack_id>

...

Figure 2. Log structure with RPC id and ACK id.