Redis vs. RAMCloud
Redis is a key-value store that keeps all data in memory, so in many ways it looks pretty similar to RAMCloud. This page compares RAMCloud to Redis.
Data Model
Redis has a richer data model than RAMCloud. In addition to the simple get/set semantics provided by RAMCloud, Redis also provides the following operations (a few of them are sketched in code after the list):
- Atomic increment (RAMCloud provides this as well).
- Treat values as sets: add to set, remove from set, union, etc.
- Treat values as sorted sets: each element in a set has a "score", which can be used to order the elements.
- Treat values as lists: push, pop, index, range, etc.
- Treat values as bit strings (e.g. count "on" bits).
- Treat values as hash tables.
- Transactions involving multiple operations.
- Publish-subscribe.
- Expiration times for objects.
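A minimal sketch of a few of these operations, assuming the Python redis-py client and a local server on the default port (the client library, host, and key names are illustrative assumptions, not part of the comparison above):

```python
import redis

# Assumes a local Redis server on the default port, accessed via redis-py.
r = redis.Redis(host="localhost", port=6379)

r.set("user:42:name", "alice")                     # simple get/set (RAMCloud has an equivalent)
r.incr("page:home:hits")                           # atomic increment
r.sadd("tags:article:7", "storage", "dram")        # value treated as a set
r.zadd("leaderboard", {"alice": 120, "bob": 95})   # sorted set: elements ordered by score
r.lpush("queue:jobs", "job-1")                     # value treated as a list
r.hset("user:42", mapping={"name": "alice", "role": "admin"})  # value treated as a hash table
r.setbit("seen:2024-01-01", 1042, 1)               # value treated as a bit string
print(r.bitcount("seen:2024-01-01"))               # count "on" bits
r.expire("session:abc", 3600)                      # expiration time, in seconds

# A multi-operation transaction (MULTI/EXEC), exposed in redis-py as a pipeline.
with r.pipeline(transaction=True) as pipe:
    pipe.incr("page:home:hits")
    pipe.lpush("queue:jobs", "job-2")
    pipe.execute()
```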
Performance
Redis appears to have throughput of 100K-1M operations/second, which is in the same ballpark as a single RAMCloud server. Read latency appears to be in the range of 200 microseconds (using 1Gb Ethernet). RAMCloud's latency is much lower (5 microseconds for reads, 15 microseconds for writes), but only with faster networking technologies; with 1Gb Ethernet, RAMCloud is still faster than Redis, but by less than 2x.
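As a rough illustration of how such latency figures are obtained, the sketch below times sequential GET round trips from a Python client; the library, host, and key are assumptions, and the numbers quoted above come from other measurements, not from this snippet.

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)
r.set("probe", "x")

# Time a batch of sequential GETs and report the average round-trip latency.
N = 10000
start = time.perf_counter()
for _ in range(N):
    r.get("probe")
elapsed = time.perf_counter() - start
print(f"avg GET latency: {elapsed / N * 1e6:.1f} microseconds")
```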
Scalability
Redis appears to have been optimized for single-server deployments. If the system scales beyond a single server, then data must be partitioned among the servers:
- Both range partitioning and hash partitioning are supported.
- Partitioning can be handled by having clients choose the correct server (sketched after this list), by forwarding all requests to a proxy that retransmits them to the appropriate server (adding latency), or by sending requests to a random server, which then redirects them to the right server (again with a latency penalty).
- Several Redis operations, such as multi-object transactions, do not work once data is partitioned.
- Overall, it looks like scaling introduces problems both with performance and with the data model.
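A minimal sketch of the first approach (clients pick the server by hashing the key), assuming a fixed list of Redis servers known to every client; the server addresses and key names are hypothetical:

```python
import hashlib
import redis

# Hypothetical fixed set of Redis servers. A real client would also have to
# handle servers being added, removed, or failing.
SERVERS = [
    redis.Redis(host="redis-0.example.com", port=6379),
    redis.Redis(host="redis-1.example.com", port=6379),
    redis.Redis(host="redis-2.example.com", port=6379),
]

def server_for(key: str) -> redis.Redis:
    """Pick a server by hashing the key (simple modulo hash partitioning)."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]

def put(key: str, value: str) -> None:
    server_for(key).set(key, value)

def get(key: str):
    return server_for(key).get(key)

# Note: operations that span keys on different servers (e.g. a MULTI/EXEC
# transaction over two such keys) cannot be expressed this way, which is the
# data-model limitation mentioned above.
```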
RAMCloud was designed and optimized for scale (hundreds or thousands of servers). All of RAMCloud's (admittedly simpler) operations work at scale, and performance is not impacted. Scaling in RAMCloud is transparent: clients do not need to be aware of the size of the system. In some ways, RAMCloud gets better as it gets larger: for example, crash recovery from a failed server gets faster as the cluster size increases.
Durability
Redis' durability model is relatively weak:
- It offers both snapshotting and operation logging, but in most cases data will be lost after a crash: even with logging enabled, the log typically is not flushed synchronously on each operation, so recent writes can disappear. The only way to eliminate data loss is to use logging with the most conservative flushing mode (sync to disk before responding to the client, as sketched after this list); this introduces a large performance penalty, and it sounds like it is not normally used.
- Redis offers master-slave replication, but it isn't synchronous, so data can be lost during crashes. It's unclear whether there is automatic failover after a master crash.
- Scaling introduces administrative hassles: each server's backup data must be managed separately.
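As a rough sketch of where these flushing modes are controlled, the snippet below uses redis-py to issue CONFIG SET commands at runtime (in a real deployment these settings would normally live in redis.conf); the host and port are assumptions:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Enable the append-only operation log (AOF).
r.config_set("appendonly", "yes")

# Usual mode when AOF is enabled: flush the log roughly once per second.
# A crash can lose up to about a second of acknowledged writes.
r.config_set("appendfsync", "everysec")

# Most conservative mode: fsync before replying to the client. This closes
# the data-loss window but carries the large performance penalty noted above.
r.config_set("appendfsync", "always")
```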
RAMCloud provides better durability:
- Automatic replication, crash recovery, and failover (no loss of availability if a server fails).
- Stronger durability guarantee: data is always replicated and durable before operations return, without a significant performance hit (subject to the requirement for persistent buffers on backups).
- Scaling introduces no additional administrative issues.
Consistency
Redis does not appear to provide linearizable semantics for its operations: the behavior of operations may change in the face of server crashes and restarts.
RAMCloud is designed for full linearizability: crashes and restarts do not affect client-visible behavior.
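To make the difference concrete, the sketch below shows the classic failure mode for a system without linearizable (exactly-once) semantics: a client that retries an increment after a lost reply cannot tell whether the first attempt was applied, so the operation can take effect twice. The in-memory "server" here is purely illustrative; it is neither Redis nor RAMCloud code.

```python
import random

class ToyServer:
    """Toy key-value server used only to illustrate retry ambiguity."""
    def __init__(self):
        self.data = {"counter": 0}

    def incr(self, key):
        self.data[key] += 1
        # Simulate the reply being lost (e.g. a crash just after applying the op).
        if random.random() < 0.5:
            raise TimeoutError("reply lost")

def client_incr_with_retry(server, key):
    # Without exactly-once semantics, the client's only option after a timeout
    # is to retry -- but the first attempt may already have been applied.
    while True:
        try:
            server.incr(key)
            return
        except TimeoutError:
            continue

random.seed(1)
s = ToyServer()
client_incr_with_retry(s, "counter")
print(s.data["counter"])  # can print 2 (or more) for a single logical increment
```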