Data stores reference for Indexing

Ankita's quick guide (still a WIP) to key features of data stores, with a focus on indexing support. (If you find something inaccurate, or know of another data store that does well on most or all of these parameters, let me know!)

| Data Store (& Link) | Indexed | In-mem | Dist/Scalable (data, index) | Durable | Latency | Throughput | Consistency | API | Additional comments |
|---|---|---|---|---|---|---|---|---|---|
| RAMCloud | y | y | y, y | y [n] | ? | ? | Linearizable | key-val | |
| MySQL SE | y | n | y, ? (replication) | y | | | Transaction safe, ACID compliant | relational | |
| MySQL Cluster | y | n | y, ? (replication & partitioning) | y | | | | SQL and NoSQL | |
| Cassandra | y | cache | y, y* | y | | | BASE (Basically Available, Soft state, Eventual consistency) + can choose consistency level | partitioned rows (similar to SQL) + denormalization + materialized views | *: each node indexes data it holds locally |
| MongoDB | y | cache | | y | | | | document-oriented storage: JSON-style docs with dynamic schemas | |
| H-Store | ? (1) | y | y, ? | | | | | row-based relational | |
| VoltDB | ? (1) | y | y, ? | | | | ACID for transactions; unclear otherwise. | relational | Commercial H-Store |
| G-Store | n | | | | | | | | |
| LevelDB | n | | | | | | | key-val | |
| Spanner | n | | | | | | Externally consistent | | |
| F1 | y | n | y, ? | y | > MySQL | | Strong in general, consistent global indexes | relational + SQL | Uses Spanner. Google Ads previously ran on MySQL; F1 made their DB scalable. |
| PNUTS [paper] | kind-of* | n | y, - | y (2) | | | Relaxed | Basic relational | *: Optional secondary table, lazily maintained; keyed on index key |
| DynamoDB | y | n (ssd) | y, ? | y (2) | single-digit ms | | Strong consistency on reads | tables, no fixed schemas; each item can have a different number of attributes | |
| BigTable | n | | | | | | Strong | multi-dimensional map which supports basic operations | |
| Espresso | y | n | y, ? | y | | | Timeline-consistent | document-oriented NoSQL; has secondary index | Uses MySQL/InnoDB as storage engine. Also uses Lucene + Databus + Helix |
| Postgres | ? (1) | n | y, ? | y (2) | | | ACID compliant, MVCC | object-relational | |
| COPS | n | | | | | | Causal+ | key-val | |
| Eiger | n | | | | | | Causal | column-store | |
| HyperDex [paper] | y | n | y, ? | | | | Key ops are linearizable; Warp (commercial extension) has ACID transactions | key-val; rich datatypes | |
| HBase | n | | y, - | | | | Strictly consistent reads and writes | versioned, non-relational; has global and local indexes | Modeled after BigTable |
| Cloudera Impala | ? (1) | n | y, ? | | | | | SQL interface | Massively parallel processing arch for perf w/ Hadoop scalability |
| Redis | n | y | y, - | optional | | | | key-val; keys can contain strings, hashes, lists, sets, sorted sets | |
| CouchDB | y | | y, ? | y | | | Eventually consistent | JSON | |
| BigCouch | y | | | | | | | | Commercial CouchDB on steroids |
| Voldemort | n | cache | y | y | | | | hash table | |
| NuDB | | | | | | | | relational? | |
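
Several entries above distinguish local indexes (Cassandra: each node indexes only the data it holds) from global indexes (F1, HBase). The toy sketch below illustrates the difference in lookup cost under simplifying assumptions: integer primary keys, modulo partitioning, and a global index kept in a single dictionary rather than partitioned by index value as a real system would likely do. The names `Node`, `Cluster`, `query_local` and `query_global` are hypothetical and do not correspond to the API of any store listed here.

```python
from collections import defaultdict


class Node:
    """One partition: its rows plus a local index over those rows only."""

    def __init__(self):
        self.rows = {}                       # primary key -> row dict
        self.local_index = defaultdict(set)  # indexed value -> keys on this node


class Cluster:
    """Toy partitioned store that maintains both index styles side by side."""

    def __init__(self, num_nodes, indexed_field):
        self.nodes = [Node() for _ in range(num_nodes)]
        self.indexed_field = indexed_field
        # Assumption: one cluster-wide map stands in for a global index.
        self.global_index = defaultdict(set)  # indexed value -> keys anywhere

    def _owner_id(self, key):
        # Assumption: integer keys, modulo partitioning.
        return key % len(self.nodes)

    def put(self, key, row):
        node = self.nodes[self._owner_id(key)]
        old = node.rows.get(key)
        if old is not None:
            # Remove stale index entries before overwriting the row.
            old_value = old[self.indexed_field]
            node.local_index[old_value].discard(key)
            self.global_index[old_value].discard(key)
        node.rows[key] = row
        value = row[self.indexed_field]
        node.local_index[value].add(key)
        self.global_index[value].add(key)

    def query_local(self, value):
        """Scatter-gather: every node must be asked, whether or not it matches."""
        keys = set()
        for node in self.nodes:
            keys |= node.local_index.get(value, set())
        return keys, len(self.nodes)

    def query_global(self, value):
        """One index lookup, then only the nodes that own matching rows."""
        keys = set(self.global_index.get(value, set()))
        owners = {self._owner_id(k) for k in keys}
        return keys, len(owners)


cluster = Cluster(num_nodes=4, indexed_field="city")
cluster.put(1, {"city": "Paris"})
cluster.put(2, {"city": "Oslo"})
cluster.put(5, {"city": "Paris"})
print(cluster.query_local("Paris"))   # (matching keys, nodes contacted)
print(cluster.query_global("Paris"))  # (matching keys, data nodes contacted)
```

The trade-off this is meant to show: a local index keeps writes confined to one node but makes every index query touch all nodes, while a global index narrows reads to the relevant nodes at the cost of a cross-node update on every write to the indexed field.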

Footnotes:

(1): Probably yes, since they claim to be relational / SQL.

(2): Probably yes, since it is not in-memory.

Sources:

Websites/papers linked in the first column, plus official blogs/wikis.