Data model
Data Models
A list of data models used by various web-storage systems.
Block-based storage
Example: SAN LUNs (linear arrays of fixed-size blocks)
Name space: (array#, LUN, block number, snapshot)
Operations: Read Block, Write Block
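The block model above can be sketched as a map from the four-part address to a fixed-size byte string. This is a toy in-memory illustration only: `BlockStore`, `BlockAddress`, and the 512-byte `BLOCK_SIZE` are assumptions made for the example, not taken from any particular SAN.

```python
from typing import Dict, NamedTuple

BLOCK_SIZE = 512  # assumed block size in bytes; real devices vary


class BlockAddress(NamedTuple):
    """Name space from the notes: (array#, LUN, block number, snapshot)."""
    array: int
    lun: int
    block: int
    snapshot: int


class BlockStore:
    """Toy in-memory block device; unwritten blocks read back as zeros."""

    def __init__(self) -> None:
        self._blocks: Dict[BlockAddress, bytes] = {}

    def write_block(self, addr: BlockAddress, data: bytes) -> None:
        # Blocks are fixed size, so partial writes are rejected.
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block must be exactly BLOCK_SIZE bytes")
        self._blocks[addr] = data

    def read_block(self, addr: BlockAddress) -> bytes:
        return self._blocks.get(addr, bytes(BLOCK_SIZE))


store = BlockStore()
addr = BlockAddress(array=0, lun=3, block=42, snapshot=0)
store.write_block(addr, b"\x01" * BLOCK_SIZE)
```

The only operations are whole-block reads and writes; anything richer (files, records) has to be layered on top by the client.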
Blob Stores
Example: Amazon's S3
Stores blobs of data (0 to 5 GB in size)
Name Space: (Bucket, Key)
Operations: Get/Put/Delete objects (entire-object update only)
Memcache
Data: Blocks identified by a key
Memcache Operations: (set, add, replace, append, prepend, cas, get, gets, delete, incr, decr)
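Of the operations listed, `gets` and `cas` are the least self-explanatory: `gets` returns a value together with a version token, and `cas` stores a new value only if the token is still current. The sketch below imitates that behaviour in memory. `ToyCache` is a hypothetical class written for this example; it is not the memcached server or any client library, and it covers only a few of the listed operations.

```python
from typing import Dict, Optional, Tuple


class ToyCache:
    """In-memory imitation of a few memcache-style operations."""

    def __init__(self) -> None:
        # key -> (value, version); the version changes on every store
        self._data: Dict[str, Tuple[bytes, int]] = {}
        self._next_version = 1

    def _store(self, key: str, value: bytes) -> None:
        self._data[key] = (value, self._next_version)
        self._next_version += 1

    def set(self, key: str, value: bytes) -> bool:
        self._store(key, value)
        return True

    def add(self, key: str, value: bytes) -> bool:
        # Succeeds only if the key is absent.
        if key in self._data:
            return False
        self._store(key, value)
        return True

    def gets(self, key: str) -> Optional[Tuple[bytes, int]]:
        # Returns (value, version token), or None if the key is missing.
        return self._data.get(key)

    def cas(self, key: str, value: bytes, version: int) -> bool:
        # Store only if nobody has modified the key since `version` was read.
        current = self._data.get(key)
        if current is None or current[1] != version:
            return False
        self._store(key, value)
        return True


cache = ToyCache()
cache.set("counter", b"1")
_, version = cache.gets("counter")
cache.set("counter", b"5")                      # a second client writes first
stale = cache.cas("counter", b"2", version)     # rejected: token is out of date
_, version = cache.gets("counter")
fresh = cache.cas("counter", b"6", version)     # accepted: token is current
```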
Blobs with attributes store
Example: SimpleDB
Blobs with attribute-value pairs
GET, PUT or DELETE items in your domain, along with the attribute-value pairs
Query objects using lexicographical comparisons on their attribute values
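Lexicographical comparison has a practical consequence worth seeing: if every attribute value is compared as a string, numbers need zero-padding to sort correctly. The sketch below is a plain-Python illustration under that assumption; `query_greater_than`, `pad`, and the sample items are invented for the example and are not part of any SimpleDB API.

```python
from typing import Dict, List

Item = Dict[str, str]  # attribute name -> string value

domain: Dict[str, Item] = {
    "item1": {"name": "widget", "price": "9"},
    "item2": {"name": "gadget", "price": "10"},
    "item3": {"name": "gizmo", "price": "25"},
}


def query_greater_than(items: Dict[str, Item], attribute: str, bound: str) -> List[str]:
    """Return names of items whose attribute compares greater than bound, as strings."""
    return sorted(
        name
        for name, attrs in items.items()
        if attribute in attrs and attrs[attribute] > bound
    )


def pad(number: int, width: int = 5) -> str:
    """Zero-pad so that string order matches numeric order."""
    return str(number).zfill(width)


# Unpadded: "10" and "25" both sort before "9", so nothing matches.
naive = query_greater_than(domain, "price", "9")

# Padded: "00010" and "00025" sort after "00009", as intended.
padded_domain = {
    name: {**attrs, "price": pad(int(attrs["price"]))}
    for name, attrs in domain.items()
}
padded = query_greater_than(padded_domain, "price", pad(9))
```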
Bigtable
Sparse Multidimensional sorted map
(row:string, column:string, time:int64) -> string
Column/Row database tradeoff
Row keys are unique and data is kept sorted by them, which gives locality; a cell is addressed by (row, column family, column qualifier, timestamp).
Different rows may have different numbers of columns.
Hybrid column/row-oriented storage (user-specified locality)
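The "sparse multidimensional sorted map" description can be made concrete with a small in-memory sketch. `SparseSortedMap` and its methods are hypothetical names for this example, not Bigtable's API; the row and column names are illustrative. Row keys are kept sorted so a range scan returns neighbouring rows, and each cell keeps multiple timestamped versions.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class SparseSortedMap:
    """Toy model of (row:string, column:string, time:int64) -> string."""

    def __init__(self) -> None:
        # row -> column -> list of (time, value), sorted by time
        self._rows: Dict[str, Dict[str, List[Tuple[int, str]]]] = {}
        self._row_keys: List[str] = []  # kept sorted for range scans

    def put(self, row: str, column: str, time: int, value: str) -> None:
        if row not in self._rows:
            bisect.insort(self._row_keys, row)
            self._rows[row] = {}
        versions = self._rows[row].setdefault(column, [])
        bisect.insort(versions, (time, value))

    def get(self, row: str, column: str) -> Optional[str]:
        """Return the newest version of a cell, or None if the cell is absent."""
        versions = self._rows.get(row, {}).get(column)
        return versions[-1][1] if versions else None

    def scan(self, start_row: str, end_row: str) -> List[str]:
        """Return row keys in [start_row, end_row) in sorted order."""
        lo = bisect.bisect_left(self._row_keys, start_row)
        hi = bisect.bisect_left(self._row_keys, end_row)
        return self._row_keys[lo:hi]


table = SparseSortedMap()
table.put("com.cnn.www", "contents:", 1, "<html>v1")
table.put("com.cnn.www", "contents:", 2, "<html>v2")
table.put("com.cnn.www", "anchor:cnnsi.com", 1, "CNN")
table.put("org.example", "contents:", 1, "<html>")
table.put("com.cnn.money", "contents:", 1, "<html>")

latest = table.get("com.cnn.www", "contents:")
missing = table.get("org.example", "anchor:cnnsi.com")  # sparse: cell never written
cnn_rows = table.scan("com.cnn.", "com.cnn/")            # "/" sorts just after "."
```

Because rows are ordered by key, choosing keys such as reversed domain names places related rows next to each other, which is one way to read "user-specified locality".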
Document-oriented database
Schema-free. Example: CouchDB
JSON objects
Data types: All-or-nothing update of documents
Views: JavaScript (the map of MapReduce); add structure back
Queryable and indexable, featuring a table-oriented reporting engine that uses JavaScript as a query language.
MongoDB (JSON/BSON, with indexes, nested URL structure)
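The idea that views "add structure back" to schema-free documents can be sketched as a map function run over every document, with the emitted rows kept sorted by key. CouchDB views are written in JavaScript; the Python below is only an analogy, and `build_view`, `posts_by_author`, and the sample documents are hypothetical.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

Document = Dict[str, Any]
MapFn = Callable[[Document], Iterator[Tuple[str, Any]]]

# Schema-free: documents need not share the same fields.
docs: Dict[str, Document] = {
    "a": {"type": "post", "author": "alice", "title": "Hello"},
    "b": {"type": "comment", "author": "bob"},
    "c": {"type": "post", "author": "bob", "title": "Views", "tags": ["db"]},
}


def posts_by_author(doc: Document) -> Iterator[Tuple[str, Any]]:
    """Map step: emit (key, value) pairs only for documents that qualify."""
    if doc.get("type") == "post":
        yield doc["author"], doc["title"]


def build_view(documents: Dict[str, Document], map_fn: MapFn) -> List[Tuple[str, str, Any]]:
    """Run map_fn over every document and return rows sorted by (key, doc id)."""
    rows = [
        (key, doc_id, value)
        for doc_id, doc in documents.items()
        for key, value in map_fn(doc)
    ]
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


view = build_view(docs, posts_by_author)
bob_posts = [value for key, _, value in view if key == "bob"]
```

The resulting view behaves like a table indexed by author, even though the underlying documents have no fixed schema.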
SQL data services
Relational data model
Simplified relational data (SELECT, minus the hard parts)
Object-Oriented Database
Structure: Arbitrary graph
Message queues
File Servers
Unix or Windows file system data models
Streaming Video Servers
- Traditional relational model?
- Relational databases tend to fragment data into lots of small pieces. For example, consider an order with order items; each order item will be a separate record in a table.
- In a distributed system like RAMCloud, each fragment is likely to end up on a different server, resulting in lots of requests to collect an interesting amount of data.
- The distribution also exacerbates consistency issues during updates.
- Opaque variable-length blobs, like memcached?
- Hierarchical hashes (JSON, Fiz datasets)?
- In this model an entire order, including the main order and its items, would be a single object stored on a single server.
- Does it make sense to support multiple tables, or is this a flat store that simply maps ids to objects?
- Should RAMCloud be designed for small objects only? Any upper limit on size?
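The order example from the list above can be sketched to show why the relational layout costs more requests in a distributed store. Everything here is an assumption for illustration: the key naming, the eight-server cluster, and the CRC-based placement in `server_for` are invented and do not describe how RAMCloud places data.

```python
import zlib
from typing import Any, Dict, List, Set

NUM_SERVERS = 8  # assumed cluster size


def server_for(key: str) -> int:
    """Hypothetical placement: hash the record key onto a server."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SERVERS


# Relational layout: the order and each order item are separate records.
relational_records: Dict[str, Dict[str, Any]] = {
    "orders/1001": {"customer": "alice"},
    "order_items/1001/1": {"sku": "A-17", "qty": 2},
    "order_items/1001/2": {"sku": "B-04", "qty": 1},
    "order_items/1001/3": {"sku": "C-99", "qty": 5},
}

# Hierarchical layout: the whole order is one object under one key.
hierarchical_records: Dict[str, Dict[str, Any]] = {
    "orders/1001": {
        "customer": "alice",
        "items": [
            {"sku": "A-17", "qty": 2},
            {"sku": "B-04", "qty": 1},
            {"sku": "C-99", "qty": 5},
        ],
    },
}


def fetch_cost(keys: List[str]) -> Dict[str, Any]:
    """Count the requests and distinct servers needed to read all keys."""
    servers: Set[int] = {server_for(key) for key in keys}
    return {"requests": len(keys), "servers": len(servers)}


relational_cost = fetch_cost(list(relational_records))
hierarchical_cost = fetch_cost(list(hierarchical_records))
```

Reading the full order takes one request per record in the relational layout and a single request in the hierarchical one; an update to the order likewise touches one object on one server rather than several records that may live on different servers.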