Security

Background

From spring quarter '09 discussions:

Goals

RAMCloud is designed as a cluster service. As such, multi-tenancy is an important consideration. Multi-tenancy implies:

  • There may be mutually distrustful, even antagonistic, users of RAMCloud.
  • There may be mutually distrustful, even antagonistic, other clients or machines on the network.

We'll need some form of secure authentication, as well as access control, to constrain users of RAMCloud. We will also need to consider threats from other hosts.

Internal RC Security

  • Beyond maintaining the security of RC accounts and data, and securing transactions between clients and masters, we must keep backups and other services in mind. Users must not be able to forge segments to a backup server, request their recovery, or interact with cluster coordinators and other sensitive internal services.
    • Much of this can probably be achieved with network filtering.
    • For non-latency critical paths, encryption is probably workable.

Assumptions

  • The network can provide us sufficient isolation such that there is no:
    • Snooping - packets cannot be read by unintended hosts.
    • Spoofing - the network cannot be fooled into routing packets to the incorrect host (e.g. ARP spoofing).

If we don't trust the network, we'll likely need heavy crypto, which is contrary to our performance goals.

Authentication Proposal

Performance is paramount, so we must ensure that authentication and access control mechanisms are efficient (in time and space).

Two isomorphic proposals work as follows:

  1. Clients authenticate with an authorization service and are returned a secret of at least 64 bits.
    1. The secret is passed with each RPC. Masters verify secrets with the authorization service and cache the results in a hash table for speed.
  2. Clients authenticate with the authorization service but are returned an encrypted or signed token that identifies their principal. A master can verify this self-certifying token on its own and need not contact the authorization service.

The trade-offs between the two are the size of the secret or token and the cost of cryptography versus a hash-table lookup. Token verification may be very fast, since the data will be in cache and the new SSE AES instructions are available; the hash-table lookup will likely not miss the CPU cache, given the high temporal locality of accesses and the relatively small number of principals accessing each machine.
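As a rough illustration of the second proposal, a self-certifying token can be built as an HMAC over the principal's identity, so a master can verify it with one MAC computation and no callback to the authorization service. This is only a sketch: the key distribution, token layout, and function names below are all hypothetical.

```python
import hmac
import hashlib

# Hypothetical cluster-wide key; in practice the authorization service
# and masters would share this out of band.
CLUSTER_KEY = b"example-cluster-key"

def issue_token(principal_id: int) -> bytes:
    """Authorization service: sign the principal id so masters can
    verify it without contacting us."""
    payload = principal_id.to_bytes(8, "big")
    mac = hmac.new(CLUSTER_KEY, payload, hashlib.sha256).digest()
    return payload + mac

def verify_token(token: bytes) -> int:
    """Master: check the MAC and recover the principal id.
    Raises ValueError on a forged or corrupted token."""
    payload, mac = token[:8], token[8:]
    expected = hmac.new(CLUSTER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("forged or corrupted token")
    return int.from_bytes(payload, "big")
```

Proposal 1 would replace `verify_token` with a lookup of the 64-bit secret in a per-master hash table of already-verified principals, falling back to the authorization service on a miss.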

Protection Granularity

Two obvious choices:

  • Per-table
  • Per-object

Per-object protection seems like overkill and would potentially incur a great deal of space overhead. Per-table protection, however, is efficient and easy to implement.
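To see why per-table protection is cheap, note that a master needs only one small map keyed by table id, checked once per RPC; space grows with the number of tables, not objects. A minimal sketch (the names and all-or-nothing access model are assumptions, not a settled design):

```python
# Hypothetical per-table ACL: table id -> set of principal ids with
# full access to that table.
table_acl: dict[int, set[int]] = {}

def grant(table_id: int, principal_id: int) -> None:
    """Give a principal full access to a table."""
    table_acl.setdefault(table_id, set()).add(principal_id)

def check_access(table_id: int, principal_id: int) -> bool:
    """One O(1) lookup on the RPC fast path."""
    return principal_id in table_acl.get(table_id, set())
```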

Access Control

Types of Actions:

  • Within a Table (on objects):
    • Create object
    • Update object
    • Read object
    • Delete object
    • Create index
    • Modify index (open question)
    • Delete index
  • Within a greater scope (application scope - not currently defined in RC):
    • Create table
    • Delete table
    • Create user
    • Delete user
    • Delegate privileges
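If we ever wanted rights finer than all-or-nothing, the table-level actions above map naturally onto a permission bitmask, keeping the access check to a single AND. The flag names below are illustrative only:

```python
from enum import IntFlag

class TablePerm(IntFlag):
    # Object-level actions within a table
    CREATE_OBJECT = 1 << 0
    UPDATE_OBJECT = 1 << 1
    READ_OBJECT   = 1 << 2
    DELETE_OBJECT = 1 << 3
    # Index actions
    CREATE_INDEX  = 1 << 4
    MODIFY_INDEX  = 1 << 5
    DELETE_INDEX  = 1 << 6

def allowed(granted: TablePerm, requested: TablePerm) -> bool:
    """True iff every requested bit has been granted."""
    return (granted & requested) == requested
```

A read-only principal would hold just `TablePerm.READ_OBJECT`; a table owner would hold the union of all seven bits.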

If we choose a more complex access control structure, a question arises as to how users are organized. For example:

  • Do we want groups?
  • Can users be in multiple groups?
  • Can groups contain groups?
  • Do we want roles instead?
  • Should users only be able to assume one role at a time?

Conclusions

  • We were skeptical that any of the more complicated models proposed would be useful or appropriate for most users. A simple model, in which a single authenticated principal has full access to the set of tables it owns (i.e. like a full-access user for each SQL database), would go a long way.
  • For the time being, we will punt on access control and accounting and assume a trusted data center / single-user model.

Interesting bits

  • BigTable apparently
    • has no delegation
    • does authentication with a Google campus-wide mechanism
    • found read-access permissions to be more important than write permissions (secrecy valued over integrity)

Questions

  • What does Amazon S3 do?
  • Are our network security assumptions sound?