Online schema changes

  • One of the problems with databases is schema management. Changing a database schema is highly disruptive; the usual procedure is:
    • Bring the system down.
    • Make a collection of changes to the schema.
    • Bring the system back up again.
    • Discover you made a mistake, at which point your system is unusable until you can fix it.
  • For RAMCloud, the system must be able to evolve with no downtime.
  • A fixed schema probably doesn't make sense.
  • One alternative: make the individual data items self-identifying enough that the schema can evolve gracefully:
    • Perhaps there is no schema: individual records are variable-length with variable columns and completely self-identifying.
    • Store a version number in each record?
  • What is the state of the art for schema changes in relational databases today? Can they be done without bringing the system down at all?
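
The self-identifying record idea above (named fields plus a per-record version number) could be sketched roughly as follows. This is only an illustration under stated assumptions, not a RAMCloud design: the JSON encoding, the names `encode`, `decode`, `UPGRADES`, and `CURRENT_VERSION`, and the example change (splitting a `name` field) are all hypothetical. The point is that old and new records can coexist in storage, and old ones are upgraded lazily when read, so no downtime is needed to change the layout.

```python
import json

# Hypothetical: the record layout version that current code writes.
CURRENT_VERSION = 2


def _v1_to_v2(fields):
    # Hypothetical schema change: split "name" into "first" and "last".
    first, _, last = fields.pop("name", "").partition(" ")
    fields["first"] = first
    fields["last"] = last
    return fields


# Maps a version number to the function that upgrades it by one step.
UPGRADES = {1: _v1_to_v2}


def encode(fields, version=CURRENT_VERSION):
    """Serialize a record that carries its own field names and version."""
    return json.dumps({"v": version, "fields": fields}).encode("utf-8")


def decode(blob):
    """Parse a record and upgrade it step by step to CURRENT_VERSION."""
    record = json.loads(blob.decode("utf-8"))
    version, fields = record["v"], record["fields"]
    while version < CURRENT_VERSION:
        fields = UPGRADES[version](fields)
        version += 1
    return fields


# Records written before and after the change live side by side.
old = encode({"name": "Ada Lovelace"}, version=1)
new = encode({"first": "Alan", "last": "Turing"})
```

Reading `old` through `decode` yields the same field layout as `new`, even though the stored bytes were never rewritten. The cost of this approach is that every record repeats its field names, and that the upgrade functions must be kept for as long as any old-version record might still exist.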