I attended the “Encrypting MySQL data at Google” session with Jeremy Cole and Jonas Oreland, both from Google. They started with the why of encrypting data. The threats they considered were:
- Access through network APIs (mysql client …): not protecting against this
- Access from within a running server (ptrace, memory dumping, etc.): not protecting against this
- Lost or misplaced disks: ARE PROTECTING
- Backups: ARE PROTECTING
Not all threats are feasible to protect against. An attacker with unlimited network access and time will be able to own you, and if they can get root they can own you. You could encrypt data in the columns from the application, but that is a lot of work and you can no longer access your data via SQL. It is also incompatible with 3rd party products. You can purchase middleware products that provide column encryption (MyDiamo and CryptDB were given as examples). Indexing becomes an issue with these approaches. You could just encrypt all the disks, but once a disk is mounted the data on it is readable unencrypted.
In their approach they wanted to encrypt all user data, including temporary tables and InnoDB data and log files. They wanted to make drive-by data exfiltration impossible: an attacker needs both the data and the keys to decrypt it. InnoDB is organized in 16K pages, and they encrypt all pages except for page 0. They do not encrypt page headers or page trailers. They also avoid using the same key to encrypt multiple similar pieces of data, so that nobody can make guesses based on known attributes of InnoDB pages (e.g. lots of zeros in the middle of a page due to initialization). They encrypt all log blocks but the first 4. Each block is encrypted with a different key.
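To make the page layout idea concrete, here is a minimal sketch of encrypting only the body of a page while leaving the header and trailer readable. It is not the code from the talk. The 38-byte header and 8-byte trailer are the sizes from stock InnoDB; the per-page key derivation (`derive_page_key`) and the HMAC-based keystream standing in for a real cipher such as AES are my own assumptions, chosen so the example runs with only the Python standard library.

```python
import hashlib
import hmac

PAGE_SIZE = 16 * 1024   # InnoDB default page size
HEADER_SIZE = 38        # FIL page header length in stock InnoDB
TRAILER_SIZE = 8        # FIL page trailer length in stock InnoDB


def derive_page_key(master_key: bytes, space_id: int, page_no: int) -> bytes:
    """Hypothetical per-page key: HMAC of the page identity under a master key."""
    label = space_id.to_bytes(4, "big") + page_no.to_bytes(4, "big")
    return hmac.new(master_key, label, hashlib.sha256).digest()


def keystream(key: bytes, length: int) -> bytes:
    """Stand-in stream cipher (assumption): HMAC-SHA256 in counter mode, not AES."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def crypt_page(master_key: bytes, space_id: int, page_no: int, page: bytes) -> bytes:
    """Encrypt or decrypt one page body; XOR makes the operation its own inverse."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected a full 16K page")
    if page_no == 0:
        return page  # page 0 is left unencrypted
    body_end = PAGE_SIZE - TRAILER_SIZE
    body = page[HEADER_SIZE:body_end]
    stream = keystream(derive_page_key(master_key, space_id, page_no), len(body))
    new_body = bytes(a ^ b for a, b in zip(body, stream))
    # Header and trailer are copied through untouched.
    return page[:HEADER_SIZE] + new_body + page[body_end:]
```

Because each page gets its own derived key, two freshly initialized pages full of zeros come out as different ciphertext, which is the property the speakers were after when they talked about not reusing a key for similar data.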
They have a Google key management infrastructure, so you will need to roll your own as theirs is not open source. Keys are never stored on disk; they are only used in memory. They did, however, publish a key management plugin interface to allow someone else to write an open source solution. They rotate keys on the redo logs by writing dummy records to age out blocks as needed. Temporary tables should age themselves out within MySQL. Binary and relay logs are encrypted with the latest key and similarly age out. For InnoDB data they keep the key version in the page header, and they have background threads that re-encrypt pages as needed to rotate keys. The number of threads and how many IOPS should be used for rotation are both configurable.
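The rotation scheme for InnoDB data can be sketched roughly as follows. This is only an illustration of the idea, not Google's implementation: the `Page` class, the in-memory `keys` dictionary (key version to key), the `max_pages` budget standing in for the configurable IOPS limit, and the HMAC-based `xor_stream` cipher are all hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class Page:
    page_no: int
    key_version: int  # kept unencrypted, as the page header field would be
    body: bytes       # encrypted payload


def xor_stream(key: bytes, page_no: int, data: bytes) -> bytes:
    """Stand-in cipher (assumption): XOR with an HMAC-SHA256 counter-mode stream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = page_no.to_bytes(4, "big") + counter.to_bytes(8, "big")
        stream += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def rotate_pages(pages: list, keys: dict, max_pages: int) -> int:
    """Re-encrypt up to max_pages stale pages; return how many were rewritten."""
    latest = max(keys)
    rewritten = 0
    for page in pages:
        if rewritten >= max_pages:
            break  # budget for this pass is used up
        if page.key_version == latest:
            continue  # already on the newest key
        plain = xor_stream(keys[page.key_version], page.page_no, page.body)
        page.body = xor_stream(keys[latest], page.page_no, plain)
        page.key_version = latest
        rewritten += 1
    return rewritten
```

A background thread would call something like `rotate_pages` repeatedly until it returns 0. Storing the key version next to the page is what lets old and new keys coexist while the rotation is still in progress.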
The encryption works with replication as well. The code is available at code.google.com/p/google-mysql.