On Tue, Dec 15, 2015 at 3:23 PM, Lars Marowsky-Bree <lmb@xxxxxxxx> wrote:
> On 2015-12-14T14:17:08, Radoslaw Zarzynski <rzarzynski@xxxxxxxxxxxx> wrote:
>
> Hi all,
>
> great to see this revived.
>
> However, I have come to see some concerns with handling the encryption
> within Ceph itself.
>
> The key part to any such approach is formulating the threat scenario.
> For the use cases we have seen, the data-at-rest encryption matters so
> they can confidently throw away disks without leaking data. It's not
> meant as a defense against an online attacker. There usually is no
> problem with "a few" disks being privileged, or one or two nodes that
> need an admin intervention for booting (to enter some master encryption
> key somehow, somewhere).
>
> However, that requires *all* data on the OSDs to be encrypted.
>
> Crucially, that includes not just the file system meta data (so not
> just the data), but also the root and especially the swap partition.
> Those potentially include swapped out data, coredumps, logs, etc.
>
> (As an optional feature, it'd be cool if an OSD could be moved to a
> different chassis and continue operating there, to speed up recovery.
> Another optional feature would be to eventually be able, for those
> customers that trust them ;-), supply the key to the on-disk encryption
> (OPAL et al).)
>
> The proposal that Joshua posted a while ago essentially remained based
> on dm-crypt, but put in simple hooks to retrieve the keys from some
> "secured" server via sftp/ftps instead of loading them from the root
> fs. Similar to deo, that ties the key to being on the network and
> knowing the OSD UUID.
>
> This would then also be somewhat easily extensible to utilize the same
> key management server via initrd/dracut.
>
> Yes, this means that each OSD disk is separately encrypted, but given
> modern CPUs, this is less of a problem. It does have the benefit of
> being completely transparent to Ceph, and actually covering the whole
> node.

Agreed: if encryption were infinitely fast, dm-crypt would be the best
solution. Below is a short analysis of the encryption burden of dm-crypt
versus OSD-level encryption when using replicated pools.

Summary: OSD-level encryption requires about 2.6 times fewer crypto
operations than dm-crypt. Crypto operations are the bottleneck. Possible
solutions:
- perform fewer crypto operations (OSD-based encryption can help)
- take crypto operations off the CPU (hardware accelerators; not all of
  them are integrated with the kernel crypto framework)

Calculations and explanations:

A) DM-CRYPT

When we use dm-crypt, all data and metadata are encrypted. In a typical
deployment the journal is located on a different disk, but it is also
encrypted. On write, the data path is:
1) encrypt when writing to the journal
2) decrypt when reading the journal back
3) encrypt when writing to storage
So 2-3 crypto operations are performed for each byte (step 2 can be
skipped if the kernel page allocated in step 1 has not been evicted);
let's assume 2.5. On read, the data path is:
4) decrypt when reading from storage
The balance between reads and writes depends on the deployment. Assuming
75% reads, 25% writes and a replication factor of 3 (each replica
encrypts independently), this gives us
1 * 0.75 + 2.5 * 0.25 * 3 = 2.625 bytes of crypto operations per byte of
disk I/O.

B) CRYPTO INSIDE OSD

When encryption is done inside the OSD, fewer bytes are encrypted
(dm-crypt has to encrypt entire disk sectors); still, let's round it up
to 1. A read requires 1 byte of crypto operations per byte (decrypting
the data sent to the client). A write requires 1 byte of crypto
operations per byte (encrypting the data received from the client); the
replicas receive ciphertext and perform no crypto of their own. This
gives us 1 * 0.75 + 1 * 0.25 = 1 byte of crypto operations per byte of
disk I/O.
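To make the arithmetic easy to replay, here is a small Python sketch
computing the per-byte crypto load of both schemes and the per-core
throughput ceilings used in section C below. The 75/25 read/write mix,
replication factor 3, the 2.5 ops/byte journal estimate and the 600 MB/s
AES-NI figure are the assumptions stated in this mail, not measurements:

#!/usr/bin/env python
# Back-of-the-envelope check of the numbers in sections A-C.
READ_FRACTION = 0.75
WRITE_FRACTION = 0.25
REPLICATION = 3
AES_NI_MB_PER_CORE = 600.0  # Haswell, per the Intel case study [1]

# A) dm-crypt: every replica encrypts; a write costs ~2.5 ops/byte
#    (journal encrypt + occasional journal decrypt + storage encrypt),
#    a read costs 1 op/byte (storage decrypt).
dm_crypt = 1.0 * READ_FRACTION + 2.5 * WRITE_FRACTION * REPLICATION

# B) crypto inside the OSD: encrypt/decrypt once on the primary only;
#    replicas move ciphertext untouched.
osd_level = 1.0 * READ_FRACTION + 1.0 * WRITE_FRACTION

print("dm-crypt:  %.3f crypto bytes per I/O byte" % dm_crypt)   # 2.625
print("OSD-level: %.3f crypto bytes per I/O byte" % osd_level)  # 1.000
print("ratio:     %.1fx" % (dm_crypt / osd_level))              # 2.6x

# C) per-core throughput ceilings implied by the AES-NI figure:
print("dm-crypt:  ~%.0f MB/s/core" % (AES_NI_MB_PER_CORE / dm_crypt))   # ~229
print("OSD-level: ~%.0f MB/s/core" % (AES_NI_MB_PER_CORE / osd_level))  # 600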
C) OSD I/O PERFORMANCE CALCULATION

Let's assume an encryption speed of 600 MB/s per CPU core (using AES-NI
on Haswell [1]). This gives us 600 / 2.625 = ~229 MB/s per core for
dm-crypt and 600 MB/s for OSD-located crypto. Storage nodes usually have
a few disks per CPU core; let's say 6:
6 x HDD = ~600 MB/s
6 x SSD = ~6000 MB/s
It is clear that crypto is the limiting factor for speed.

[1] https://software.intel.com/en-us/articles/intel-aes-ni-performance-enhancements-hytrust-datacontrol-case-study

> Of course, one of the key issues is always the key server.
> Putting/retrieving/deleting keys is reasonably simple, but the question
> of how to ensure HA for it is a bit tricky. But doable; people have
> been building HA ftp/http servers for a while ;-) Also, a single key
> server setup could theoretically serve multiple Ceph clusters.
>
> It's not yet perfect, but I think the approach is superior to being
> implemented in Ceph natively. If there's any encryption that should be
> implemented in Ceph, I believe it'd be the on-the-wire encryption to
> protect against eavesdroppers.
>
> Other scenarios would require client-side encryption.
>
>> Current data at rest encryption is achieved through dm-crypt placed
>> under OSD's filestore. This solution is a generic one and cannot
>> leverage Ceph-specific characteristics. The best example is that
>> encryption is done multiple times - one time for each replica. Another
>> issue is lack of granularity - either OSD encrypts nothing, or OSD
>> encrypts everything (with dm-crypt on).
>
> True. But for the threat scenario, a holistic approach to encryption
> seems actually required.
>
>> Cryptographic keys are stored on the filesystem of the storage node
>> that hosts the OSDs. Changing them requires redeploying the OSDs.
>
> This is solvable by storing the key on an external key server.
>
> Changing the key is only necessary if the key has been exposed. And
> with dm-crypt, that's still possible - it's not the actual encryption
> key that's stored, but the secret that is needed to unlock it, and
> that can be re-encrypted quite fast. (In theory; it's not implemented
> yet for the Ceph OSDs.)
>
>> Data incoming from Ceph clients would be encrypted by the primary
>> OSD. It would replicate ciphertext to the non-primary members of an
>> acting set.
>
> This still exposes data in coredumps or on swap on the primary OSD,
> and metadata on the secondaries.
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
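To make the key-retrieval hooks discussed above a bit more concrete,
here is a rough sketch of what fetching a per-OSD dm-crypt key by UUID
over ftps could look like. The server name, path layout, anonymous
login and the cryptsetup invocation are hypothetical illustrations of
the idea, not Joshua's actual proposal:

#!/usr/bin/env python
# Sketch: retrieve an OSD's dm-crypt key from a "secured" key server
# over FTPS, keyed by the OSD UUID, and feed it to cryptsetup on stdin
# instead of storing it on the root fs. All names below are made up.
import io
import subprocess
from ftplib import FTP_TLS

KEY_SERVER = "keys.example.com"    # hypothetical key server
KEY_PATH = "/osd-keys/{uuid}.key"  # hypothetical path layout

def fetch_osd_key(osd_uuid):
    """Return the dm-crypt key for the given OSD UUID as bytes."""
    buf = io.BytesIO()
    ftps = FTP_TLS(KEY_SERVER)
    ftps.login()   # anonymous here; a real setup would authenticate
    ftps.prot_p()  # protect the data channel, too
    ftps.retrbinary("RETR " + KEY_PATH.format(uuid=osd_uuid), buf.write)
    ftps.quit()
    return buf.getvalue()

def open_osd_device(osd_uuid, device, mapping_name):
    """Unlock the OSD's device, passing the key via stdin ("-")."""
    key = fetch_osd_key(osd_uuid)
    subprocess.run(
        ["cryptsetup", "luksOpen", "--key-file", "-", device, mapping_name],
        input=key, check=True)

The point being: nothing here needs to live on the node's root fs, and
the same fetch could run from initrd/dracut to unlock root and swap as
well.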