On Mon, Aug 3, 2020 at 4:48 PM Joao Eduardo Luis <joao@xxxxxxx> wrote:
>
> This proposal might be a bit thin on details, but I would love to have some
> feedback and gauge the broader community's and developers' interest, as
> well as to poke holes in the current idea.
>
> All comments welcome.
>
>   -Joao
>
>
> MOTIVATION
> ----------
>
> Even though we currently have at-rest encryption, ensuring data security on
> the physical device, this is currently on a per-OSD basis, and it is too
> coarse-grained to allow different entities/clients/tenants to have their
> data encrypted with different keys.
>
> The intent here is to allow different tenants to have their data encrypted
> at rest, independently, and without necessarily relying on full OSD
> encryption. This way one could have anywhere from a handful to dozens or
> hundreds of tenants with their data encrypted on disk, while not having to
> maintain full at-rest encryption should the administrator consider it too
> cumbersome or unnecessary.

I would be interested to hear the tenant use-case where they trust the
backing storage system (Ceph) with all things encryption and don't have any
effective control over the keys / ciphers / rotation policies / etc. If you
have a vulnerability that exposes the current OSD dm-crypt keys, I would
think it would be possible to get the per-namespace keys through a similar
vector if they are stored effectively side-by-side?

> While there are very good arguments for ensuring this encryption is
> performed client-side, such that each client actively controls their own
> secrets, a server-side approach has several other benefits that may
> outweigh a client-side approach.
>
> On the one hand,
>
> * encrypting server-side means encrypting N times, depending on the
>   replication size and scheme;
> * the secrets keyring will be centralized, likely in the monitor, much like
>   what we do for dm-crypt, even though encrypted;
> * on-the-wire data will still need to rely on msgr2 encryption, though one
>   could argue that this will likely happen regardless of whether a client-
>   or server-side approach is used.
>
> But on the other,
>
> 1. encryption becomes transparent to the client, avoiding the effort of
>    implementing such schemes in client libraries and kernel drivers;

Just an FYI: krbd supports client-side encryption via dm-crypt, kernel CephFS
is actively looking to incorporate fscrypt, librbd can utilize QEMU-layered
LUKS for many use-cases, and work is in progress on built-in librbd
client-side encryption. RGW has had client-side encryption for a while.

> 2. tighter control over the unit of data being encrypted, reducing the load
>    of encrypting a whole object versus a disk block in bluestore.

RBD client-side encryption doesn't rely on the underlying object size (512
bytes for dm-crypt I think, and we are looking at 4KiB blocks for the librbd
built-in encryption). I can't speak for CephFS+fscrypt, but I suspect it
wouldn't require re-encrypting the full file or backing object (probably a
4KiB page).

> 3. older clients will be able to support encryption out of the box, given
>    they will have no idea their data is being encrypted, nor how that is
>    happening.
>
>
> CHOOSING NAMESPACES
> -------------------
>
> While investigating where and how per-tenant encryption could be
> implemented, two other ideas were on the table:
>
> 1. on a per-client basis, relying on cephx entities, with an encryption key
>    per client, or a shared key amongst several clients; this key would be
>    kept encrypted in the monitor's kv store with the entity's cephx key.
>
> 2. on a per-pool basis.
>
> The first one would definitely be feasible, but potentially tricky to
> implement just right, without too many exceptions or involvement of other
> portions of the stack. E.g., dealing with metadata could become tricky.
> Then again, there wasn't one reason that could not be addressed and become
> a showstopper.
>
> As for 2., it would definitely be the easiest to implement: the pool is
> created with an 'encrypted' flag on, the key is kept in the monitors, and
> the OSDs encrypt any object belonging to that pool. The problem with this
> option, however, is how coarse-grained it is. If we really wanted a
> per-tenant approach, one would have to ensure one pool per tenant. Not
> necessarily a big deal if a lot of potentially small pools is acceptable.
> This idea was scrapped in favour of encrypting namespaces instead.
>
> Given RADOS already has the concept of a namespace, it might just be the
> ideal medium to implement such an approach, as we get the best of the two
> options above: we get finer-grained access than a pool provides, while
> still being able to limit access by entity through caps. We also get to
> have multiple namespaces in a single pool should we choose to do so. All
> the while, the concept is high-level enough that the actual encryption
> scheme might be implemented in a select handful of places, without the
> need for many (maybe any) particular exceptions or corner cases.
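
As an aside, the namespace plumbing is already in place on the client side,
so this part would slot in without new client work. For reference, a minimal
sketch using the librados Python bindings (the pool 'mypool', namespace
'tenant-a', and object name below are placeholders, and this assumes a
reachable cluster with python3-rados installed):

    import rados

    # Connect using the standard config; assumes a usable keyring.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('mypool')
        try:
            # Every subsequent op on this context is confined to the
            # 'tenant-a' namespace within the pool.
            ioctx.set_namespace('tenant-a')
            ioctx.write_full('greeting', b'hello from tenant-a')
            print(ioctx.read('greeting'))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

Combined with OSD caps along the lines of
'allow rw pool=mypool namespace=tenant-a', a cephx entity can already be
confined to its own namespace, which I assume is what a key-per-namespace
scheme would hang off of.
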
> APPROACH
> --------
>
> It is important to note that there are several implementation details,
> especially on "how exactly this is going to happen", that have not been
> fully figured out.
>
> Essentially, the objective is to ensure that objects from a given namespace
> are always encrypted or decrypted by bluestore when writing or reading the
> data. The hope is that performing encryption at this level will allow us to
>
> 1. ensure the operation is performed at the disk block size, so that small
>    or partial writes will not require a rewrite of the whole object; the
>    same goes for reads.
>
> 2. avoid dealing with all the mechanics involving objects and other
>    operations over them, and focus solely on their data and metadata.
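
On point 1, for what it's worth, that per-block property is essentially what
dm-crypt gets out of AES-XTS: each block is encrypted independently under a
tweak derived from its block number, so a partial overwrite only re-encrypts
the blocks it touches. A rough Python sketch of the idea, purely
illustrative (the 4KiB block size, the tweak derivation, and the cipher
choice are my assumptions, not a claim about what bluestore would actually
do):

    import os
    from cryptography.hazmat.primitives.ciphers import (
        Cipher, algorithms, modes,
    )

    BLOCK = 4096  # assumed block granularity

    def _xts(key, block_no):
        # The tweak is derived from the logical block number, so each
        # block encrypts/decrypts independently of its neighbours.
        tweak = block_no.to_bytes(16, 'little')
        return Cipher(algorithms.AES(key), modes.XTS(tweak))

    def encrypt_block(key, block_no, plaintext):
        enc = _xts(key, block_no).encryptor()
        return enc.update(plaintext) + enc.finalize()

    def decrypt_block(key, block_no, ciphertext):
        dec = _xts(key, block_no).decryptor()
        return dec.update(ciphertext) + dec.finalize()

    key = os.urandom(64)  # AES-256-XTS takes a 512-bit key
    data = os.urandom(BLOCK)
    assert decrypt_block(key, 0, encrypt_block(key, 0, data)) == data

The upshot is that rewriting one block never requires touching the rest of
the object, which is exactly the win over encrypting whole objects.
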
> Secret distribution is expected to be done by the monitors, at the OSDs'
> request. In an ideal world, the OSDs would know exactly which namespaces
> they might have to encrypt/decrypt, based on the pools they currently
> hold, and would request keys for those beforehand, so that they don't have
> to request a key from the monitor when an operation arrives. This would
> not only require us to become a bit more aware of namespaces, but keeping
> these keys cached might require the OSD to keep them encrypted in memory.
> What to use for that is something that hasn't been given much thought --
> maybe we could get away with using the OSD's cephx key.
>
> As for the namespaces, in their current form we don't have much (any?)
> information about them. Access to an object in a namespace is based on
> prior knowledge of that namespace and the object's name. We currently
> don't have statistics on namespaces, nor are we able to know whether an
> OSD keeps any object belonging to a namespace _before_ an operation on
> such an object is handled.
>
> Even though it's not particularly _required_ to get more out of namespaces
> than we currently have, it would definitely be ideal if we ended up with
> the ability to 1) have statistics out of namespaces, as that would be
> imperative if we're using them for tenants; and 2) be able to cache keys
> ahead of time for namespaces an OSD might have to handle (read, namespaces
> living in a pool with PGs mapped to that OSD).

-- 
Jason
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx