Re: Proposal: Encrypted Namespaces

Jan Fajerski <jfajerski@xxxxxxxx> · Thu, 6 Aug 2020 18:55:29 +0200

On Wed, Aug 05, 2020 at 02:38:09PM +0000, Joao Eduardo Luis wrote:
On 20/08/05 08:29AM, Jeff Layton wrote:
On Wed, 2020-08-05 at 11:22 +0000, Joao Eduardo Luis wrote:
> On 20/08/04 01:16PM, Jeff Layton wrote:
> > On Tue, 2020-08-04 at 09:04 -0400, Jason Dillaman wrote:
> > > On Mon, Aug 3, 2020 at 4:48 PM Joao Eduardo Luis <joao@xxxxxxx> wrote:
> > > > Even though we currently have at-rest encryption, ensuring data security on the
> > > > physical device, this is currently on an OSD-basis, and it is too coarse-grained
> > > > to allow different entities/clients/tenants to have their data encrypted with
> > > > different keys.
> > > >

Is it really valuable to allow different entities to use different keys,
when the OSDs have access to all of them? If the OSD's master store of
keys is compromised then all of the tenant data is compromised. This
wouldn't necessarily be the case with a scheme where the encryption is
done on the clients.

I mean, this is what we're doing for dmcrypt anyway, is it not?

I don't think the idea of encrypting namespaces should be seen as the
ultimate alternative to encrypting client-side, but as a mechanism similar to
what we do with at-rest encryption with dmcrypt, except on subdivisions of a
pool.

> That is a very fair point, and I don't disagree. I do think that having the
> client handling their own encryption, and being in control of their secrets,
> is the safest approach.
>
> However, the use cases brought to us have been somewhat like what we have with
> at-rest encryption with dmcrypt: the provider controls the keys, and retiring
> a tenant is simply a matter of destroying their key.
>
> Mind you, this is something we could achieve with pools, if the granularity
> wasn't so coarse IMO.
>

I guess I'm not 100% clear on the use-case for this.

Are there really folks just interested in knowing that the OSD will
store the data encrypted and don't really care that it may deal with
unencrypted data in its memory? Is this being driven by some sort of
regulatory requirement?

Just to expand on Joao's points a little. AFAIU the use case can be considered
a quick and safe data delete for tenants. Delete might be the wrong term here,
but "make data inaccessible" is a bit clunky. I think something regulatory plays
a role here, but essentially one requirement this can address is to quickly and
reliably make a tenant's data inaccessible (by deleting the encryption key).
I think this probably shouldn't be labeled as encryption in the sense of keeping
data confidential in the presence of adversarial agents; rather, encryption
can be looked at as a means to an end: safely deleting data at a certain time.
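
To make the "delete the key, lose the data" idea concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration only:
TenantStore, retire() and the toy HMAC-based keystream are hypothetical names
and are not Ceph code, and the cipher is not meant to be secure. It only shows
that destroying a tenant's key makes that tenant's data unreadable while the
ciphertext itself is left untouched.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter) blocks, concatenated.

    Illustration only; a real system would use a vetted cipher.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


class TenantStore:
    """Hypothetical in-memory stand-in for a store with per-tenant keys."""

    def __init__(self):
        self.keys = {}     # tenant -> key, held by the provider
        self.objects = {}  # (tenant, name) -> (nonce, ciphertext)

    def write(self, tenant: str, name: str, data: bytes) -> None:
        # Create the tenant's key on first use, then encrypt with a fresh nonce.
        key = self.keys.setdefault(tenant, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        stream = _keystream(key, nonce, len(data))
        ciphertext = bytes(a ^ b for a, b in zip(data, stream))
        self.objects[(tenant, name)] = (nonce, ciphertext)

    def read(self, tenant: str, name: str) -> bytes:
        key = self.keys[tenant]  # raises KeyError once the key is destroyed
        nonce, ciphertext = self.objects[(tenant, name)]
        stream = _keystream(key, nonce, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream))

    def retire(self, tenant: str) -> None:
        """Destroy only the key; the ciphertext stays where it is."""
        del self.keys[tenant]
```

After retire("tenant-a"), read() for that tenant fails with KeyError even
though its objects are still stored, and other tenants are unaffected. Note
that this says nothing about confidentiality against someone who can read the
provider's key store, which matches the scope described above.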

AFAIK, this is already the case for the at-rest encryption. The data is still
unencrypted in memory, and folks are just fine as long as the data is
encrypted at rest. The principle is essentially the same, I think.

> I'm guessing these filenames are being kept somewhere in an object in the
> object store? Would this also pose issues with xattrs and other metadata?
>

No, with fscrypt, the filenames themselves are encrypted in place, and you
need a key to see their unencrypted forms in a readdir().

What is sort of cool is that you can still access the tree without a key
at all, but all you see is encrypted filenames, and you can't do
anything with the files (other than unlink(), assuming you have the
correct permissions).

The main downside of fscrypt is that it can't encrypt metadata at all.
Stuff like the mtime or file size is under the purview of the filesystem
(the Ceph MDS in our case). It'd also be dangerous to (e.g.) show a
scrambled mode or ownership for the file, as you might not be able to
predict how applications might behave.

> Solely as an exercise, should the client encrypt the data and write it to the
> cluster, and write the metadata to an encrypted namespace, I believe the
> object itself could simply be encrypted as a whole, transparently for the
> client.
>

Encrypting metadata is pretty much going to be impossible with a solely
client-side solution. The client doesn't have much control over stuff
like mtime. That's all managed on the server side.

> Granted, at this point I'm not entirely sure how hard it would be to,
> or if it's even feasible to, encrypt the internal metadata used by bluestore,
> but from my conversation with Igor it seemed that encrypting the object's
> metadata would be feasible.

Yeah, if you have the OSD doing the encryption for you, then it could
(theoretically) store encrypted metadata too.

One thing you should consider as well: What will you do if someone has a
bunch of encrypted data and the key is (somehow) lost? You'll need some
way to be able to blow away old objects that can't be accessed anymore.

That is one of the reasons why I think namespaces need to become a bit more
self-aware: should one need to remove all objects belonging to a namespace, at
the moment, without knowing the objects beforehand, AFAIK one will need to
traverse a pool to find all the objects one by one.

But to your point, say that we are encrypting a namespace and the encryption
key is lost; that namespace is essentially useless now. The only reasonable
action is to remove the namespace's objects (although I'd leave that decision
to the administrator), even if, in the worst case scenario, one has to iterate
over a pool and check object by object whether it belongs to a namespace to be
destroyed.
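
As a rough sketch of that worst case, the following Python models a pool as a
plain dict keyed by (namespace, object name) and removes a namespace by
scanning every object. The dict and purge_namespace() are hypothetical
stand-ins for illustration, not librados calls; the point is only that the
cost is proportional to the size of the whole pool rather than to the size of
the namespace.

```python
def purge_namespace(pool: dict, namespace: str) -> int:
    """Remove every object in `namespace` by scanning the whole pool.

    `pool` is a toy model: a dict keyed by (namespace, object_name).
    Returns the number of objects removed.
    """
    # Collect first, then delete, so the dict is not mutated while iterating.
    doomed = [key for key in pool if key[0] == namespace]
    for key in doomed:
        del pool[key]
    return len(doomed)


pool = {
    ("tenant-a", "obj1"): b"...",
    ("tenant-a", "obj2"): b"...",
    ("tenant-b", "obj1"): b"...",
}
removed = purge_namespace(pool, "tenant-a")
```

If namespaces tracked their own objects, the scan over unrelated objects could
be avoided, which is the "self-aware" property mentioned above.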
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx

--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer