Re: [PATCH v2 1/3] dm-inlinecrypt: Add inline encryption support

Adrian Vovk <adrianvovk@xxxxxxxxx> · Thu, 24 Oct 2024 16:45:31 -0400

On Thu, Oct 24, 2024 at 3:21 PM John Stoffel <john@xxxxxxxxxxx> wrote:
>
> >>>>> "Adrian" == Adrian Vovk <adrianvovk@xxxxxxxxx> writes:
>
> > On Thu, Oct 24, 2024 at 4:11 AM Geoff Back <geoff@xxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >> On 24/10/2024 03:52, Adrian Vovk wrote:
> >> > On Wed, Oct 23, 2024 at 2:57 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> >> >> On Fri, Oct 18, 2024 at 11:03:50AM -0400, Adrian Vovk wrote:
> >> >>> Sure, but then this way you're encrypting each partition twice. Once by
> >> >>> the dm-crypt inside of the partition, and again by the dm-crypt that's
> >> >>> under the partition table. This double encryption is ruinous for
> >> >>> performance, so it's just not a feasible solution and thus people don't
> >> >>> do this. Would be nice if we had the flexibility though.
> >>
> >> As an encrypted-systems administrator, I would actively expect and
> >> require that stacked encryption layers WOULD each encrypt.  If I have
> >> set up full disk encryption, then as an administrator I expect that to
> >> be obeyed without exception, regardless of whether some higher level
> >> file system has done encryption already.
> >>
> >> Anything that allows a higher level to bypass the full disk encryption
> >> layer is, in my opinion, a bug and a serious security hole.
>
> > Sure I'm sure there's usecases where passthrough doesn't make sense.
> > It should absolutely be an opt-in flag on the dm target, so you the
> > administrator at setup time can choose whether or not you perform
> > double-encryption (and it defaults to doing so). Because there are
> > usecases where it doesn't matter, and for those usecases we'd set
> > the flag and allow passthrough for performance reasons.
>
> If you're so concerned about security that you're double or triple
> encrypting data at various layers, then obviously skipping encryption
> at a lower layer just because an upper layer says "Hey, I already
> encrypted this!" just doesn't make any sense.

I'm double or triple encrypting data at various layers to give myself
a broader set of keys that I can work with and revoke individually,
not because encrypting the same data more than once makes the
encryption more secure. I expressly _don't_ want to encrypt the data
multiple times, I just want the flexibility to wipe keys from memory
and make parts of the filesystem tree cryptographically inaccessible.

For example: my loop devices. I'd like to stack three layers of
encryption: the backing filesystem is encrypted, the loop devices are
encrypted, and inside the loop devices we use fscrypt. Specifically
in my use-case: the backing filesystem is the root filesystem, each
loopback file is a user's home directory, and each folder with fscrypt
encryption belongs to a single app. The root filesystem is protected
with generic full-disk-encryption, and contains both the home
directories and other lightly-sensitive data that can be shared
between users (WiFi passwords, installed software). The
full-disk-encryption key is in memory for as long as the system is
booted, so the data is protected while the system is powered off. Then
on top of that I have the home directories, which are in an encrypted
loopback file encrypted with a key derived from the user's login
password. We deem not only file names and contents sensitive, but also
directory structures and file metadata (xattrs, etc), so fscrypt is
not an option for this layer. The user's encryption key is in memory for as
long as the user is logged in. Note that this is a stronger protection
than the rootfs's encryption: each user's data is protected not only
when the device is off, but also when the user is logged out. A
similar pattern repeats for the fscrypt dirs: each app gets an
fscrypt-protected data directory, and keys are only given to the app
while it's running and while the user's session is unlocked. When the
user locks their device, or when an app is closed, that app's keys are
wiped from memory. This way, an attacker can get their hands on a
booted and logged-in device, but as long as it's locked they won't be
able to extract the necessary keys to read your banking app's login
credentials and steal your money (for example...)
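
To make the key lifetimes above concrete, here is a rough Python sketch of
which keys sit in memory in each device state, and which data is readable as
a result. It is only a model of the scheme described in this mail: the state
names, the key names ("disk", "user", "app") and the helper functions are all
made up for illustration and don't correspond to any kernel or userspace API.

```python
from enum import Enum, auto


class State(Enum):
    POWERED_OFF = auto()
    BOOTED_LOGGED_OUT = auto()
    LOGGED_IN_LOCKED = auto()
    UNLOCKED_APP_CLOSED = auto()
    UNLOCKED_APP_RUNNING = auto()


# Keys needed to reach each class of data (hypothetical names).
REQUIRED_KEYS = {
    "rootfs": {"disk"},
    "home": {"disk", "user"},
    "app_data": {"disk", "user", "app"},
}


def keys_in_memory(state: State) -> set:
    """Return the set of keys held in memory for a given device state."""
    keys = set()
    if state is not State.POWERED_OFF:
        # Full-disk key stays loaded for as long as the system is booted.
        keys.add("disk")
    if state in (State.LOGGED_IN_LOCKED,
                 State.UNLOCKED_APP_CLOSED,
                 State.UNLOCKED_APP_RUNNING):
        # User key stays loaded for as long as the user is logged in.
        keys.add("user")
    if state is State.UNLOCKED_APP_RUNNING:
        # App key only while the app runs and the session is unlocked.
        keys.add("app")
    return keys


def readable(data: str, state: State) -> bool:
    """Data is readable only if every key on its path is in memory."""
    return REQUIRED_KEYS[data] <= keys_in_memory(state)


if __name__ == "__main__":
    for state in State:
        visible = [d for d in REQUIRED_KEYS if readable(d, state)]
        print(state.name, "->", visible)
```

The point of the model is that locking the device or closing the app shrinks
the set of keys in memory, and with it the set of readable data, without
anything depending on data being encrypted more than once.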

That's the encryption scheme I'd like to implement on the Linux
desktop. We're part of the way there already, and we've hit the
double-encryption barrier. Note that none of the security of this
scheme comes from the fact that data is encrypted twice. Again we
don't want the data to be encrypted multiple times: the performance
cost that this would incur is a blocker to implementing this
encryption scheme. We stack encryption so that we can revoke parts of
the key hierarchy, not to actually stack encryption.

> So how does your scheme defend against bad actors?

Hopefully my explanation above makes my threat model and use-case a
bit more clear.

> I'm on a system with an encrypted disk.  I make a file and mount it
> with loop, and then encrypt it.  But it's slow!  So I turn off
> encryption.  Now I shutdown the loop device cleanly, unmount, and
> reboot the system.  So what should I be seeing in those blocks if I
> examine the plain file that's now not mounted?

Depends on what you mean by "turn off encryption".

I'm going to assume you mean that you just turned on the passthrough
mode in the lower dm-default-key table and did nothing else? In this
case, if you read the loopback file you'll get the ciphertext of the
lower encryption layer, or in other words you'll see the ciphertext
resulting from encrypting the loopback file twice. If you try to mount
the loopback file as you normally do, you'll just see gibberish data,
and the data will be inaccessible until you turn passthrough back off.
You already wrote data to disk that was encrypted twice, and now
you're telling the lower layer to do nothing for those block ranges.
So, the upper layer will see the lower layer's ciphertext, instead of
its own. It'll decrypt the lower layer's ciphertext with the upper
layer's key, and return the gibberish result that this operation
produces. New data you write to the loopback file will work as it's
supposed to: it'll be encrypted once, and can be read back fine.
Similar things will happen if you flip the passthrough flag from on to
off as well.  Changing the passthrough mode flag on a lower layer
requires a reformat of the data inside. It's similar behavior to
setting up dm-crypt with the wrong encryption algorithm or the wrong
key.
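
If it helps, the flag-flipping behavior described above can be simulated with
a toy model. This is a sketch under heavy assumptions: the "cipher" is a
throwaway XOR keystream (not real cryptography and nothing like what dm-crypt
uses), and the LowerLayer/UpperLayer classes and the already_encrypted flag
are invented for illustration, standing in for the lower dm target and the
upper encryption layer's signalling.

```python
import hashlib


def toy_cipher(key: bytes, sector: int, data: bytes) -> bytes:
    """Toy XOR stream cipher (encrypt == decrypt). NOT real crypto.

    Handles at most 32 bytes per sector, which is enough for this demo.
    """
    stream = hashlib.sha256(key + sector.to_bytes(8, "little")).digest()
    return bytes(b ^ s for b, s in zip(data, stream))


class LowerLayer:
    """Stand-in for the lower target with an opt-in passthrough flag."""

    def __init__(self, key: bytes, passthrough: bool = False):
        self.key = key
        self.passthrough = passthrough
        self.disk = {}  # sector -> bytes as stored on the "disk"

    def _skip(self, already_encrypted: bool) -> bool:
        # Only skip our own encryption if the admin enabled passthrough
        # AND the upper layer says the data is already encrypted.
        return self.passthrough and already_encrypted

    def write(self, sector: int, data: bytes, already_encrypted: bool = False):
        if not self._skip(already_encrypted):
            data = toy_cipher(self.key, sector, data)
        self.disk[sector] = data

    def read(self, sector: int, already_encrypted: bool = False) -> bytes:
        data = self.disk[sector]
        if not self._skip(already_encrypted):
            data = toy_cipher(self.key, sector, data)
        return data


class UpperLayer:
    """Stand-in for the encrypted loopback file on top of LowerLayer."""

    def __init__(self, key: bytes, lower: LowerLayer):
        self.key = key
        self.lower = lower

    def write(self, sector: int, plaintext: bytes):
        ciphertext = toy_cipher(self.key, sector, plaintext)
        self.lower.write(sector, ciphertext, already_encrypted=True)

    def read(self, sector: int) -> bytes:
        ciphertext = self.lower.read(sector, already_encrypted=True)
        return toy_cipher(self.key, sector, ciphertext)


if __name__ == "__main__":
    lower = LowerLayer(b"lower-key", passthrough=False)
    upper = UpperLayer(b"upper-key", lower)

    upper.write(0, b"old data")        # stored double-encrypted
    lower.passthrough = True           # flip the flag without reformatting
    print(upper.read(0) == b"old data")  # old data now reads as gibberish

    upper.write(1, b"new data")        # stored encrypted once
    print(upper.read(1) == b"new data")  # new data reads back fine
```

Flipping the flag back to off reverses the situation: sector 0 becomes
readable again and sector 1 turns into gibberish, which is why changing the
flag effectively requires a reformat.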

Let's say that you _do_ reformat the loopback file when you turn on
passthrough mode. So now you have passthrough mode on, and all data is
encrypted once on disk. Then you try to read the loopback file. You'll
see the same ciphertext that you'd see if you were stacking two
instances of dm-crypt on top of each other. The only difference from
dm-crypt is that this ciphertext is stored directly on disk, instead
of being encrypted a second time. So if you get the LBAs on disk, then
read the disk directly on those LBAs, you'll see the same ciphertext
as reading it through a filesystem. That's the difference.

> Could this be a way to smuggle data out because now the data written
> to the low level disk is encrypted with a much weaker algorithm?  So I
> can just take the system disk and read the raw data and find the data?

The weakest encryption algorithm is just storing the plaintext on
disk; XOR with a null key, if you will. Let's say that there's an
out-of-tree device-mapper driver for this, called dm-fake-encryption.
You're asking what would happen if we put an instance of
dm-fake-encryption on top of a passthrough-enabled dm-default-key.

Well, then data that went through dm-fake-encryption will be stored in
plaintext on disk, and data that doesn't go through dm-fake-encryption
will instead go through dm-default-key and get properly encrypted on
disk. If an attacker gets the disk, then all the data that was written
through the dm-fake-encryption layer will be stored in plaintext on
disk and the attacker will be able to read it. Data that didn't go
through dm-fake-encryption will be encrypted on disk and the attacker
will be unable to read it.

However, I don't understand the risk of smuggling this introduces. In
situations where smuggling data through layers of encryption is a risk
factor to worry about (a cloud VPS host?), the administrator would
just disable passthrough. With passthrough off on the layer below,
dm-fake-encryption's signalling is ignored and the data is
double-encrypted anyways. No smuggling possible then.
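
The dm-fake-encryption thought experiment can be written down as a small toy
model as well. Same caveats as before: the XOR keystream is a placeholder and
not real cryptography, and lower_write(), fake_encryption_write() and the
claims_encrypted flag are hypothetical names that only mirror the behavior
described in this mail, not any actual device-mapper interface.

```python
import hashlib

LOWER_KEY = b"lower-key"  # hypothetical key of the lower layer


def toy_cipher(key: bytes, sector: int, data: bytes) -> bytes:
    """Toy XOR stream cipher, at most 32 bytes per sector. NOT real crypto."""
    stream = hashlib.sha256(key + sector.to_bytes(8, "little")).digest()
    return bytes(b ^ s for b, s in zip(data, stream))


def fake_encryption_write(plaintext: bytes):
    """Model of dm-fake-encryption: leaves data as-is but claims it is
    encrypted. Returns (data, claims_encrypted)."""
    return plaintext, True


def lower_write(sector: int, data: bytes, claims_encrypted: bool,
                passthrough: bool) -> bytes:
    """Return the bytes that end up on disk for a write to the lower layer."""
    if passthrough and claims_encrypted:
        # Admin opted in to passthrough: trust the upper layer's claim.
        return data
    # Passthrough off, or no claim was made: always encrypt.
    return toy_cipher(LOWER_KEY, sector, data)


if __name__ == "__main__":
    secret = b"secret"
    data, flag = fake_encryption_write(secret)

    # Passthrough on: the fake layer's data lands on disk in plaintext.
    print(lower_write(3, data, flag, passthrough=True) == secret)

    # Passthrough off: the claim is ignored, data is encrypted on disk.
    print(lower_write(3, data, flag, passthrough=False) == secret)

    # Data that bypasses the fake layer is encrypted in either mode.
    print(lower_write(3, secret, False, passthrough=True) == secret)
```

In this model the only way plaintext reaches the disk is when the
administrator has enabled passthrough on the lower layer, which is the point
being made above.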

Could you walk me through a specific attack scenario for this that you
have in mind? Because I can't really think of one, so it's hard to
think of a solution.

Best,
Adrian