Re: [PATCH v4 3/7] iomap: support direct I/O with fscrypt using blk-crypto

Eric Biggers <ebiggers@xxxxxxxxxx> · Thu, 23 Jul 2020 20:46:28 -0700

On Fri, Jul 24, 2020 at 11:39:10AM +1000, Dave Chinner wrote:
> > fscrypt_set_bio_crypt_ctx() was introduced by
> > "fscrypt: add inline encryption support" on that branch.
> 
> Way too much static inline function abstraction.

Well, this is mostly because we're trying to be a good kernel citizen and
minimize the overhead when our feature isn't enabled or isn't being used.

Eventually we might remove CONFIG_FS_ENCRYPTION_INLINE_CRYPT and make
CONFIG_FS_ENCRYPTION always offer inline crypto support, which would simplify
things.  But for now we'd like to keep the separate option.

Also, previously people have complained about hardcoded S_ISREG() to decide
whether a file needs contents encryption, and said they prefer a helper function
fscrypt_needs_contents_encryption().  (Which is what we have now.)

So we can't make everyone happy...

> 
> fscrypt_inode_uses_inline_crypto() ends up being:
> 
> 	if (IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode) &&
> 	    inode->i_crypt_info->ci_inlinecrypt)
> 
> I note there are no checks for inode->i_crypt_info being non-null,
> and I note that S_ENCRYPTED is set on the inode when the on-disk
> encrypted flag is encountered, not when inode->i_crypt_info is set.
> 

->i_crypt_info is set when the file is opened, so it's guaranteed to be set for
any I/O.  So the case you're concerned about just doesn't happen.

If we're talking about handling a hypothetical bug where it does nevertheless
happen, there are three options:

(a) Crash (current behavior)
(b) Silently leave the file unencrypted, i.e. silently introduce a critical
    security vulnerability.
(c) Return an error, which would add a lot of code complexity because we'd have
    start returning an error from places like fscrypt_set_bio_crypt_ctx() that
    currently can't fail.  This would be dead code that's not tested.

I very much prefer (a), since it's the simplest option, and it also makes the
bug most likely to be reported and fixed, without leaving the possibility for
data to silently be left unencrypted.  Crashing isn't good (obviously), but the
other options are worse.

> Further, I note that the local variable for ci is fetched before
> fscrypt_inode_uses_inline_crypto() is run, so leaving a landmine for
> people who try to access ci before checking if inline crypto is
> enabled. Or, indeed, if S_ENCRYPTED is set and the crypt_info has
> not been set up, fscrypt_set_bio_crypt_ctx() will oops in the DIO
> path.

Sure, this is harmless currently, but we can move the assignment to 'ci'.

> > > > always be a no-op currently (since for now, iomap_dio_zero() will never be
> > > > called with an encrypted file) and thus wouldn't be properly tested?
> > > 
> > > Same can be said for this WARN_ON_ONCE() code :)
> > > 
> > > But, in the interests of not leaving landmines, if a fscrypt context
> > > is needed to be attached to the bio for data IO in direct IO, it
> > > should be attached to all bios that are allocated in the dio path
> > > rather than leave a landmine for people in future to trip over.
> > 
> > My concern is that if we were to pass the wrong 'lblk' to
> > fscrypt_set_bio_crypt_ctx(), we wouldn't catch it because it's not tested.
> > Passing the wrong 'lblk' would cause the data to be encrypted/decrypted
> > incorrectly.
> 
> When you implement sub-block DIO alignment for fscrypt enabled
> filesystems, fsx will tell you pretty quickly if you screwed up....
> 
> > Note that currently, I don't think iomap_dio_bio_actor() would handle an
> > encrypted file with blocksize > PAGE_SIZE correctly, as the I/O could be split
> > in the middle of a filesystem block (even after the filesystem ensures that
> > direct I/O on encrypted files is fully filesystem-block-aligned, which we do ---
> > see the rest of this patchset), which isn't allowed on encrypted files.
> 
> That can already happen unless you've specifically restricted DIO
> alignments in the filesystem code. i.e. Direct IO already supports
> sub-block ranges and alignment, and we can already do user DIO on
> sub-block, sector aligned ranges just fine. And the filesystem can
> already split the iomap on sub-block alignments and ranges if it
> needs to because the iomap uses byte range addressing, not sector or
> block based addressing.
> 
> So either you already have a situation where the 2^32 offset can
> land *inside* a filesystem block, or the offset is guaranteed to be
> filesystem block aligned and so you'll never get this "break an IO
> on sub-block alignment" problem regardless of the filesystem block
> size...
> 
> Either way, it's not an iomap problem - it's a filesystem mapping
> problem...
> 

I think you're missing the point here.  Currently, the granularity of encryption
(a.k.a. "data unit size") is always filesystem blocks, so that's the minimum we
can directly read or write to an encrypted file.  This has nothing to do with
the IV wraparound case also being discussed.

For example, changing a single bit in the plaintext of a filesystem block may
result in the entire block's ciphertext changing.  (The exact behavior depends
on the cryptographic algorithm that is used.)

That's why this patchset makes ext4 only allow direct I/O on encrypted files if
the I/O is fully filesystem-block-aligned.  Note that this might be a more
strict alignment requirement than the bdev_logical_block_size().

As long as the iomap code only issues filesystem-block-aligned bios, *given
fully filesystem-block-aligned inputs*, we're fine.  That appears to be the case
currently.  I was just pointing out that some changes might be needed to
maintain this property in the blocksize > PAGE_SIZE case (which again, for
encrypted files is totally unsupported at the moment anyway).

(It's possible that in the future we'll support other encryption data unit
sizes, perhaps powers of 2 from 512 to filesystem block size.  But for now the
filesystem block size has been good enough for everyone, and there would be a
significant performance hit when using a smaller size, so there hasn't been a
need to add the extra layer of complexity.)

- Eric