Re: [PATCH v4 3/7] iomap: support direct I/O with fscrypt using blk-crypto

Eric Biggers <ebiggers@xxxxxxxxxx> · Wed, 22 Jul 2020 16:43:12 -0700

On Wed, Jul 22, 2020 at 04:32:47PM -0700, Darrick J. Wong wrote:
> On Wed, Jul 22, 2020 at 04:26:25PM -0700, Eric Biggers wrote:
> > On Wed, Jul 22, 2020 at 03:34:04PM -0700, Eric Biggers wrote:
> > > So, something like this:
> > > 
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index 44bad4bb8831..2816194db46c 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -3437,6 +3437,15 @@ static int ext4_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
> > >  	map.m_len = min_t(loff_t, (offset + length - 1) >> blkbits,
> > >  			  EXT4_MAX_LOGICAL_BLOCK) - map.m_lblk + 1;
> > >  
> > > +	/*
> > > +	 * When inline encryption is enabled, sometimes I/O to an encrypted file
> > > +	 * has to be broken up to guarantee DUN contiguity.  Handle this by
> > > +	 * limiting the length of the mapping returned.
> > > +	 */
> > > +	if (!(flags & IOMAP_REPORT))
> > > +		map.m_len = fscrypt_limit_io_blocks(inode, map.m_lblk,
> > > +						    map.m_len);
> > > +
> > >  	if (flags & IOMAP_WRITE)
> > >  		ret = ext4_iomap_alloc(inode, &map, flags);
> > >  	else
> > > 
> > > 
> > > That also avoids any confusion between pages and blocks, which is nice.
> > 
> > Correction: for fiemap, ext4 actually uses ext4_iomap_begin_report() instead of
> > ext4_iomap_begin().  So we don't need to check for !IOMAP_REPORT.
> > 
> > Also it could make sense to limit map.m_len after ext4_iomap_alloc() rather than
> > before, so that we don't limit the length of the extent that gets allocated but
> > rather just the length that gets returned to iomap.
> 
> Naïve question here -- if the decision to truncate the bio depends on
> the file block offset, can you achieve the same thing by capping the
> length of the iovec prior to iomap_dio_rw?
> 
> Granted that probably only makes sense if the LBLK IV thing is only
> supposed to be used infrequently, and having to opencode a silly loop
> might be more hassle than it's worth...
> 

We *could* do the truncation there, but that would truncate the actual read() or
write().  So, userspace would see a short read or write.  And I understand that
while applications are *supposed* to handle short reads and writes, many don't.
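
For reference, handling short writes correctly in userspace means looping
until the whole buffer has been written. A minimal sketch, assuming a plain
blocking file descriptor; write_all() is a hypothetical helper name used only
for illustration and is not part of this patchset:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all() is a hypothetical helper, not an existing libc or kernel API.
 * Keep calling write() until all of buf has been written or a real error
 * occurs.  Returns 0 on success, -1 on error with errno set by write().
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL || len == 0);

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data moved; retry */
			return -1;
		}
		/* Short write: skip what was accepted and submit the rest. */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

An application that calls write() once and ignores the return value would
silently lose the tail of the buffer if we truncated the I/O at that level.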

I think Dave's suggestion (limiting the length of the mapping returned to
iomap) makes more sense, since this case would then be treated just like
normal fragmentation of the file.
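
To illustrate why a shortened mapping is invisible to the caller, here is a
userspace toy model, not kernel code. All names (toy_mapping, toy_map_begin,
toy_do_io) are hypothetical, and the fixed "boundary" is only an assumed
stand-in for wherever DUN contiguity would break:

```c
#include <stddef.h>

/* Toy stand-in for a mapping returned by a filesystem's mapping callback. */
struct toy_mapping {
	unsigned long start;	/* first logical block covered */
	unsigned long len;	/* number of blocks covered */
};

/*
 * Hypothetical mapping callback: never lets a mapping cross a multiple of
 * 'boundary' blocks (boundary must be nonzero).
 */
static struct toy_mapping toy_map_begin(unsigned long lblk, unsigned long len,
					unsigned long boundary)
{
	unsigned long left = boundary - (lblk % boundary);
	struct toy_mapping m = { lblk, len < left ? len : left };

	return m;
}

/*
 * Hypothetical I/O loop: keeps asking for mappings until the whole range is
 * covered.  Returns the number of blocks processed and stores the number of
 * mappings that were needed in *nr_mappings.
 */
static unsigned long toy_do_io(unsigned long lblk, unsigned long len,
			       unsigned long boundary,
			       unsigned int *nr_mappings)
{
	unsigned long done = 0;

	*nr_mappings = 0;
	while (done < len) {
		struct toy_mapping m = toy_map_begin(lblk + done, len - done,
						     boundary);

		done += m.len;
		(*nr_mappings)++;
	}
	return done;
}
```

In this model, a 10-block request starting at block 6 with a boundary of 8
takes two mappings but still completes all 10 blocks, the same as it would
for a file fragmented at that point.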

- Eric