Re: [PATCH 0/2] iomap: Waiting for IO in iomap_dio_rw()

Jan Kara <jack@xxxxxxx> · Thu, 10 Oct 2019 11:18:31 +0200

On Thu 10-10-19 10:02:27, Dave Chinner wrote:
> On Wed, Oct 09, 2019 at 10:41:24PM +0200, Jan Kara wrote:
> > Hello,
> > 
> > when doing the ext4 conversion of direct IO code to iomap, we found it very
> > difficult to handle inode extension with what iomap code currently provides.
> > Ext4 wants to do inode extension as sync IO (so that the whole duration of
> > IO is protected by inode->i_rwsem), also we need to truncate blocks beyond
> > end of file in case of error or short write. Now in ->end_io handler we don't
> > have the information how long originally the write was (to judge whether we
> > may have allocated more blocks than we actually used) and in ->write_iter
> > we don't know whether / how much of the IO actually succeeded in case of AIO.
> > 
> > Thinking about it for some time I think iomap code makes it unnecessarily
> > complex for the filesystem in case it decides it doesn't want to perform AIO
> > and wants to fall back to good old synchronous IO. In such case it is much
> > easier for the filesystem if it just gets normal error return from
> > iomap_dio_rw() and not just -EIOCBQUEUED.
> 
> Yeah, that'd be nice. :)
> 
> > The first patch in the series adds argument to iomap_dio_rw() to wait for IO
> > completion (internally iomap_dio_rw() already supports this!) and the second
> > patch converts XFS waiting for unaligned DIO write to this new API.
> > 
> > What do people think?
> 
> I've just caught up on the ext4 iomap dio thread where this came up,
> so I have some idea of what is going on now :)
> 
> My main issue is that I don't like the idea of a "force_wait"
> parameter to iomap_dio_rw() that overrides what the kiocb says to
> do inside iomap_dio_rw(). It just seems ... clunky.
> 
> I'd much prefer that the entire sync/async IO decision is done in
> one spot, and the result of that is passed into iomap_dio_rw(). i.e.
> the caller always determines the behaviour.
> 
> That would mean the callers need to do something like this by
> default:
> 
> 	ret = iomap_dio_rw(iocb, iter, ops, dops, is_sync_kiocb(iocb));
> 
> And filesystems like XFS will need to do:
> 
> 	ret = iomap_dio_rw(iocb, iter, ops, dops,
> 			is_sync_kiocb(iocb) || unaligned);

Yeah, I've considered that as well. I just didn't like repeating
is_sync_kiocb(iocb) in all the callers when all the callers actually have
to have something like (is_sync_kiocb(iocb) || (some special conditions))
to be correct. And in fact it is not a definitive decision either as
iomap_dio_rw() can decide to override caller's wish and do the IO
synchronously anyway (when it gets -ENOTBLK from the filesystem). That's why
I came up with 'force_wait' argument, which isn't exactly beautiful either, I
agree.

> and ext4 will calculate the parameter in whatever way it needs to.
> 
> In fact, it may be that a wrapper function is better for existing
> callers:
> 
> static inline ssize_t iomap_dio_rw()
> {
> 	return iomap_dio_rw_wait(iocb, iter, ops, dops, is_sync_kiocb(iocb));
> }
> 
> And XFS/ext4 writes call iomap_dio_rw_wait() directly. That way we
> don't need to change the read code at all...

Yeah, this is similar to what I had in my previous version [1]. There I had
__iomap_dio_rw() with bool argument, iomap_dio_rw() passing is_sync_kiocb(iocb)
to __iomap_dio_rw() (i.e., fully backward compatible), and iomap_dio_rw_wait()
which executed IO synchronously. But Christoph didn't like the wrappers.

I can go with just one wrapper like you suggest if that's what people
prefer. I don't care much we just have to settle on something...

								Honza

[1] https://lore.kernel.org/linux-ext4/20191008151238.GK5078@xxxxxxxxxxxxxx/
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR