On Wed, Oct 09, 2019 at 10:41:24PM +0200, Jan Kara wrote:
> Hello,
>
> when doing the ext4 conversion of direct IO code to iomap, we found it very
> difficult to handle inode extension with what iomap code currently provides.
> Ext4 wants to do inode extension as sync IO (so that the whole duration of
> IO is protected by inode->i_rwsem), also we need to truncate blocks beyond
> end of file in case of error or short write. Now in ->end_io handler we don't
> have the information how long originally the write was (to judge whether we
> may have allocated more blocks than we actually used) and in ->write_iter
> we don't know whether / how much of the IO actually succeeded in case of AIO.
>
> Thinking about it for some time I think iomap code makes it unnecessarily
> complex for the filesystem in case it decides it doesn't want to perform AIO
> and wants to fall back to good old synchronous IO. In such case it is much
> easier for the filesystem if it just gets normal error return from
> iomap_dio_rw() and not just -EIOCBQUEUED.

Yeah, that'd be nice. :)

> The first patch in the series adds argument to iomap_dio_rw() to wait for IO
> completion (internally iomap_dio_rw() already supports this!) and the second
> patch converts XFS waiting for unaligned DIO write to this new API.
>
> What do people think?

I've just caught up on the ext4 iomap dio thread where this came up,
so I have some idea of what is going on now :)

My main issue is that I don't like the idea of a "force_wait"
parameter to iomap_dio_rw() that overrides what the kiocb says to
do inside iomap_dio_rw(). It just seems ... clunky.

I'd much prefer that the entire sync/async IO decision is done in
one spot, and the result of that is passed into iomap_dio_rw(). i.e.
the caller always determines the behaviour.

That would mean the callers need to do something like this by
default:

	ret = iomap_dio_rw(iocb, iter, ops, dops, is_sync_kiocb(iocb));

And filesystems like XFS will need to do:

	ret = iomap_dio_rw(iocb, iter, ops, dops,
			is_sync_kiocb(iocb) || unaligned);

and ext4 will calculate the parameter in whatever way it needs to.

In fact, it may be that a wrapper function is better for existing
callers:

static inline ssize_t iomap_dio_rw()
{
	return iomap_dio_rw_wait(iocb, iter, ops, dops, is_sync_kiocb(iocb));
}

And XFS/ext4 writes call iomap_dio_rw_wait() directly. That way we
don't need to change the read code at all...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
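
For anyone who wants to poke at the calling convention outside the kernel, the wrapper idea can be modelled in plain userspace C. This is only a rough sketch under loud assumptions: every mock_* name, struct mock_kiocb and MOCK_EIOCBQUEUED are hypothetical stand-ins invented for illustration, and nothing here is the real iomap API or its signatures. The only thing it shows is that the wait decision is computed by the caller and passed in, and that the inline wrapper keeps existing callers unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical placeholder for the "queued, completion comes later" code. */
#define MOCK_EIOCBQUEUED 529

/* Hypothetical stand-in for struct kiocb: only records sync vs. async. */
struct mock_kiocb {
	bool sync;
};

/* Models is_sync_kiocb(): true if the submitter asked for synchronous IO. */
static bool mock_is_sync_kiocb(const struct mock_kiocb *iocb)
{
	return iocb->sync;
}

/*
 * Models the proposed iomap_dio_rw_wait(): the caller alone decides
 * whether to wait. Waiting yields a normal byte count, not waiting
 * yields the "queued" code.
 */
static ssize_t mock_dio_rw_wait(const struct mock_kiocb *iocb, size_t count,
				bool wait_for_completion)
{
	assert(iocb != NULL);
	if (!wait_for_completion)
		return -MOCK_EIOCBQUEUED;
	return (ssize_t)count;
}

/* Wrapper for existing callers: behaviour follows the kiocb as before. */
static inline ssize_t mock_dio_rw(const struct mock_kiocb *iocb, size_t count)
{
	return mock_dio_rw_wait(iocb, count, mock_is_sync_kiocb(iocb));
}

/*
 * A filesystem write path that forces a wait for unaligned IO, in the
 * style of the XFS example above.
 */
static ssize_t mock_fs_write(const struct mock_kiocb *iocb, size_t count,
			     bool unaligned)
{
	return mock_dio_rw_wait(iocb, count,
				mock_is_sync_kiocb(iocb) || unaligned);
}
```

In this model an async kiocb going through mock_fs_write() with unaligned set gets a plain byte count back, while the same kiocb through mock_dio_rw() still sees the queued code, which is the split between write paths and untouched read paths described above.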