On Wed, Aug 28, 2019 at 10:26:19PM +0200, Jan Kara wrote:
> On Mon 12-08-19 22:53:26, Matthew Bobrowski wrote:
> > This patch introduces a new direct IO write code path implementation
> > that makes use of the iomap infrastructure.
> >
> > All direct IO write operations are now passed from the ->write_iter() callback
> > to the new function ext4_dio_write_iter(). This function is responsible for
> > calling into iomap infrastructure via iomap_dio_rw(). Snippets of the direct
> > IO code from within ext4_file_write_iter(), such as checking whether the IO
> > request is unaligned asynchronous IO, or whether it will be overwriting
> > allocated and initialized blocks, have been moved out and into
> > ext4_dio_write_iter().
> >
> > The block mapping flags that are passed to ext4_map_blocks() from within
> > ext4_dio_get_block() and friends have effectively been taken out and
> > introduced within the ext4_iomap_begin(). If ext4_map_blocks() happens to have
> > instantiated blocks beyond the i_size, then we attempt to place the inode onto
> > the orphan list. Despite being able to perform i_size extension checking
> > earlier on in the direct IO code path, it makes most sense to perform this bit
> > post successful block allocation.
> >
> > The ->end_io() callback ext4_dio_write_end_io() is responsible for removing
> > the inode from the orphan list and determining if we should truncate a failed
> > write in the case of an error. We also convert a range of unwritten extents to
> > written if IOMAP_DIO_UNWRITTEN is set and perform the necessary
> > i_size/i_disksize extension if the iocb->ki_pos + dio->size > i_size_read(inode).
> >
> > In the instance of a short write, we fall back to buffered IO and complete
> > whatever is left in the 'iter'. Any blocks that may have been allocated in
> > preparation for direct IO will be reused by buffered IO, so there's no issue
> > with leaving allocated blocks beyond EOF.
> >
> > Signed-off-by: Matthew Bobrowski <mbobrowski@xxxxxxxxxxxxxx>
> > ---
> >  fs/ext4/file.c  | 227 ++++++++++++++++++++++++++++++++++++++++----------------
> >  fs/ext4/inode.c |  42 +++++++++--
> >  2 files changed, 199 insertions(+), 70 deletions(-)
>
> Overall this is very nice. Some smaller comments below.
>
> > @@ -235,6 +244,34 @@ static ssize_t ext4_write_checks(struct kiocb *iocb, struct iov_iter *from)
> >  	return iov_iter_count(from);
> >  }
> >
> > +static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
> > +					struct iov_iter *from)
> > +{
> > +	ssize_t ret;
> > +	struct inode *inode = file_inode(iocb->ki_filp);
> > +
> > +	if (!inode_trylock(inode)) {
> > +		if (iocb->ki_flags & IOCB_NOWAIT)
> > +			return -EOPNOTSUPP;
> > +		inode_lock(inode);
> > +	}
>
> Currently there's no support for IOCB_NOWAIT for buffered IO so you can
> replace this with "inode_lock(inode)".

IOCB_NOWAIT is supported for buffered reads. It is not supported on
buffered writes (as yet), so this should return EOPNOTSUPP if
IOCB_NOWAIT is set, regardless of whether the lock can be grabbed or
not.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
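
For illustration only, the ordering suggested in the reply above can be
modelled in a small user-space C sketch. Everything in it is a stand-in:
mock_kiocb, mock_inode, MOCK_IOCB_NOWAIT and mock_buffered_write_iter()
are hypothetical names, the flag value is arbitrary, and the "lock" is
just a counter. It is not the code from the patch or from ext4; it only
shows the IOCB_NOWAIT check happening before any attempt to take the
inode lock, so the result does not depend on lock contention.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
#define MOCK_IOCB_NOWAIT (1 << 0)	/* arbitrary value for this sketch */

struct mock_inode {
	int locked;	/* 1 while the mock lock is held */
	int lock_count;	/* how many times the mock lock was taken */
};

struct mock_kiocb {
	int ki_flags;
	struct mock_inode *inode;
};

static void mock_inode_lock(struct mock_inode *inode)
{
	inode->locked = 1;
	inode->lock_count++;
}

static void mock_inode_unlock(struct mock_inode *inode)
{
	inode->locked = 0;
}

/*
 * Model of a buffered write entry point. A NOWAIT request is rejected
 * up front, before the lock is touched, so the caller gets -EOPNOTSUPP
 * whether or not the lock would have been available.
 */
static long mock_buffered_write_iter(struct mock_kiocb *iocb, size_t count)
{
	long ret;

	if (iocb->ki_flags & MOCK_IOCB_NOWAIT)
		return -EOPNOTSUPP;

	mock_inode_lock(iocb->inode);
	ret = (long)count;	/* pretend the whole write succeeded */
	mock_inode_unlock(iocb->inode);

	return ret;
}
```

With the check placed first, a NOWAIT request never reaches the lock at
all, which is the difference from the trylock-then-check form in the
quoted hunk, where an uncontended lock would let a NOWAIT buffered write
proceed.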