On Thu 29-08-24 19:02:22, Yang Erkun wrote: > From: yangerkun <yangerkun@xxxxxxxxxx> > > Any extending write for ext4 requires the inode to be placed on the > orphan list before the actual write. In addition, the inode can be > actually removed from the orphan list only after all writes are > completed. Otherwise we'd leave allocated blocks beyond i_disksize if we > could not copy all the data into allocated block and e2fsck would > complain. > > Currently, direct IO and buffered IO comply with this logic(buffered > IO will truncate all overflow allocated blocks that has not been > written successfully, and direct IO will truncate all allocated blocks > when error occurs). However, dax write break this since dax write will > remove the inode from the orphan list by calling > ext4_handle_inode_extension unconditionally during extending write. > > We add a argument to help determine does we do a fully write, and for > the case not fully write, we leave the inode on the orphan list, and the > latter ext4_inode_extension_cleanup will help us truncate the overflow > allocated blocks, and then remove the inode from the orphan list. > > Signed-off-by: yangerkun <yangerkun@xxxxxxxxxx> Looks good. Feel free to add: Reviewed-by: Jan Kara <jack@xxxxxxx> Honza > --- > fs/ext4/file.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c > index be061bb64067..f14aed14b9cf 100644 > --- a/fs/ext4/file.c > +++ b/fs/ext4/file.c > @@ -306,7 +306,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, > } > > static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, > - ssize_t count) > + ssize_t written, ssize_t count) > { > handle_t *handle; > > @@ -315,7 +315,7 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, > if (IS_ERR(handle)) > return PTR_ERR(handle); > > - if (ext4_update_inode_size(inode, offset + count)) { > + if (ext4_update_inode_size(inode, offset + written)) { > int ret = ext4_mark_inode_dirty(handle, inode); > if (unlikely(ret)) { > ext4_journal_stop(handle); > @@ -323,11 +323,11 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, > } > } > > - if (inode->i_nlink) > + if ((written == count) && inode->i_nlink) > ext4_orphan_del(handle, inode); > ext4_journal_stop(handle); > > - return count; > + return written; > } > > /* > @@ -393,7 +393,7 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size, > if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) && > pos + size <= i_size_read(inode)) > return size; > - return ext4_handle_inode_extension(inode, pos, size); > + return ext4_handle_inode_extension(inode, pos, size, size); > } > > static const struct iomap_dio_ops ext4_dio_write_ops = { > @@ -669,7 +669,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) > ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops); > > if (extend) { > - ret = ext4_handle_inode_extension(inode, offset, ret); > + ret = ext4_handle_inode_extension(inode, offset, ret, count); > ext4_inode_extension_cleanup(inode, ret < (ssize_t)count); > } > out: > -- > 2.39.2 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR