On Tue, May 21, 2019 at 09:43:58AM +0200, Jan Kara wrote: > ext4_break_layouts() may fail e.g. due to a signal being delivered. > Thus we need to handle its failure gracefully and not by taking the > filesystem down. Currently ext4_break_layouts() failure is rare but it > may become more common once RDMA uses layout leases for handling > long-term page pins for DAX mappings. > > To handle the failure we need to move ext4_break_layouts() earlier > during setattr handling before we do hard to undo changes such as > modifying inode size. To be able to do that we also have to move some > other checks which are better done without holding i_mmap_sem earlier. > > Reported-by: "Weiny, Ira" <ira.weiny@xxxxxxxxx> > Signed-off-by: Jan Kara <jack@xxxxxxx> This fixes the bug I was seeing WRT ext4_break_layouts(). Thanks for the help! One more NIT comment below. > --- > fs/ext4/inode.c | 55 ++++++++++++++++++++++++++++--------------------------- > 1 file changed, 28 insertions(+), 27 deletions(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index c7f77c643008..979570b42e18 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -5571,7 +5571,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) > if (attr->ia_valid & ATTR_SIZE) { > handle_t *handle; > loff_t oldsize = inode->i_size; > - int shrink = (attr->ia_size <= inode->i_size); > + int shrink = (attr->ia_size < inode->i_size); > > if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) { > struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); > @@ -5585,18 +5585,35 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) > if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) > inode_inc_iversion(inode); > > - if (ext4_should_order_data(inode) && > - (attr->ia_size < inode->i_size)) { > - error = ext4_begin_ordered_truncate(inode, > + if (shrink) { > + if (ext4_should_order_data(inode)) { > + error = ext4_begin_ordered_truncate(inode, > attr->ia_size); > - if (error) > - goto err_out; > + if (error) > + goto err_out; > + } > + /* > + * Blocks are going to be removed from the inode. Wait > + * for dio in flight. > + */ > + inode_dio_wait(inode); > + } else { > + pagecache_isize_extended(inode, oldsize, inode->i_size); > } > + > + down_write(&EXT4_I(inode)->i_mmap_sem); > + > + rc = ext4_break_layouts(inode); > + if (rc) { > + up_write(&EXT4_I(inode)->i_mmap_sem); > + return rc; > + } > + > if (attr->ia_size != inode->i_size) { > handle = ext4_journal_start(inode, EXT4_HT_INODE, 3); > if (IS_ERR(handle)) { > error = PTR_ERR(handle); > - goto err_out; > + goto out_mmap_sem; > } > if (ext4_handle_valid(handle) && shrink) { > error = ext4_orphan_add(handle, inode); > @@ -5627,29 +5644,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) > if (error) { > if (orphan && inode->i_nlink) > ext4_orphan_del(NULL, inode); > - goto err_out; > + goto out_mmap_sem; This goto flows through a second ext4_orphan_del() call which threw me at first. But I think this is ok. Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx> And with the series applied. Tested-by: Ira Weiny <ira.weiny@xxxxxxxxx> > } > } > - if (!shrink) { > - pagecache_isize_extended(inode, oldsize, inode->i_size); > - } else { > - /* > - * Blocks are going to be removed from the inode. Wait > - * for dio in flight. > - */ > - inode_dio_wait(inode); > - } > - if (orphan && ext4_should_journal_data(inode)) > - ext4_wait_for_tail_page_commit(inode); > - down_write(&EXT4_I(inode)->i_mmap_sem); > - > - rc = ext4_break_layouts(inode); > - if (rc) { > - up_write(&EXT4_I(inode)->i_mmap_sem); > - error = rc; > - goto err_out; > - } > > + if (shrink && ext4_should_journal_data(inode)) > + ext4_wait_for_tail_page_commit(inode); > /* > * Truncate pagecache after we've waited for commit > * in data=journal mode to make pages freeable. > @@ -5660,6 +5660,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr) > if (rc) > error = rc; > } > +out_mmap_sem: > up_write(&EXT4_I(inode)->i_mmap_sem); > } > > -- > 2.16.4 >