From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>

In inode_init_always(), we clear the inode mapping flags, which clears
any retained error (AS_EIO, AS_ENOSPC) bits.  Unfortunately, we do not
also clear wb_err, which means that old mapping errors can leak through
to new inodes.

This is crucial for the XFS inode allocation path because we recycle old
in-core inodes and we do not want error state from an old file to leak
into the new file.  This bug was discovered by running generic/036 and
generic/047 in a loop and noticing that the EIOs generated by the
collision of direct and buffered writes in generic/036 would survive the
remount between 036 and 047, and get reported to the fsyncs (on
different files!) in generic/047.

Since we're changing the semantics of inode_init_always, we must also
change xfs_reinit_inode to retain the writeback error state when we go
to recover an inode that has been torn down in the vfs but not yet
disposed of by XFS.

Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
v3: clear error state when allocating new inode
v2: retain AS_EIO/AS_ENOSPC across xfs inode reinit
---
 fs/inode.c          | 1 +
 fs/xfs/xfs_icache.c | 9 +++++++++
 fs/xfs/xfs_inode.c  | 5 +++++
 3 files changed, 15 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index 13ceb98c3bd3..3b55391072f3 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -178,6 +178,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	mapping->a_ops = &empty_aops;
 	mapping->host = inode;
 	mapping->flags = 0;
+	mapping->wb_err = 0;
 	atomic_set(&mapping->i_mmap_writable, 0);
 	mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE);
 	mapping->private_data = NULL;
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 164350d91efc..d01f9544ff01 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -298,6 +298,10 @@ xfs_reinit_inode(
 	uint64_t	version = inode_peek_iversion(inode);
 	umode_t		mode = inode->i_mode;
 	dev_t		dev = inode->i_rdev;
+	errseq_t	old_err = inode->i_mapping->wb_err;
+	bool		as_eio = test_bit(AS_EIO, &inode->i_mapping->flags);
+	bool		as_enospc = test_bit(AS_ENOSPC,
+					     &inode->i_mapping->flags);
 
 	error = inode_init_always(mp->m_super, inode);
 
@@ -306,6 +310,11 @@ xfs_reinit_inode(
 	inode_set_iversion_queried(inode, version);
 	inode->i_mode = mode;
 	inode->i_rdev = dev;
+	inode->i_mapping->wb_err = old_err;
+	if (as_eio)
+		set_bit(AS_EIO, &inode->i_mapping->flags);
+	if (as_enospc)
+		set_bit(AS_ENOSPC, &inode->i_mapping->flags);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 02eae5059231..6c47ea3e577b 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -835,6 +835,11 @@ xfs_ialloc(
 		inode->i_mode |= S_ISGID;
 	}
 
+	/* Reset all vfs error state. */
+	inode->i_mapping->wb_err = 0;
+	clear_bit(AS_EIO, &inode->i_mapping->flags);
+	clear_bit(AS_ENOSPC, &inode->i_mapping->flags);
+
 	/*
 	 * If the group ID of the new file does not match the effective group
 	 * ID or one of the supplementary group IDs, the S_ISGID bit is cleared