On Tue, May 22, 2018 at 09:43:59AM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > In inode_init_always(), we clear the inode mapping flags, which clears > any retained error (AS_EIO, AS_ENOSPC) bits. Unfortunately, we do not > also clear wb_err, which means that old mapping errors can leak through > to new inodes. > > This is crucial for the XFS inode allocation path because we recycle old > in-core inodes and we do not want error state from an old file to leak > into the new file. This bug was discovered by running generic/036 and > generic/047 in a loop and noticing that the EIOs generated by the > collision of direct and buffered writes in generic/036 would survive the > remount between 036 and 047, and get reported to the fsyncs (on > different files!) in generic/047. > > Since we're changing the semantics of inode_init_always, we must also > change xfs_reinit_inode to retain the writeback error state when we go > to recover an inode that has been torn down in the vfs but not yet > disposed of by XFS. > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx> > --- > v3: clear error state when allocating new inode > v2: retain AS_EIO/AS_ENOSPC across xfs inode reinit > --- > fs/inode.c | 1 + > fs/xfs/xfs_icache.c | 9 +++++++++ > fs/xfs/xfs_inode.c | 5 +++++ > 3 files changed, 15 insertions(+) > > diff --git a/fs/inode.c b/fs/inode.c > index 13ceb98c3bd3..3b55391072f3 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -178,6 +178,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) > mapping->a_ops = &empty_aops; > mapping->host = inode; > mapping->flags = 0; > + mapping->wb_err = 0; > atomic_set(&mapping->i_mmap_writable, 0); > mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE); > mapping->private_data = NULL; > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c > index 164350d91efc..d01f9544ff01 100644 > --- a/fs/xfs/xfs_icache.c > +++ b/fs/xfs/xfs_icache.c > @@ -298,6 +298,10 @@ xfs_reinit_inode( > uint64_t version = inode_peek_iversion(inode); > umode_t mode = inode->i_mode; > dev_t dev = inode->i_rdev; > + errseq_t old_err = inode->i_mapping->wb_err; > + bool as_eio = test_bit(AS_EIO, &inode->i_mapping->flags); > + bool as_enospc = test_bit(AS_ENOSPC, > + &inode->i_mapping->flags); > > error = inode_init_always(mp->m_super, inode); > > @@ -306,6 +310,11 @@ xfs_reinit_inode( > inode_set_iversion_queried(inode, version); > inode->i_mode = mode; > inode->i_rdev = dev; > + inode->i_mapping->wb_err = old_err; > + if (as_eio) > + set_bit(AS_EIO, &inode->i_mapping->flags); > + if (as_enospc) > + set_bit(AS_ENOSPC, &inode->i_mapping->flags); > return error; > } > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c > index 02eae5059231..6c47ea3e577b 100644 > --- a/fs/xfs/xfs_inode.c > +++ b/fs/xfs/xfs_inode.c > @@ -835,6 +835,11 @@ xfs_ialloc( > inode->i_mode |= S_ISGID; > } > > + /* Reset all vfs error state. */ I'd prefer a comment that reminds why we have to do this. E.g., something like: /* * In-core inode reinit carries over vfs error state. Reset it * so we don't leak error state from a previous generation of * the inode. */ ... but feel free to reword that. > + inode->i_mapping->wb_err = 0; > + clear_bit(AS_EIO, &inode->i_mapping->flags); > + clear_bit(AS_ENOSPC, &inode->i_mapping->flags); > + I also wonder whether it might be cleaner to just reset this stuff in xfs_ifree() where we kill the i_mode and bump the gen and whatnot, but afaict this should work too. With the comment fix, at least: Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > /* > * If the group ID of the new file does not match the effective group > * ID or one of the supplementary group IDs, the S_ISGID bit is cleared > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html