Re: [PATCH 24/63] xfs: when replaying bmap operations, don't let unlinked inodes get reaped

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 05, 2016 at 06:07:27AM +1100, Dave Chinner wrote:
> On Tue, Oct 04, 2016 at 08:44:01AM -0400, Brian Foster wrote:
> > On Mon, Oct 03, 2016 at 05:29:25PM -0700, Darrick J. Wong wrote:
> > > On Mon, Oct 03, 2016 at 03:04:10PM -0400, Brian Foster wrote:
> > > > On Thu, Sep 29, 2016 at 08:08:17PM -0700, Darrick J. Wong wrote:
> > > > > Log recovery will iget an inode to replay BUI items and iput the inode
> > > > > when it's done.  Unfortunately, the iput will see that i_nlink == 0
> > > > > and decide to truncate & free the inode, which prevents us from
> > > > > replaying subsequent BUIs.  We can't skip the BUIs because we have to
> > > > > replay all the redo items to ensure that atomic operations complete.
> > > > > 
> > ...
> > > > 
> > > > > Since unlinked inode recovery will reap the inode anyway, we can
> > > > > safely introduce a new inode flag to indicate that an inode is in this
> > > > > 'unlinked recovery' state and should not be auto-reaped in the
> > > > > drop_inode path.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > > > ---
> > > > >  fs/xfs/xfs_bmap_item.c   |    1 +
> > > > >  fs/xfs/xfs_inode.c       |    8 ++++++++
> > > > >  fs/xfs/xfs_inode.h       |    6 ++++++
> > > > >  fs/xfs/xfs_log_recover.c |    1 +
> > > > >  4 files changed, 16 insertions(+)
> > > > > 
> > > > > 
> > ...
> > > > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > > > index e08eaea..0c25a76 100644
> > > > > --- a/fs/xfs/xfs_inode.c
> > > > > +++ b/fs/xfs/xfs_inode.c
> > > > > @@ -1855,6 +1855,14 @@ xfs_inactive(
> > > > >  	if (mp->m_flags & XFS_MOUNT_RDONLY)
> > > > >  		return;
> > > > >  
> > > > > +	/*
> > > > > +	 * If this unlinked inode is in the middle of recovery, don't
> > > > > +	 * truncate and free the inode just yet; log recovery will take
> > > > > +	 * care of that.  See the comment for this inode flag.
> > > > > +	 */
> > > > > +	if (xfs_iflags_test(ip, XFS_IRECOVER_UNLINKED))
> > > > > +		return;
> > > > > +
> > > > 
> > > > Also, it might be better to push this one block of code down since the
> > > > following block still deals with i_nlink > 0 properly (not that it will
> > > > likely affect the code as it is now, since we only handle eofblocks
> > > > trimming atm).
> > > 
> > > I put the jump-out case there so that we touch the inode's bmap as little
> > > as possible while we're recovering the inode.  Since the inode is still
> > > around in memory, so we'll end up back there at a later point anyway.
> > > 
> > 
> > I'm not quite following... it looks like we set the reclaim tag on the
> > inode unconditionally after we get through xfs_inactive(). That implies
> > the in-memory inode can go away at any point thereafter, unless somebody
> > else comes along and happens to look for it. Hmm?
> 
> Yup - the iunlink recover check needs to go into xfs_fs_drop_inode()
> to determine whether the inode should be dropped from the cache or
> not by iput_final(). That way it will never get near xfs_inactive()
> because the VFS won't try to evict it until the
> XFS_IRECOVER_UNLINKED flag is cleared.

(Ick, the iput code...)

So.... paging some of my notes back into memory, iput_final() will
still evict() an i_count == 0 inode even if op->drop_inode says
not to drop the inode if MS_ACTIVE is not set on the superblock:

iput_final() {
  if (op->drop_inode)
  	drop = op->drop_inode(inode);
  else
  	drop = generic_drop_inode(inode);

  if (!drop && (sb->s_flags & MS_ACTIVE)) {
  	inode->i_state |= I_REFERENCED;
  	inode_add_lru(inode);
  	spin_unlock(&inode->i_lock);
  	return;
  }

  /* do stuff */

  evict(inode);
}

MS_ACTIVE isn't set on the superblock during recovery because the
VFS doesn't set it until fill_super succeeds, and fill_super doesn't
return until we're done with log recovery.  Therefore, we can end up
in xfs_inactive during recovery even if _drop_inode just told the VFS
not to evict the inode.

IIRC that's why the IRECOVERY check ended up in xfs_inactive.  I
don't mind adding a second IRECOVERY check to xfs_fs_drop_inode,
but removing the one in xfs_inactive breaks recovery (xfs/329).

(Or, per a suggestion of Dave, I could just set MS_ACTIVE prior to
second stage log recovery.)

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux