Re: [PATCH 2/2] xfs: fix handling of RCU freed inodes from other AGs in xrep_iunlink_mark_incore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 12, 2024 at 07:23:01AM +0200, Christoph Hellwig wrote:
> When xrep_iunlink_mark_incore skips an inode because it was RCU freed
> from another AG, the slot for the inode in the batch array needs to be
> zeroed.  Also un-duplicate the check and remove the need for the
> xrep_iunlink_igrab helper.
> 
> Fixes: ab97f4b1c030 ("xfs: repair AGI unlinked inode bucket lists")
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  fs/xfs/scrub/agheader_repair.c | 28 +++++++---------------------
>  1 file changed, 7 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c
> index 2f98d90d7fd66d..558bc86b1b83c3 100644
> --- a/fs/xfs/scrub/agheader_repair.c
> +++ b/fs/xfs/scrub/agheader_repair.c
> @@ -1108,23 +1108,6 @@ xrep_iunlink_walk_ondisk_bucket(
>  	return 0;
>  }
>  
> -/* Decide if this is an unlinked inode in this AG. */
> -STATIC bool
> -xrep_iunlink_igrab(
> -	struct xfs_perag	*pag,
> -	struct xfs_inode	*ip)
> -{
> -	struct xfs_mount	*mp = pag->pag_mount;
> -
> -	if (XFS_INO_TO_AGNO(mp, ip->i_ino) != pag->pag_agno)
> -		return false;
> -
> -	if (!xfs_inode_on_unlinked_list(ip))
> -		return false;
> -
> -	return true;
> -}

This code is wrong. It does not explicitly check for RCU freed
inodes (i.e.  ip->i_ino = 0 or XFS_IRECLAIM being set) and so will
never detect stale RCU freed inodes in AG 0.

It is probably working by chance to avoid stale freed inodes
because ip->i_prev_unlinked will be 0 for such inodes.

*However*, this code does not have the necessary memory barriers to
guarantee it catches the ip->i_ino or ip->i_prev_unlinked writes
prior to freeing. The ip->i_ino check needs to be done under the
ip->i_flags_lock as it is the unlock->lock memory barrier that
the inode cache RCU lookup algorithms rely on for correct detection
for RCU freed inodes.

> -
>  /*
>   * Mark the given inode in the lookup batch in our unlinked inode bitmap, and
>   * remember if this inode is the start of the unlinked chain.
> @@ -1196,9 +1179,6 @@ xrep_iunlink_mark_incore(
>  		for (i = 0; i < nr_found; i++) {
>  			struct xfs_inode *ip = ragi->lookup_batch[i];
>  
> -			if (done || !xrep_iunlink_igrab(pag, ip))
> -				ragi->lookup_batch[i] = NULL;
> -
>  			/*
>  			 * Update the index for the next lookup. Catch
>  			 * overflows into the next AG range which can occur if
> @@ -1211,8 +1191,14 @@ xrep_iunlink_mark_incore(
>  			 * us to see this inode, so another lookup from the
>  			 * same index will not find it again.
>  			 */
> -			if (XFS_INO_TO_AGNO(mp, ip->i_ino) != pag->pag_agno)
> +			if (XFS_INO_TO_AGNO(mp, ip->i_ino) != pag->pag_agno) {
> +				ragi->lookup_batch[i] = NULL;
>  				continue;
> +			}
> +
> +			if (done || !xfs_inode_on_unlinked_list(ip))
> +				ragi->lookup_batch[i] = NULL;

Same with this new code - it's not explicitly checking for RCU freed
inodes and doesn't have the correct memory barriers.

Hence I think the fixes for this code are:

1. change xrep_iunlink_igrab() to use the same RCU freed inode
checks as xfs_blockgc_igrab(); and
2. remove the (XFS_INO_TO_AGNO(mp, ip->i_ino) != pag->pag_agno)
check altogether.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux