Re: [PATCH] xfs: fix perag structure refcounting error when scrub fails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 20 Aug 2021 at 10:36, Darrick J. Wong wrote:
> The kernel test robot found the following bug when running xfs/355 to
> scrub a bmap btree:
>
> XFS: Assertion failed: !sa->pag, file: fs/xfs/scrub/common.c, line: 412
> ------------[ cut here ]------------
> kernel BUG at fs/xfs/xfs_message.c:110!
> invalid opcode: 0000 [#1] SMP PTI
> CPU: 2 PID: 1415 Comm: xfs_scrub Not tainted 5.14.0-rc4-00021-g48c6615cc557 #1
> Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013
> RIP: 0010:assfail+0x23/0x28 [xfs]
> RSP: 0018:ffffc9000aacb890 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffffc9000aacbcc8 RCX: 0000000000000000
> RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffc09e7dcd
> RBP: ffffc9000aacbc80 R08: ffff8881fdf17d50 R09: 0000000000000000
> R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
> R13: ffff88820c7ed000 R14: 0000000000000001 R15: ffffc9000aacb980
> FS:  00007f185b955700(0000) GS:ffff8881fdf00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f7f6ef43000 CR3: 000000020de38002 CR4: 00000000001706e0
> Call Trace:
>  xchk_ag_read_headers+0xda/0x100 [xfs]
>  xchk_ag_init+0x15/0x40 [xfs]
>  xchk_btree_check_block_owner+0x76/0x180 [xfs]
>  xchk_btree_get_block+0xd0/0x140 [xfs]
>  xchk_btree+0x32e/0x440 [xfs]
>  xchk_bmap_btree+0xd4/0x140 [xfs]
>  xchk_bmap+0x1eb/0x3c0 [xfs]
>  xfs_scrub_metadata+0x227/0x4c0 [xfs]
>  xfs_ioc_scrub_metadata+0x50/0xc0 [xfs]
>  xfs_file_ioctl+0x90c/0xc40 [xfs]
>  __x64_sys_ioctl+0x83/0xc0
>  do_syscall_64+0x3b/0xc0
>
> The unusual handling of errors while initializing struct xchk_ag is the
> root cause here.  Since the beginning of xfs_scrub, the goal of
> xchk_ag_read_headers has been to read all three AG header buffers and
> attach them both to the xchk_ag structure and the scrub transaction.
> Corruption errors on any of the three headers doesn't necessarily
> trigger an immediate return to userspace, because xfs_scrub can also
> tell us to /fix/ the problem.
>
> In other words, it's possible for the xchk_ag init functions to return
> an error code and a partially filled out structure so that scrub can use
> however much information it managed to pull.  Before 5.15, it was
> sufficient to cancel (or commit) the scrub transaction on the way out of
> the scrub code to release the buffers.
>
> Ccommit 48c6615cc557 added a reference to the perag structure to struct
> xchk_ag.  Since perag structures are not attached to transactions like
> buffers are, this adds the requirement that the perag ref be released
> explicitly.  The scrub teardown function xchk_teardown was amended to do
> this for the xchk_ag embedded in struct xfs_scrub.
>
> Unfortunately, I forgot that certain parts of the scrub code probe
> multiple AGs and therefore handle the initialization and cleanup on
> their own.  Specifically, the bmbt scrubber will initialize it long
> enough to cross-reference AG metadata for btree blocks and for the
> extent mappings in the bmbt.
>
> If one of the AG headers is corrupt, the init function returns with a
> live perag structure reference and some of the AG header buffers.  If an
> error occurs, the cross referencing will be noted as XCORRUPTion and
> skipped, but the main scrub process will move on to the next record.
> It is now necessary to release the perag reference before we try to
> analyze something from a different AG, or else we'll trip over the
> assertion noted above.

Looks good to me.

Reviewed-by: Chandan Babu R <chandanrlinux@xxxxxxxxx>

>
> Fixes: 48c6615cc557 ("xfs: grab active perag ref when reading AG headers")
> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>  fs/xfs/scrub/bmap.c       |    3 ++-
>  fs/xfs/scrub/btree.c      |    3 ++-
>  fs/xfs/scrub/fscounters.c |    2 +-
>  fs/xfs/scrub/inode.c      |    3 ++-
>  4 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
> index 2df5e5a51cbd..017da9ceaee9 100644
> --- a/fs/xfs/scrub/bmap.c
> +++ b/fs/xfs/scrub/bmap.c
> @@ -263,7 +263,7 @@ xchk_bmap_iextent_xref(
>  	error = xchk_ag_init_existing(info->sc, agno, &info->sc->sa);
>  	if (!xchk_fblock_process_error(info->sc, info->whichfork,
>  			irec->br_startoff, &error))
> -		return;
> +		goto out_free;
>  
>  	xchk_xref_is_used_space(info->sc, agbno, len);
>  	xchk_xref_is_not_inode_chunk(info->sc, agbno, len);
> @@ -283,6 +283,7 @@ xchk_bmap_iextent_xref(
>  		break;
>  	}
>  
> +out_free:
>  	xchk_ag_free(info->sc, &info->sc->sa);
>  }
>  
> diff --git a/fs/xfs/scrub/btree.c b/fs/xfs/scrub/btree.c
> index fd832f103fa4..eccb855dc904 100644
> --- a/fs/xfs/scrub/btree.c
> +++ b/fs/xfs/scrub/btree.c
> @@ -377,7 +377,7 @@ xchk_btree_check_block_owner(
>  		error = xchk_ag_init_existing(bs->sc, agno, &bs->sc->sa);
>  		if (!xchk_btree_xref_process_error(bs->sc, bs->cur,
>  				level, &error))
> -			return error;
> +			goto out_free;
>  	}
>  
>  	xchk_xref_is_used_space(bs->sc, agbno, 1);
> @@ -393,6 +393,7 @@ xchk_btree_check_block_owner(
>  	if (!bs->sc->sa.rmap_cur && btnum == XFS_BTNUM_RMAP)
>  		bs->cur = NULL;
>  
> +out_free:
>  	if (init_sa)
>  		xchk_ag_free(bs->sc, &bs->sc->sa);
>  
> diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
> index 737aa5b39d5e..48a6cbdf95d0 100644
> --- a/fs/xfs/scrub/fscounters.c
> +++ b/fs/xfs/scrub/fscounters.c
> @@ -150,7 +150,7 @@ xchk_fscount_btreeblks(
>  
>  	error = xchk_ag_init_existing(sc, agno, &sc->sa);
>  	if (error)
> -		return error;
> +		goto out_free;
>  
>  	error = xfs_btree_count_blocks(sc->sa.bno_cur, &blocks);
>  	if (error)
> diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c
> index d6e0e3a11fbc..2405b09d03d0 100644
> --- a/fs/xfs/scrub/inode.c
> +++ b/fs/xfs/scrub/inode.c
> @@ -533,7 +533,7 @@ xchk_inode_xref(
>  
>  	error = xchk_ag_init_existing(sc, agno, &sc->sa);
>  	if (!xchk_xref_process_error(sc, agno, agbno, &error))
> -		return;
> +		goto out_free;
>  
>  	xchk_xref_is_used_space(sc, agbno, 1);
>  	xchk_inode_xref_finobt(sc, ino);
> @@ -541,6 +541,7 @@ xchk_inode_xref(
>  	xchk_xref_is_not_shared(sc, agbno, 1);
>  	xchk_inode_xref_bmap(sc, dip);
>  
> +out_free:
>  	xchk_ag_free(sc, &sc->sa);
>  }
>  


-- 
chandan



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux