Re: [PATCH v2] xfs_db: make sure agblocks is valid to prevent corruption

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 3 Sep 2024 12:28:55 +1000

On Mon, Sep 02, 2024 at 06:12:39PM +0800, liuhuan01@xxxxxxxxxx wrote:
> From: liuh <liuhuan01@xxxxxxxxxx>
> 
> Recently, I was testing xfstests. When I run xfs/350 case, it always generate coredump during the process.
> 	xfs_db -c "sb 0" -c "p agblocks" /dev/loop1
> 
> System will generate signal SIGFPE corrupt the process. And the stack as follow:
> corrupt at: (*bpp)->b_pag = xfs_perag_get(btp->bt_mount, xfs_daddr_to_agno(btp->bt_mount, blkno)); in function libxfs_getbuf_flags
> 	#0  libxfs_getbuf_flags
> 	#1  libxfs_getbuf_flags
> 	#2  libxfs_buf_read_map
> 	#3  libxfs_buf_read
> 	#4  libxfs_mount
> 	#5  init
> 	#6  main
> 
> The coredump was caused by the corrupt superblock metadata: (mp)->m_sb.sb_agblocks, it was 0.
> In this case, user cannot run in expert mode also.
> 
> So, try to get agblocks from agf/agi 0, if failed use the default geometry to calc agblocks.
> The worst thing is cannot get agblocks accroding above method, then set it to 1.
> 
> Signed-off-by: liuh <liuhuan01@xxxxxxxxxx>
> ---
>  db/Makefile |   2 +-
>  db/init.c   | 128 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 129 insertions(+), 1 deletion(-)
> 
> diff --git a/db/Makefile b/db/Makefile
> index 83389376..322d5617 100644
> --- a/db/Makefile
> +++ b/db/Makefile
> @@ -68,7 +68,7 @@ CFILES = $(HFILES:.h=.c) \
>  LSRCFILES = xfs_admin.sh xfs_ncheck.sh xfs_metadump.sh
>  
>  LLDLIBS	= $(LIBXFS) $(LIBXLOG) $(LIBFROG) $(LIBUUID) $(LIBRT) $(LIBURCU) \
> -	  $(LIBPTHREAD)
> +	  $(LIBPTHREAD) $(LIBBLKID)
>  LTDEPENDENCIES = $(LIBXFS) $(LIBXLOG) $(LIBFROG)
>  LLDFLAGS += -static-libtool-libs
>  
> diff --git a/db/init.c b/db/init.c
> index cea25ae5..15124ee2 100644
> --- a/db/init.c
> +++ b/db/init.c
> @@ -38,6 +38,121 @@ usage(void)
>  	exit(1);
>  }
>  
> +static void
> +xfs_guess_default_ag_geometry(uint64_t *agsize, uint64_t *agcount, struct libxfs_init *x)
> +{
> +	struct fs_topology	ft;
> +	int			blocklog;
> +	uint64_t		dblocks;
> +	int			multidisk;
> +
> +	memset(&ft, 0, sizeof(ft));
> +	get_topology(x, &ft, 1);
> +
> +	/*
> +	 * get geometry from get_topology result.
> +	 * Use default block size (2^12)
> +	 */
> +	blocklog = 12;
> +	multidisk = ft.data.swidth | ft.data.sunit;
> +	dblocks = x->data.size >> (blocklog - BBSHIFT);
> +	calc_default_ag_geometry(blocklog, dblocks, multidisk,
> +				 agsize, agcount);
> +}
> +
> +static xfs_agblock_t
> +xfs_get_agblock_from_agf(struct xfs_mount *mp)
> +{
> +	xfs_agblock_t agblocks = 0;
> +	int error;
> +	struct xfs_buf *bp;
> +	struct xfs_agf *agf;
> +
> +	error = -libxfs_buf_read_uncached(mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, 0, XFS_AGF_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL);

You're not checking what the code you wrote actually does.

#define XFS_AGB_TO_DADDR(mp,agno,agbno) \
        ((xfs_daddr_t)XFS_FSB_TO_BB(mp, \
                (xfs_fsblock_t)(agno) * (mp)->m_sb.sb_agblocks + (agbno)))
#define XFS_AG_DADDR(mp,agno,d)         (XFS_AGB_TO_DADDR(mp, agno, 0) + (d))

Yup, XFS_AG_DADDR() uses the very field we've already decided is
invalid....

In these cases where we specifically want to read from the AG 0
headers and we can't trust anything in the superblock, we need to
open code the daddrs we need to read from.

In this case the daddr of AGF 0 is simply XFS_AGF_DADDR(mp). And we
don't need XFS_FSS_TO_BB(), either - we can simply read a single
sector at that location as the AGF and AGI headers always fit in a
single sector.

Hence this should be:

	error = -libxfs_buf_read_uncached(mp->m_ddev_targp, XFS_AGF_DADDR(mp),
			1, 0, &bp, NULL);

> +	if (error) {
> +		fprintf(stderr, "xfs_get_agblock from agf-0 error %d\n", error);
> +		return agblocks;

Return NULLAGBLOCK on failure.

> +	}
> +
> +	if (xfs_has_crc(mp) && !xfs_buf_verify_cksum(bp, XFS_AGF_CRC_OFF)) {
> +		fprintf(stderr, "xfs_get_agblock from agf-0 badcrc\n");
> +		return agblocks;
> +	}

We can't trust the CRC to discover corruption here - syzbot corrupts
structures and then recalculates the CRC. Hence a good CRC is not an
indication of a valid, uncorrupted structure. And ignoring the CRC
allows a single sector read....

> +	agf = bp->b_addr;
> +	agblocks = be32_to_cpu(agf->agf_length);

This is missing magic number checks to determine if the metadata
returned is actually an AGF block.

> +
> +	libxfs_buf_relse(bp);
> +
> +	return agblocks;
> +}
> +
> +static xfs_agblock_t
> +xfs_get_agblock_from_agi(struct xfs_mount *mp)
> +{
> +	xfs_agblock_t agblocks = 0;
> +	int error;
> +	struct xfs_buf *bp;
> +	struct xfs_agi *agi;
> +
> +	error = -libxfs_buf_read_uncached(mp->m_ddev_targp,
> +			XFS_AG_DADDR(mp, 0, XFS_AGI_DADDR(mp)),
> +			XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL);
> +	if (error) {
> +		fprintf(stderr, "xfs_get_agblock from agi-0 error %d\n", error);
> +		return agblocks;
> +	}
> +
> +	if (xfs_has_crc(mp) && !xfs_buf_verify_cksum(bp, XFS_AGI_CRC_OFF)) {
> +		fprintf(stderr, "xfs_get_agblock from agi-0 badcrc\n");
> +		return agblocks;
> +	}
> +
> +	agi = bp->b_addr;
> +	agblocks = be32_to_cpu(agi->agi_length);
> +
> +	libxfs_buf_relse(bp);
> +
> +	return agblocks;
> +}
> +
> +/*
> + * If sb_agblocks was damaged, try to read it from agf/agi 0.
> + * With read agf/agi fails use default geometry to calc agblocks/agcount.
> + * The worst thing is cannot get agblocks according above method, then set to 1.
> + */
> +static xfs_agblock_t
> +xfs_try_get_agblocks(struct xfs_mount *mp, struct libxfs_init *x)
> +{
> +	xfs_agblock_t agblocks;
> +	uint64_t agsize, agcount;
> +
> +	/* firset try to get agblocks from agf-0 */
> +	agblocks = xfs_get_agblock_from_agf(mp);
> +	if (XFS_FSB_TO_B(mp, agblocks) >= XFS_MIN_AG_BYTES &&

Ah, did you look at what XFS_FSB_TO_B() requires to be set?

#define XFS_FSB_TO_B(mp,fsbno)  ((xfs_fsize_t)(fsbno) << (mp)->m_sb.sb_blocklog)

Essentially, detected that the superblock is damaged by using an
adjacent field in the superblock (which also may be damaged) to do
the validity check isn't reliable here.

We have XFS_MIN_AG_BLOCKS and XFS_MAX_AG_BLOCKS constants defined,
so that is what the validity checks here should use...

> +		XFS_FSB_TO_B(mp, agblocks) <= XFS_MAX_AG_BYTES)
> +		return agblocks;

We shouldn't return agblocks here - we need to validate that it is
the same as the AGI length because we can't trust a single value to
be correct if we have corruption in the headers.

i.e. if both AGI and AGF magic numbers are valid, and the lengths
match, it's a pretty good indication that this is the actual length
of the AG. It just doesn't matter if the AGF/AGI are
otherwise damaged and fail CRCs or pass the CRCs because of
malicious corruption, but if they match we can probably trust it to
be correct for the purposes of diagnosis.

> +	/* second try to get agblocks from agi-0 */
> +	agblocks = xfs_get_agblock_from_agi(mp);
> +	if (XFS_FSB_TO_B(mp, agblocks) >= XFS_MIN_AG_BYTES &&
> +		XFS_FSB_TO_B(mp, agblocks) <= XFS_MAX_AG_BYTES)
> +		return agblocks;
> +
> +	/* third use default geometry to calc agblocks/agcount */
> +	xfs_guess_default_ag_geometry(&agsize, &agcount, x);
> +
> +	if (XFS_FSB_TO_B(mp, agsize) < XFS_MIN_AG_BYTES ||
> +		XFS_FSB_TO_B(mp, agsize) > XFS_MAX_AG_BYTES)
> +		agblocks = 1; /* the worst is set to 1 */

*no*.

agblocks = 1 is *not a valid size* - why set an out-of-bounds value
for something we are already know is out-of-bounds?

Further, xfs_guess_default_ag_geometry() should never return an
invalid AG size unless the block device is under the minimum size
of an XFS filesystem. In which case, there's really nothing we can
do and should actually abort this saying "device too small to hold a
valid XFS filesystem".

> +	else
> +		agblocks = agsize;

The resultant should be probably be compared against the lengths
found from the AGI/AGF checks. If it matches one of them, we have
more confidence that this is the correct length.

> +
> +	return agblocks;
> +}
> +
>  static void
>  init(
>  	int		argc,
> @@ -129,6 +244,19 @@ init(
>  		}
>  	}
>  
> +	/* If sb_agblocks was damaged, try to get agblocks */
> +	if (XFS_FSB_TO_B(&xmount, sbp->sb_agblocks) < XFS_MIN_AG_BYTES ||
> +		XFS_FSB_TO_B(&xmount, sbp->sb_agblocks) > XFS_MAX_AG_BYTES) {
> +		xfs_agblock_t agblocks;
> +		xfs_agblock_t bad_agblocks = sbp->sb_agblocks;
> +
> +		agblocks = xfs_try_get_agblocks(&xmount, &x);
> +		sbp->sb_agblocks = agblocks;
> +
> +		fprintf(stderr, "wrong agblocks, try to get from agblocks %u -> %u\n",
> +			bad_agblocks, sbp->sb_agblocks);

What does this error message mean? If I saw that, I wouldn't have a
clue what it means.

I'd expect output something like this on a AGf/AGI recovery:

Out of bounds superblock agblocks (%u) found.
Attempting recovery from AGF/AGI 0 metadata....
AGF length %u blocks found.
AGI length %u blocks found.
AGF/AGI length matches.
Using %u blocks for superblock agblocks.

Or if AGF or AGI recovery fails:

Out of bounds superblock agblocks (%u) found.
Attempting recovery from AGF/AGI 0 metadata....
AGF length recovery failed.
AGI length %u blocks found.
Attempting to guess AG length from device geometry. This may not work.
Guessed AG length is %u blocks.
Guessed length matches AGI length.
Using %u blocks for superblock agblocks.

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx