Re: moar weird metadata corruptions, this time on arm64

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 22 Nov 2022 12:58:06 +1100

On Mon, Nov 21, 2022 at 04:16:02PM -0800, Darrick J. Wong wrote:
> Hi all,
> 
> I've been running near-continuous integration testing of online fsck,
> and I've noticed that once a day, one of the ARM VMs will fail the test
> with out of order records in the data fork.
> 
> xfs/804 races fsstress with online scrub (aka scan but do not change
> anything), so I think this might be a bug in the core xfs code.  This
> also only seems to trigger if one runs the test for more than ~6 minutes
> via TIME_FACTOR=13 or something.
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/tree/tests/xfs/804?h=djwong-wtf
> 
> I added a debugging patch to the kernel to check the data fork extents
> after taking the ILOCK, before dropping ILOCK, and before and after each
> bmapping operation.  So far I've narrowed it down to the delalloc code
> inserting a record in the wrong place in the iext tree:
> 
> xfs_bmap_add_extent_hole_delay, near line 2691:
> 
> 	case 0:
> 		/*
> 		 * New allocation is not contiguous with another
> 		 * delayed allocation.
> 		 * Insert a new entry.
> 		 */
> 		oldlen = newlen = 0;
> 		xfs_iunlock_check_datafork(ip);		<-- ok here
> 		xfs_iext_insert(ip, icur, new, state);
> 		xfs_iunlock_check_datafork(ip);		<-- bad here
> 		break;
.....
> XFS (sda3): ino 0x6095c72 nr 0x4 offset 0x6a nextoff 0x85
> XFS: Assertion failed: got.br_startoff >= nextoff, file: fs/xfs/xfs_inode.c, line: 136
....
> XFS (sda3): ino 0x6095c72 func xfs_bmap_add_extent_hole_delay line 2691 data fork:
> XFS (sda3):    ino 0x6095c72 nr 0x0 nr_real 0x0 offset 0x26 blockcount 0x4 startblock 0xc119c4 state 0
> XFS (sda3):    ino 0x6095c72 nr 0x1 nr_real 0x1 offset 0x2a blockcount 0x26 startblock 0xcc457e state 1
> XFS (sda3):    ino 0x6095c72 nr 0x2 nr_real 0x2 offset 0x58 blockcount 0x12 startblock 0xcc45ac state 1
> XFS (sda3):    ino 0x6095c72 nr 0x3 nr_real 0x3 offset 0x70 blockcount 0x15 startblock 0xffffffffe0007 state 0
> XFS (sda3):    ino 0x6095c72 nr 0x4 nr_real 0x3 offset 0x6a blockcount 0x6 startblock 0xcc45be state 0
> XFS (sda3):    ino 0x6095c72 nr 0x5 nr_real 0x4 offset 0xa7 blockcount 0x19 startblock 0x17ff88 state 0

So icur prior to insertion should point to record 0x5, offset 0xa7
(right).  Prev (left) should point to record 0x4, offset 0x6a.

This makes both left and right valid, and while left is adjacent,
it's a different type so isn't contiguous.

So falling through to "case 0" is correct.

But then it inserts it at index 0x3 before record 0x4, not
at index 0x4 before record 0x5.

>From xfs_iext_insert():

        for (i = nr_entries; i > cur->pos; i--)
		cur->leaf->recs[i] = cur->leaf->recs[i - 1];
	xfs_iext_set(cur_rec(cur), irec);

This implies cur->pos is wrong. i.e. it made a hole cur->pos = 0x3
and inserted there, not at cur->pos = 0x4.

Can you add debug to trace the iext cursor as
xfs_buffered_write_iomap_begin() ->
  xfs_bmapi_reserve_delalloc() ->
    xfs_bmap_add_extent_hole_delay()->
      xfs_iext_insert()

runs? The iext cursor could have been wrong for some time before
this insert tripped over it, so this may just be the messenger that
something has just smashed the stack (icur is a stack variable).

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx