Re: [Bug report][fstests generic/047] Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c. Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]


On Tue, Nov 07, 2023 at 03:26:27AM +0800, Zorro Lang wrote:
> On Mon, Nov 06, 2023 at 05:13:30PM +1100, Dave Chinner wrote:
> > On Sun, Oct 29, 2023 at 12:11:22PM +0800, Zorro Lang wrote:
> > > Hi xfs list,
> > > 
> > > Recently I have been consistently hitting XFS corruption when running
> > > fstests generic/047 [1], and it shows more failures in dmesg [2], e.g.:
> > 
> > OK, g/047 is an fsync test.
> > 
> > > 
> > >   XFS (loop1): Internal error !(flags & XFS_DABUF_MAP_HOLE_OK) at line 2572 of file fs/xfs/libxfs/xfs_da_btree.c.  Caller xfs_dabuf_map.constprop.0+0x26c/0x368 [xfs]
> > 
> > Ok, a directory block index translated to a hole in the file
> > mapping. That's bad...
....
> > > _check_xfs_filesystem: filesystem on /dev/loop1 is inconsistent (r)
> > > *** xfs_repair -n output ***
> > > Phase 1 - find and verify superblock...
> > > Phase 2 - using internal log
> > >         - zero log...
> > >         - scan filesystem freespace and inode maps...
> > >         - found root inode chunk
> > > Phase 3 - for each AG...
> > >         - scan (but don't clear) agi unlinked lists...
> > >         - process known inodes and perform inode discovery...
> > >         - agno = 0
> > > bad nblocks 9 for inode 128, would reset to 0
> > > no . entry for directory 128
> > > no .. entry for root directory 128
> > > problem with directory contents in inode 128
> > > would clear root inode 128
> > > bad nblocks 8 for inode 131, would reset to 0
> > > bad nblocks 8 for inode 132, would reset to 0
> > > bad nblocks 8 for inode 133, would reset to 0
> > > ...
> > > bad nblocks 8 for inode 62438, would reset to 0
> > > bad nblocks 8 for inode 62439, would reset to 0
> > > bad nblocks 8 for inode 62440, would reset to 0
> > > bad nblocks 8 for inode 62441, would reset to 0
> > 
> > Yet all the files - including the data files that were fsync'd - are
> > bad.
> > 
> > Apparently the journal has been recovered, but lots of metadata
> > updates that should have been in the journal are missing after
> > recovery has completed? That doesn't make a whole lot of sense -
> > when did these tests start failing? Can you run a bisect?
> 
> Hi Dave,
> 
> Thanks for your reply :) I spent a long time on a kernel bisect, but
> found nothing ... Then I suddenly found that it started failing with an
> xfsprogs change [1].
> 
> That change isn't the root cause of this bug (on s390x); it just
> enabled "nrext64" by default, which I had never tested on s390x before.
> So for now we know this is an issue with that feature, and so far it
> only shows up on s390x.

That's not good. Can you please determine if this is a zero-day bug
with the nrext64 feature? I think it was merged in 5.19, so if you
could try to reproduce it on 5.18 and 5.19 kernels first, that
would be handy.

Also, from your s390 kernel build, can you get the pahole output
for the struct xfs_dinode both for a good kernel and a bad kernel?
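
For anyone unfamiliar with the request: pahole prints each structure
member's offset and size and flags padding holes, so comparing the output
from two builds shows whether a structure's layout changed between them.
Below is a rough user-space sketch of that kind of comparison using
Python's ctypes. DemoOld and DemoNew are made-up structures for
illustration only; they are not the real struct xfs_dinode, and the
helper names (layout, holes, diff_layouts) are hypothetical.

```python
import ctypes


class DemoOld(ctypes.Structure):
    # Hypothetical "before" layout (NOT the real struct xfs_dinode).
    _fields_ = [
        ("magic", ctypes.c_uint16),
        ("mode", ctypes.c_uint16),
        ("counter", ctypes.c_uint32),
        ("size", ctypes.c_uint64),
    ]


class DemoNew(ctypes.Structure):
    # Hypothetical "after" layout: the counter was widened to 64 bits.
    _fields_ = [
        ("magic", ctypes.c_uint16),
        ("mode", ctypes.c_uint16),
        ("counter", ctypes.c_uint64),
        ("size", ctypes.c_uint64),
    ]


def layout(struct_cls):
    """Return [(field_name, offset, size_in_bytes)] in declaration order."""
    rows = []
    for field in struct_cls._fields_:
        name = field[0]
        desc = getattr(struct_cls, name)  # field descriptor with offset/size
        rows.append((name, desc.offset, desc.size))
    return rows


def holes(struct_cls):
    """Return [(offset, length)] for padding gaps, including tail padding."""
    gaps = []
    end = 0
    for _name, offset, size in layout(struct_cls):
        if offset > end:
            gaps.append((end, offset - end))
        end = offset + size
    total = ctypes.sizeof(struct_cls)
    if total > end:
        gaps.append((end, total - end))
    return gaps


def diff_layouts(old_cls, new_cls):
    """Map field name -> (old (offset, size), new (offset, size)) where they differ."""
    old = {n: (o, s) for n, o, s in layout(old_cls)}
    new = {n: (o, s) for n, o, s in layout(new_cls)}
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


if __name__ == "__main__":
    for cls in (DemoOld, DemoNew):
        print(cls.__name__, "size:", ctypes.sizeof(cls), "holes:", holes(cls))
        for name, offset, size in layout(cls):
            print(f"    {name:8s} offset={offset:3d} size={size}")
    print("changed:", diff_layouts(DemoOld, DemoNew))
```

The exact offsets and holes reported for DemoNew depend on the ABI's
alignment rules, which is the point of asking for the output from the
actual s390 build rather than reasoning from another architecture.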

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



