Re: rm hanging, v6.1.35

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jul 11, 2023 at 01:10:11PM +1000, Dave Chinner wrote:
FYI, Chris: google is classifying your email as spam because it
doesn't have any dkim/dmarc authentication on it. You're lucky I
even got this, because I know that gmail mostly just drops such
email in a black hole now....

Dkim hopefully fixed now.

More on the problem topic below, and apologies for the aside, but this is my immediate concern: post the reboot into v5.15.118 I have one of the filesystems failing to mount with:

Jul 11 16:13:10 b2 kernel: XFS (dm-146): Metadata CRC error detected at xfs_allocbt_read_verify+0x12/0xb0 [xfs], xfs_bnobt block 0x10a00f7f8
Jul 11 16:13:10 b2 kernel: XFS (dm-146): Unmount and run xfs_repair
Jul 11 16:13:10 b2 kernel: XFS (dm-146): First 128 bytes of corrupted metadata buffer:
Jul 11 16:13:10 b2 kernel: 00000000: 41 42 33 42 00 00 01 ab 05 0a 78 fe 02 c0 2b ff  AB3B......x...+.
Jul 11 16:13:10 b2 kernel: 00000010: 00 00 00 01 0a 00 f7 f8 00 00 00 26 00 3b 2d 40  ...........&.;-@
Jul 11 16:13:10 b2 kernel: 00000020: cf 42 ed 07 78 a1 4e ff 9e 20 e3 d9 6f fc 3e 30  .B..x.N.. ..o.>0
Jul 11 16:13:10 b2 kernel: 00000030: 00 00 00 02 80 b2 25 41 08 c7 6c 00 00 00 01 28  ......%A..l....(
Jul 11 16:13:10 b2 kernel: 00000040: 08 c7 6d 3e 00 00 00 c2 08 c7 6f 65 00 00 00 a7  ..m>......oe....
Jul 11 16:13:10 b2 kernel: 00000050: 08 c7 70 a5 00 00 00 3a 08 c7 71 00 00 00 00 c4  ..p....:..q.....
Jul 11 16:13:10 b2 kernel: 00000060: 08 c7 71 e8 00 00 00 18 08 c7 72 50 00 00 02 c3  ..q.......rP....
Jul 11 16:13:10 b2 kernel: 00000070: 08 c7 75 b1 00 00 02 4b 08 c7 78 db 00 00 02 3d  ..u....K..x....=
Jul 11 16:13:10 b2 kernel: XFS (dm-146): log mount/recovery failed: error -74
Jul 11 16:13:10 b2 kernel: XFS (dm-146): log mount failed

Then xfs_repair comes back with:
# xfs_repair /dev/vg-name/lv-name
Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

At this point is "xfs_repair -L" (and attendant potential data loss) my only option?


On Tue, Jul 11, 2023 at 11:57:16AM +1000, Chris Dunlop wrote:
On Tue, Jul 11, 2023 at 10:13:31AM +1000, Chris Dunlop wrote:
On Tue, Jul 11, 2023 at 07:53:54AM +1000, Chris Dunlop wrote:
Hi,

This box is newly booted into linux v6.1.35 (2 days ago), it was
previously running v5.15.118 without any problems (other than that
fixed by "5e672cd69f0a xfs: non-blocking inodegc pushes", the reason
for the upgrade).

I have rm operations on two files that have been stuck for in excess
of 22 hours and 18 hours respectively:
...

Full sysrq-w output at:

https://file.io/tg7F5OqIWo1B

Ok, you have XFS on dm-writecache on md raid 5 and
everything is stuck in either dm-writecache or md.

Yes, dm-writecache, with the cache on SSD md raid6, in front of the XFS devices on lvm / ceph-rbd.

This task holds the wc_lock() while it is doing this writeback
flush.

All the XFS filesystems are stuck in similar ways, such as trying to
push the journal and getting stuck in IO submission in
dm-writecache:
...
These are getting into dm-writecache, and the lock they are getting
stuck on is the wc_lock().
...

Directory reads are getting stuck in dm-write-cache:
...
Same lock.

File data reads are getting stuck the same way in dm-writecache.

File data writes (i.e. writeback) are getting stuck the same way
in dm-writecache.

Yup, everything is stuck on dm-writecache flushing to the underlying
RAID-5 device.

I don't see anything indicating a filesystem problem here. This
looks like a massively overloaded RAID 5 volume. i.e. the fast
storage that makes up the write cache has filled, and now everything
is stuck waiting for the fast cache to drain to the slow backing
store before new writes can be made to the fast cache. IOWs,
everything is running as RAID5 write speed and there's a huge
backlog of data being written to the RAID 5 volume(s) keeping them
100% busy.

This had never been an issue with v5.15. Is it possible the upgrade to v6.1 had a hand in this or that's probably just coincidence?

In particular, could "5e672cd69f0a xfs: non-blocking inodegc pushes" cause a significantly greater write load on the cache?

I note that there's one fs (separate to the corrupt one above) that's still in mount recovery since the boot some hours ago. On past experience that indicates the inodegc stuff is holding things up, i.e. it would have been running prior to the reboot - or at least trying to.

If there is no IO going to the RAID 5 volumes, then you've got a
RAID 5 or physical IO hang of some kind. But I can't find any
indication of that - everything looks like it's waiting on RAID 5
stripe population to make write progress...

There's been no change to the cache device over the reboot, and it
currently looks busy, but it doesn't look completely swamped:

                    load %user %nice  %sys  %iow  %stl %idle  dev rrqm/s wrqm/s    r/s    w/s    rkB/s    wkB/s  arq-sz  aqu-sz    await   rwait   wwait %util
2023-07-11-16:10:00 16.7   0.6   0.0   1.0  15.0   0.0  82.5 md10    0.0    0.0  172.5 1060.1 17131.66 27450.08   72.34    6.33     5.13    0.56     5.9  48.8
2023-07-11-16:11:00 17.2   0.8   0.0   1.1  15.6   0.0  81.7 md10    0.0    0.0  185.7 1151.4 18202.36 20681.51   58.16    8.18     6.12    0.53     7.0  44.0
2023-07-11-16:12:00 18.9   0.5   0.0   1.0  17.0   0.0  80.6 md10    0.0    0.0  179.9 1139.6 15617.51 26579.51   63.96    7.62     5.78    0.46     6.6  47.5
2023-07-11-16:13:00 18.7   0.6   0.0   1.0  15.4   0.0  82.1 md10    0.0    0.0  161.8 1399.7 15751.70 26273.20   53.83   11.47     7.35    0.51     8.1  47.4
2023-07-11-16:14:00 17.2   0.5   0.0   0.9  13.9   0.0  83.8 md10    0.0    0.0  190.2 1251.1 15207.77 20907.73   50.12   14.19     9.85    2.16    11.0  42.2
2023-07-11-16:15:00 17.0   0.6   0.0   1.0  13.8   0.0  83.7 md10    0.0    0.0  163.5 1571.2 14561.41 24508.45   45.05   14.38     8.29    0.53     9.1  46.7
2023-07-11-16:16:00 18.7   0.8   0.0   1.0  15.5   0.0  81.8 md10    0.0    0.0  164.6 1294.7 13256.49 20737.23   46.59   12.54     8.59    0.50     9.6  42.3
2023-07-11-16:17:00 18.9   0.6   0.0   1.0  14.0   0.0  83.4 md10    0.0    0.0  179.4 1026.7 14079.12 19095.93   55.01    9.12     7.56    0.91     8.7  44.8
2023-07-11-16:18:00 19.4   0.6   0.0   1.0  13.8   0.0  83.7 md10    0.0    0.0  151.6 1315.5 12935.14 23547.30   49.73   11.80     8.04    0.43     8.9  48.5
2023-07-11-16:19:00 19.6   0.6   0.0   1.0  11.9   0.0  85.7 md10    0.0    0.0  168.6 1895.9 14153.63 23810.50   36.78   16.75     8.11    0.77     8.8  48.5

It should be handling about the same load as prior to the reboot.


Cheers,

Chris



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux