Re: diagnosing XFS corruption after upgrading to Fedora 36

On Mon, Jul 18, 2022 at 12:10 AM Patrick Hemmer <fedora@xxxxxxxxxxxxxxx> wrote:
Ever since upgrading to Fedora 36, my root filesystem has been getting corrupted every few hours. I maintain block-level backups, and I have to restore every time this happens. xfs_repair can fix the filesystem, but the system is typically unusable afterwards, as there are often over 10k files in lost+found.

I have tried creating a brand new filesystem (mkfs.xfs), but it still gets corrupted.

I would file a bug, but the caveat is that I also have LVM underneath the filesystem. And so I don't know whether it's a problem with XFS, or LVM. I have other XFS filesystems also on LVM, and have seen corruption on them as well, but it's nowhere near as significant or frequent as on the root filesystem.

Sometimes I can detect the corruption before the kernel does, by doing a snapshot, and running `xfs_repair -n` on the snapshot. And sometimes the kernel will detect the corruption first, usually with a message like:

Jul 17 15:06:52 whistler kernel: XFS (dm-0): Metadata corruption detected at xfs_buf_ioend+0x14c/0x5d0 [xfs], xfs_inode block 0x46057c8 xfs_inode_buf_verify
Jul 17 15:06:52 whistler kernel: XFS (dm-0): Unmount and run xfs_repair
Jul 17 15:06:52 whistler kernel: XFS (dm-0): First 128 bytes of corrupted metadata buffer:
Jul 17 15:06:52 whistler kernel: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Jul 17 15:06:52 whistler kernel: XFS (dm-0): metadata I/O error in "xfs_imap_to_bp+0x40/0x50 [xfs]" at daddr 0x46057c8 len 32 error 117
Jul 17 15:06:52 whistler kernel: XFS (dm-0): Metadata I/O Error (0x1) detected at xfs_trans_read_buf_map+0x179/0x2d0 [xfs] (fs/xfs/xfs_trans_buf.c:296).  Shutting down filesystem.
Jul 17 15:06:52 whistler kernel: XFS (dm-0): Please unmount the filesystem and rectify the problem(s)

So how can I proceed on this? Is there any way to determine whether this is an LVM issue or an XFS issue?
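
A note on reading the log above, offered as a sketch rather than a diagnosis: XFS reports "daddr" and "len" in 512-byte basic blocks, relative to the start of the device the filesystem sits on (here dm-0), and "error 117" is EUCLEAN ("Structure needs cleaning"), the errno XFS uses to signal corruption. The helper name daddr_to_bytes below is made up for illustration; it only does the unit conversion, which gives the byte offset of the bad buffer within the logical volume.

```shell
#!/bin/sh
# Convert an XFS disk address (daddr, in 512-byte basic blocks) to a
# byte offset on the device holding the filesystem.
# daddr_to_bytes is a hypothetical helper name, not an XFS tool.
daddr_to_bytes() {
    echo $(( $1 * 512 ))
}

# The buffer from the log: daddr 0x46057c8, len 32 basic blocks.
daddr_to_bytes 0x46057c8   # byte offset of the buffer on dm-0
echo $(( 32 * 512 ))       # buffer length in bytes (16 KiB)
```

The offset is within the logical volume, not the physical disk; mapping it further down depends on the LVM layout.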

LVM and XFS on Linux have been very reliable, so you need to rule out hardware problems first.  If the drive
supports S.M.A.R.T., smartmontools (smartctl) can run the drive's internal self-tests.  Some vendors also
provide test software (often Windows-only).  Cables and connectors should be considered too: try swapping
cables and ports.  "Contact enhancer" sometimes solves connection problems (now that cars are full of
computers, you can buy it at auto supply stores).
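
A minimal sketch of the smartmontools checks mentioned above.  The device path in the usage line is a placeholder (an assumption, not taken from the original report); substitute the physical disk that backs the LVM volume group, which lsblk or pvs will show.  The run and smart_check names are hypothetical wrappers; only the smartctl invocations are real.  Self-test support on NVMe devices depends on the drive and the smartmontools version.

```shell
#!/bin/sh
# Basic S.M.A.R.T. checks with smartctl (from smartmontools).

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# smart_check DEVICE: health summary, attributes, short self-test, test log.
smart_check() {
    dev="$1"
    run smartctl -H "$dev"           # overall health self-assessment
    run smartctl -A "$dev"           # vendor attributes (reallocated sectors, CRC errors, ...)
    run smartctl -t short "$dev"     # start a short self-test; it runs on the drive itself
    run smartctl -l selftest "$dev"  # self-test log; read it after the test has finished
}

# Usage (as root; /dev/sda is a placeholder):
#   smart_check /dev/sda
# Preview the commands without touching the drive:
#   DRY_RUN=1 smart_check /dev/sda
```

A rising interface CRC error count in the attribute output usually points at cabling rather than the disk itself, which fits the advice to swap cables.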

An external drive-to-USB adapter is very useful to have.  For NVMe, a USB-C NVMe case provides a way to
test NVMe drives, and a cast-off 128 GB NVMe card can be used in the case as a fast alternative to USB
memory "keys".


 
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure


--
George N. White III
