How do I recover from BTRFS issues?

I have had a pretty bad history with BTRFS as the default filesystem for Fedora Workstation.  It keeps messing up and leaving me stuck.  I should note that I have used ext2/3/4 for about 20 years, ZFS on Solaris for even longer, and ZFS on Ubuntu for 2 major releases now.  So far, 2 different machines have had issues.

On my Gateway Fedora 33 daily-driver machine running the 5.11 kernels, I had a single Patriot SSD using the default BTRFS partitioning scheme.  BTRFS scrub kept reporting uncorrectable errors, and I assumed that the SSD was defective.  However, this SSD is now in a Mandriva machine and is solid.  It's not the SSD.  I did about 10 reinstalls after having the machine lock up at random times, and finally trashed the machine in frustration.  I later discovered that it had developed a bad memory stick, which may have contributed to the initial problem.  However, the lack of BTRFS robustness, the absence of any obvious mechanism to keep /home during a reinstall, and very poor BTRFS documentation have left me wary.

On my current daily-driver machine, I have fully-updated Fedora 34 running the 5.12 kernels on 2 disks set up as a BTRFS RAID-1 pair.  I expected that this would be much more robust than the single-disk setup on my F33 machine, giving me error protection similar to what I would have on ZFS.  Unfortunately, that does not appear to be the case.  I have run low-level diagnostics on everything in this machine, and it is working properly.  Unusually, there aren't even any failed low-level disk blocks on either drive.  So the hardware on this older enterprise-class Lenovo desktop is not faulty.  I believe that, due to faulty BIOS and security chip handling in the 5.12 kernel, I have had issues that occasionally require a hard power cycle to get the machine to actually power down.

One would expect that with BTRFS doing RAID-1, recovery from lockups should never leave the filesystem damaged.  That does not appear to be the case.  Currently the disks have no low-level errors, but BTRFS scrub shows 10 unrecoverable errors.  That's messed up.  Both disks are enterprise-class Seagate Constellation 500GB SATA drives with slightly different model numbers and manufacturing dates, so I don't believe that there is any firmware issue with them.  No matter what, I expect that the initial fsck or btrfs check should preserve data integrity, possibly at the cost of backing out a few seconds of journal transactions.

I am aware of at least one kernel bug that is highly relevant as the initial trigger: bugzilla 195809.  I believe that there are serious bugs in the hardware optimization in Firefox (one bug filed) and in GNOME, and further relevant bugs in the kernel, but whatever the triggering issue, the filesystem should never fail.

How do I recover?  The machine is currently bootable and seems to run OK, but it locks up once in a while on powerdown and on exiting Firefox.  I cannot describe it as stable with this BTRFS issue.  A scrub currently says that / (and therefore also /home) has 10 unrecoverable errors.  I can find no Fedora or SUSE documentation on how to recover from what should be impossible situations like this.  A reinstall will not preserve /home, leading to unacceptable data loss.  I did an offline btrfs check on my F33 machine that left the machine unbootable, so it's probably not an option either.  I'm stuck at this point.  Should I just stop using the default BTRFS filesystem and go back to ext4?

Help appreciated!

--

John Mellor
