On Mon, May 31, 2021 at 8:28 PM John Mellor <john.mellor@xxxxxxxxx> wrote:
>
> I'm getting a pretty bad history with BTRFS as the default filesystem for Fedora Workstation. It's messing up repeatedly and leaving me stuck. I should note that I have used ext2/3/4 for about 20 years, ZFS on Solaris for even longer, and ZFS on Ubuntu for 2 major releases now. I have 2 different machines that have had issues so far.
>
> On my Gateway Fedora 33 daily-driver machine running the 5.11 kernels, I had a single Patriot SSD using the default BTRFS partitioning scheme. I kept seeing BTRFS scrub reporting uncorrectable issues and assumed that it was a defective SSD. However, this SSD is now in a Mandriva machine and is solid. It's not the SSD. I did about 10 reinstalls after having the machine lock up at random times, and finally trashed the machine in frustration. I later discovered that it had developed a bad memory stick, which may have contributed to the initial problem. However, the lack of BTRFS robustness, no obvious mechanism to keep /home during a reinstall, and very poor BTRFS documentation have left me wary.
>
> On my current daily-driver machine, I have fully-updated Fedora 34 running the 5.12 kernels on 2 disks set up as a BTRFS RAID-1 pair. I expected that would allow for much more robustness than the single-disk setup on my F33 machine, giving me error protection similar to what I would have on ZFS. Unfortunately, that does not appear to be the case. I have run low-level diagnostics on everything in this machine, and it is working properly. Unusually, there aren't even any failed low-level disk blocks on either drive. So the hardware on this older enterprise-class Lenovo desktop is not faulty. I believe that due to faulty BIOS and security chip handling in the 5.12 kernel, I have had issues requiring me to occasionally hard power-cycle the machine to get it to actually power down.
>
> One would expect that with BTRFS doing RAID-1, recovery from lockups should never leave the filesystem damaged. That does not appear to be the case. Currently the disks have no low-level errors, but BTRFS scrub shows 10 unrecoverable errors. That's messed up. Both disks are enterprise-class Seagate Constellation 500GB SATA drives with slightly different model numbers and manufacturing dates, so I don't believe that there is any firmware issue with them.

Diagnosing any of these problems really requires kernel messages, and often the entire dmesg, because there's usually an underlying problem that shows up earlier in the log than the Btrfs errors.

> No matter what, I expect that the initial fsck or btrfs check should keep data integrity, but possibly backing out a few seconds in journal transactions.

btrfs check is --readonly by default and makes no changes at all to the file system. --repair is expected to fix inconsistencies, or fail and do nothing. It won't drop transactions.

The normal write order for btrfs is: data -> metadata -> superblock

Between the metadata and superblock writes, and after the superblock write, there's a FLUSH/FUA that tells the drive to enforce exactly that ordering. That means it doesn't really matter if data and metadata writes get reordered among themselves, as long as all data and metadata are flushed to stable media before the superblock is written. Because of COW, the on-disk superblock is therefore only ever pointing to valid trees. In case of a crash you might see some recent writes go missing, but the file system stays consistent.
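If you want to check whether a drive has ever failed one of those flushes, btrfs keeps lifetime per-device error counters, and the kernel log has the details. A minimal sketch, assuming the problem file system is mounted at / (adjust the mount point for your setup):

# lifetime per-device counters; flush_io_errs means failed FLUSH commands,
# corruption_errs means checksum mismatches found on read or scrub
sudo btrfs device stats /

# kernel messages for the current boot, so btrfs errors appear in context
sudo journalctl -k -b 0

Nonzero flush or generation counters tend to point below the file system, at the drive or something else in the storage stack.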
But if write ordering is not honored by the drive, it's a big problem for any file system. Btrfs is definitely more difficult to repair, because its metadata isn't in any fixed location, so no assumptions can be made about what "should" be in a particular location.

> I am aware of at least one kernel bug being highly relevant as the initial trigger - bugzilla 195809.

Maybe this one? https://bugzilla.redhat.com/show_bug.cgi?id=1965809

The file system should be OK if using writethrough mode. If using writeback mode, all bets are off. Whether it's a crash, a power failure, or a flash (cache) device failure, severe data loss is a strong possibility, which is why safeguards have to be taken when using writeback mode.

> I believe that there are serious bugs in the hardware optimization in Firefox (one bug filed) and in Gnome and more relevant bugs in the kernel, but whatever the triggering issue, the filesystem should never fail.

Well, the file system sits on top of a storage stack of a lot of other software and hardware. It's hard to know what's going on without details of that storage stack, as well as dmesg and the output from 'btrfs check --readonly'.

> How do I recover? The machine is currently bootable and seems to run ok, but locks up once in a while on powerdown and on exiting firefox. I cannot describe it as stable with this BTRFS issue. A scrub currently says that / (and therefore also /home) has 10 unrecoverable errors. I can find no Fedora or Suse documentation on how to recover from what should be impossible situations like this.

It's not supposed to happen. But once it does happen, it's very case specific, and a bit complicated to figure out what probably went wrong and what the next steps are. Btrfs is good at avoiding trouble in the first place due to COW, i.e. nothing is overwritten in place, so interruptions during writes, whether from a crash or a power failure, aren't a problem. But write ordering violations can cause more problems for Btrfs. There are some safeguards built in to work around that, but they are limited.

Run:

fpaste --btrfsinfo

and post the resulting URL; it'll expire in 24 hours. If the problem file system is sysroot, that will help us understand the storage stack, mount options, and recent btrfs messages. If the problem file system is not sysroot, you'll want to add --printonly and run the commands shown for each section against the proper mount point or device.

> A reinstall will not preserve /home, leading to unacceptable data loss.

Hopefully there is a backup no matter what the file system is; and if not, creating one is the top priority in any disaster situation. There is a way to reinstall and preserve /home in Anaconda, but before doing that we really need to understand what's broken. If the file system is broken and can't be fixed, then it's mkfs time, and for that you need backups of at least the important user data.

> I did an offline btrfs check on my F33 machine that left the machine unbootable, so it's probably not an option either. I'm stuck at this point.

btrfs check --readonly is safe; it doesn't touch anything on the drive at all. --repair should at worst fail safe, but it still has rather scary warnings in the man page, so it's best to consider --repair a last resort. There are other options to try before --repair, but we need to see the errors to know what to recommend.
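Either way, before attempting any kind of repair, get the important data copied off. A rough sketch, assuming the file system will still mount read-only; /dev/sdXn and the backup destination are placeholders, so substitute your own device and path:

# read-only mount; rescue=usebackuproot (kernel 5.9+) falls back to an
# older tree root if the current one is damaged
sudo mount -o ro,rescue=usebackuproot /dev/sdXn /mnt

# copy user data to a different drive, preserving hard links, ACLs, xattrs
sudo rsync -aHAX /mnt/home/ /path/to/backup/home/

sudo umount /mnt

A plain ro mount without the rescue option is fine too if it works; the point is to write nothing to the sick file system until a copy of the data exists somewhere else.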
> Should I just stop using the default BTRFS filesystem and go back to ext4?

On the one hand, e2fsck has a pretty good chance of fixing damaged file system metadata resulting from storage stack problems, including hardware issues. But it doesn't check data integrity at all, and data is a much larger portion of what's written to a drive, so it's a much larger target for hardware problems that result in corruption: dropped, torn, or misdirected writes, or even bit flips.

Btrfs is intentionally fussier about these kinds of problems. And yes, it'll often just stop and seek human attention about what to do. That's pretty onerous, but it's also what protects your data from being damaged even worse.

But anyway, there's not much to go on here yet. We need to see dmesg for these problems. I personally prefer the entire dmesg, because isolated errors don't show what was going on immediately prior to the Btrfs error, which is almost always a related factor. Mount options can matter too.

In the raid1 case, same thing: we need to see dmesg, because that's where btrfs spits out all of its complaints. And it is quite verbose.

--
Chris Murphy