On 7/6/20 12:07 AM, Chris Murphy wrote:
> On Fri, Jul 3, 2020 at 8:40 PM Eric Sandeen <sandeen@xxxxxxxxxx> wrote:
>>
>> On 7/3/20 1:41 PM, Chris Murphy wrote:
>>> SSDs can fail in weird ways. Some spew garbage as they're failing,
>>> some go read-only. I've seen both. I don't have stats on how common
>>> it is for an SSD to go read-only as it fails, but once it happens
>>> you cannot fsck it. It won't accept writes. If it won't mount, your
>>> only chance to recover data is some kind of offline scrape tool.
>>> And Btrfs does have a very, very good scrape tool in terms of its
>>> success rate - the UX is scary, but that can and will improve.
>>
>> Ok, you and Josef have both recommended the btrfs restore ("scrape")
>> tool as the next recovery step after fsck fails, so I figured we
>> should check that out to see whether it alleviates the concerns
>> about recoverability of user data in the face of corruption.
>>
>> I also realized that mkfs of an image isn't representative of an SSD
>> system typical of Fedora laptops, so I added "-m single" to mkfs,
>> because this will be the mkfs.btrfs default on SSDs (right?). Based
>> on Josef's description of fsck's algorithm of throwing away any
>> block with a bad CRC, this seemed worth testing.
>>
>> I also turned the fuzzing /down/ to hitting 2048 bytes out of the 1G
>> image, or a bit less than 1% of the filesystem blocks, at random.
>> This is 1/4 the fuzzing rate of the original test.
>>
>> So: -m single, fuzz 2048 bytes of the 1G image, run btrfsck
>> --repair, mount, mount w/ recovery, and then restore ("scrape") if
>> all that fails, and see what we get.
>
> What's the probability of this kind of corruption occurring in the
> real world? If the probability is so low it can't practically be
> computed, how do we assess the risk? And if we can't assess risk,
> what's the basis of concern?

From 20 years of filesystem development experience, I know that people
run filesystem repair tools. It's just a fact.
For a wide variety of reasons - bugs, hardware errors, admin errors,
you name it - filesystems experience corruption and inconsistencies.
At that point the administrator needs a path forward. "People won't
need to repair btrfs" is, IMHO, the position that needs to be
supported, not "filesystem repair tools should be robust."

>> I ran 50 loops, and got:
>>
>> 46 btrfsck failures
>> 20 mount failures
>>
>> So it ran btrfs restore 20 times; of those, 11 runs lost all or
>> substantially all of the files, and 17 runs lost at least 1/3 of
>> the files.
>
> Josef states the reliability of ext4, xfs, and Btrfs is in the same
> ballpark. He also reports one case in 10 years in which he failed to
> recover anything. How do you square that with 11 complete failures,
> trivially produced? Is there even a reason to suspect there's
> residual risk?

Extrapolating from Facebook's use cases to the Fedora desktop should
be approached with caution, IMHO. I've provided evidence that if/when
damage happens, for whatever reason, btrfs is unable to recover in
place far more often than other filesystems.

> When metadata is single profile, Btrfs is basically an early warning
> system.
>
> The available research on uncorrectable errors - errors that drive
> ECC does not catch - suggests that users are decently likely to
> experience at least one block of corruption in the life of the
> drive, and that it tends to get worse up until drive failure. But
> there is much less chance to detect this if the filesystem isn't
> also checksumming the vastly larger payload on a drive: the data.

One of the problems in this whole discussion is the assumption that
filesystem inconsistencies only arise from disk bitflips etc.; that's
just not the case.

Look, I'm just providing evidence of what I found when re-evaluating
the btrfs administration/repair tools. I've found them to be quite
weak.
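For concreteness, the byte-fuzzing step from my test loop can be
sketched roughly as follows. This is a minimal illustrative
reconstruction, not the exact script I ran; the `fuzz_image` name and
its parameters are made up for the sketch:

```python
import os
import random

def fuzz_image(path, nbytes=2048, seed=None):
    """Corrupt `nbytes` randomly chosen bytes of the file at `path`,
    in place, leaving the file size unchanged."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(nbytes):
            off = rng.randrange(size)
            f.seek(off)
            old = f.read(1)[0]
            f.seek(off)
            # XOR with a random nonzero mask so the byte always changes
            f.write(bytes([old ^ rng.randrange(1, 256)]))
```

After each fuzz pass, the loop then ran btrfsck --repair on the image,
attempted a mount (and a mount w/ recovery), and fell back to btrfs
restore if all of that failed.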
From what I've gathered from these responses, btrfs is unique in that
it is /expected/ that if anything goes wrong, the administrator should
be prepared to scrape out the remaining data, re-mkfs, and start over.
If that's acceptable for the Fedora desktop, that's fine, but I
consider it a risk that should not be ignored when evaluating this
proposal.

-Eric
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx