On Thu, 2020-07-09 at 12:56 -0700, Eric Sandeen wrote:
> On 7/9/20 2:11 PM, Josef Bacik wrote:
> > > From what I've gathered from these responses, btrfs is unique in that it is
> > > /expected/ that if anything goes wrong, the administrator should be prepared
> > > to scrape out remaining data, re-mkfs, and start over. If that's acceptable
> > > for the Fedora desktop, that's fine, but I consider it a risk that should not
> > > be ignored when evaluating this proposal.
> > >
> >
> > Agreed, it's the very first thing I said when I was asked what are the
> > downsides. There's clearly more work to be done in the recovery arena.
> > How often do disks fail for Fedora? Do we have that data? Is this a real
> > risk? Nobody can say because Fedora doesn't have data.
>
> But again, let me reiterate that disk failures are far from the only
> reason that admins need capable filesystem repair tools, in general.
>
> We see users running fsck all the time, for various reasons. I can't
> back it up, but my hunch is that bugs and misconfigurations (i.e. write
> cache) are more often the root cause for filesystem inconsistencies.
>
> IMHO, focusing on physical disk failure rates is focusing too narrowly,
> but I suppose I'm just joining the chorus of hunches and anecdotes now.

Anecdata, but I have used raid-1 on all my disks since a catastrophic
failure 20 years ago, and that has shielded me from every disk failure
since then. I may have had silent corruption over the years, but I never
lost any really important data that way; a picture or two may have been
lost, but that has been inconsequential for me.

However, I have had bad kernels, power outages, loss of battery power
(laptops left suspended too long) and other random reasons to force-reboot
a system. Those have been the primary cause of file system checks
throughout my Fedora usage.
And luckily, so far I have never lost a filesystem or data that way; fsck
always ended up solving most of the issues, and whenever I lost files they
turned out to be temporary files I did not care about.

I do not think those failures are common in Facebook's fleets, so I am
quite skeptical that FB data and failure modes are representative of
Fedora usage as a desktop/laptop OS, and therefore of how btrfs behaves
in those cases.

Note, I am not saying btrfs should be avoided or anything, just that we
need more data about those failure modes and how they affect btrfs before
changing the default.

My 2c,
Simo.

--
Simo Sorce
RHEL Crypto Team
Red Hat, Inc