On Fri, Aug 4, 2017 at 9:12 AM, Przemek Klosowski <przemek.klosowski@xxxxxxxx> wrote:
> The release notes for RHEL 7.4 announce that RedHat gave up on btrfs:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.4_Release_Notes/chap-Red_Hat_Enterprise_Linux-7.4_Release_Notes-Deprecated_Functionality.html

I see it as an acknowledgment that Btrfs is stable enough that it makes no sense to keep calling it a technology preview, while also never explicitly supporting it for paying customers. Red Hat doesn't have the developers to support it, so it quite literally has no choice but to deprecate it. In contrast, SUSE has had reliable atomic updates and rollbacks, based on Btrfs, for two years now, in both their enterprise and community distros. But they also have a bunch of upstream developers to support it.

This is a gem. Consider this in-production, at-scale example when deciding whether Btrfs is stable:
https://www.spinics.net/lists/linux-btrfs/msg67308.html

Facebook (and others) are making substantial and growing use of Btrfs in container deployments:
https://www.spinics.net/lists/linux-btrfs/msg67885.html

A good read on why the Red Hat decision makes sense for enterprise workloads, and on where the current Btrfs problems are located:
https://www.spinics.net/lists/linux-btrfs/msg67940.html

In the past 18 months there were 100 Btrfs contributors, versus 71 for ext4 and 63 for XFS, and Btrfs sees thousands of lines changed per kernel release cycle. It's under very active development, so Fedora people shouldn't confuse a business decision about support contracts with the question of what technologies Fedora should use. I lay out here a basis for Fedora using Btrfs where it can earn its keep:
https://pagure.io/atomic-wg/issue/306

On the bug front: I monitor all the fs lists and linux-raid@, and strictly speaking none of these file systems is stable, in the sense that none is a static, unchanging target. All of them keep adding features, the features come with bugs and cause regressions in edge cases, and then there are fix cycles. I've been hit by file system bugs on multiple platforms over two decades, and right now I trust Btrfs with my data more than anything except maybe ZFS, and that's only because I had to flip a coin at some point, and by now I know where the bodies are buried with Btrfs. Users have been hit by bugs with ZFS on Linux too, and when it goes badly there's no fsck at all, so you're just hosed. So yeah: bugs are annoying, file systems are hard, backups are good.

ENOSPC is largely solved on Btrfs: kernel 4.1 added automatic deallocation of empty block groups, and 4.8 added the ticketed ENOSPC infrastructure. There's still a super annoying, significant minority of users (maybe bigger than just an edge case) who get sucked into micromanaging their file systems to avoid the remaining failure cases. There's an active discussion on linux-btrfs@ about solving that with a user space policy rather than continuing to wait for a smarter block group deallocation/reuse policy. Anyway, this is definitely a lot better than it was, and it will keep getting better. (If "ticketed" sounds opaque, there's a toy sketch of the idea near the end of this mail.)

The raid56 parity scrub bug is annoying, but the user was always in a net better position with Btrfs, despite the bug, than in the same situation with md, lvm, or hardware RAID. The precondition is a data strip that's corrupt (not by Btrfs, but by an ordinary bad sector or some other cause in the stack). From there, Btrfs detects the corruption during a scrub, reconstructs good data from parity, passes the good data to the application, and overwrites the bad strip on disk to fix the corruption; but then, sometimes, a race causes a wrong recomputation of parity, which is then written to disk. So bad parity replaces good parity. In that same scenario, md, lvm, and hardware RAID propagate the corrupt data to the application, and if the scrub does a parity overwrite rather than just a read-only check for mismatches, bad parity gets written to disk there too.

Back to Btrfs: a second scrub fixes the problem, and during normal reads the bad parity is a non-factor. If the stripe that the bad parity strip belongs to later goes degraded (bad sector, failed drive) and needs reconstruction, then yes, you get a bad reconstruction because of the bad parity, but Btrfs catches it via data checksums and the application gets EIO, not corrupt data.
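That last point is easy to see in miniature. Here's a toy Python sketch, purely illustrative and nothing like the actual kernel code (the strip contents, crc32 as the checksum, and every name in it are my own inventions), of why keeping checksums on the data turns a bad-parity reconstruction into EIO instead of silent corruption:

    import zlib

    def parity(strips):
        """XOR equal-length strips together, RAID5-style."""
        out = bytearray(len(strips[0]))
        for s in strips:
            for i, b in enumerate(s):
                out[i] ^= b
        return bytes(out)

    # Three data strips plus one parity strip; a checksum is kept per data strip.
    data = [b"strip-A!", b"strip-B!", b"strip-C!"]
    csums = [zlib.crc32(s) for s in data]
    good_parity = parity(data)

    # The race recomputes parity from a stale strip and writes it to disk;
    # the data strips and their checksums are untouched.
    bad_parity = parity([data[0], data[1], b"STALE-C!"])

    def degraded_read(lost, strips, par):
        """Rebuild the lost strip by XORing parity with the survivors."""
        survivors = [s for i, s in enumerate(strips) if i != lost]
        rebuilt = parity(survivors + [par])
        if zlib.crc32(rebuilt) != csums[lost]:
            raise IOError("EIO: checksum mismatch on reconstructed data")
        return rebuilt

    print(degraded_read(2, data, good_parity))  # b'strip-C!'
    try:
        degraded_read(2, data, bad_parity)      # rebuilds b'STALE-C!'...
    except IOError as e:
        print(e)                                # ...and the checksum catches it

The md/lvm version of degraded_read has no csums to consult, which is exactly why the same bad parity there becomes corrupt data silently handed to the application.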
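And the promised ENOSPC bit: as I understand it, "ticketed" means a reservation that can't be satisfied immediately is queued as a ticket and serviced in order as flushing frees space, instead of everyone retrying and racing for whatever gets freed. A toy sketch of the idea, again my own illustration (all names invented) and not the kernel's implementation:

    from collections import deque

    class SpaceInfo:
        """Toy model of ticketed ENOSPC reservations."""
        def __init__(self, free):
            self.free = free
            self.tickets = deque()

        def reserve(self, need):
            # Fast path: nobody waiting and space is available.
            if not self.tickets and need <= self.free:
                self.free -= need
                return True
            # Slow path: queue a ticket and wait for flushing.
            self.tickets.append(need)
            return False

        def flushed(self, reclaimed):
            # Flushing (e.g. freeing an empty block group) returned space;
            # wake the waiting tickets in FIFO order so nobody is starved.
            self.free += reclaimed
            while self.tickets and self.tickets[0] <= self.free:
                self.free -= self.tickets.popleft()

    si = SpaceInfo(free=100)
    print(si.reserve(80))  # True: fast path
    print(si.reserve(50))  # False: queued as a ticket
    si.flushed(40)         # 20 + 40 is enough to service the ticket
    print(si.free)         # 10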
OK this is probably long enough now.

--
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx