Re: BTRFS dropped by RedHat

Chris Murphy <lists@xxxxxxxxxxxxxxxxx> · Mon, 7 Aug 2017 14:46:13 -0600

On Fri, Aug 4, 2017 at 9:12 AM, Przemek Klosowski
<przemek.klosowski@xxxxxxxx> wrote:
> The release notes for RHEL 7.4 announce that RedHat gave up on btrfs:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.4_Release_Notes/chap-Red_Hat_Enterprise_Linux-7.4_Release_Notes-Deprecated_Functionality.html

I see it as an acknowledgment that Btrfs is stable enough that it's
nonsensical to keep calling it a technology preview while also not
explicitly supporting it for paying customers. Red Hat doesn't have
the developers to support it, so it quite literally has no choice but
to deprecate it.

In contrast, SUSE has had reliable atomic updates and rollbacks for
two years, based on Btrfs, in both their enterprise and community
distros. But they also have a bunch of upstream developers to support
it.

This is a gem. Consider this in-production scale example when deciding
whether Btrfs is stable.
https://www.spinics.net/lists/linux-btrfs/msg67308.html

Facebook (and others) are making substantial and growing use of Btrfs
in container deployments.
https://www.spinics.net/lists/linux-btrfs/msg67885.html

Good read on the Red Hat decision making sense for enterprise
workloads, and where current Btrfs problems are located:
https://www.spinics.net/lists/linux-btrfs/msg67940.html

In the past 18 months, there were 100 Btrfs, 71 ext4, and 63 XFS
contributors. Thousands of lines change per kernel release cycle for
Btrfs. It's in very active development, so Fedora people shouldn't
confuse business decisions related to support contracts with what
technologies Fedora should use.

I mention here a basis for Fedora using Btrfs where it can earn its keep.
https://pagure.io/atomic-wg/issue/306

On the bug front, I monitor all the fs lists and linux-raid@, and
strictly speaking none are stable, in that none are static, unchanging
targets. All of them are adding features, which then have some bugs
and cause regressions in edge cases, and there are some fix cycles.
I've been hit with file system bugs on multiple platforms over two
decades, and right now I trust Btrfs for my data more than all except
maybe ZFS, and that's just because I had to flip a coin at some point,
and now I know where the bodies are buried with Btrfs. Users have
gotten hit with bugs with ZFS on Linux, and when it goes badly there's
no fsck at all so you're just hosed. So yeah, bugs are annoying, file
systems are hard, backups are good.

ENOSPC is largely solved on Btrfs: kernel 4.1 added automatic
deallocation of empty block groups, and 4.8 added ticketed ENOSPC
infrastructure. There is still a super annoying, significant minority
of users (maybe bigger than just an edge case) who get sucked into
micromanaging their file systems to avoid the remaining instances.
There's an active discussion on linux-btrfs@ about how to solve this
with a user space policy rather than continuing to wait for a smarter
block group deallocation/reuse policy. Anyway, this is definitely a
lot better, and it will continue to get better.

The raid56 parity scrub bug is annoying, but the user was always in a
net better position with Btrfs despite it than in the same situation
with md, lvm, or hardware RAID. The precondition is a data strip
that's corrupt (not by Btrfs, but just an ordinary bad sector or some
other cause in the stack). From there, Btrfs detects this corruption
during scrub, reconstructs good data from parity, passes good data to
the application, and writes good data back to disk to fix the
corruption. Then sometimes, due to a race, parity is recomputed
wrongly and that is written to disk, so bad parity replaces good
parity. But in that same scenario, md, lvm, and hardware RAID
propagate corrupt data to the application, and if it's a parity
overwrite rather than just a read-only check for mismatches, bad
parity is also written to disk.

Back to Btrfs: a 2nd scrub fixes the problem, and during normal reads
bad parity is a non-factor. If the stripe that the bad parity strip is
part of is degraded (bad sector, failed drive) and requires
reconstruction, yep, you get bad reconstruction due to bad parity, but
Btrfs catches this thanks to data checksums and we get EIO, not
propagation of corrupt data to the application.
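
To make the bad-parity case concrete, here's a toy sketch in Python.
It is not Btrfs code: the names (make_stripe, read_strip) are made up
for illustration, the stripe is just two data strips plus one XOR
parity strip, and zlib.crc32 stands in for the real data checksum.
It doesn't model the scrub race itself, only what happens afterward
when parity on disk is already wrong.

```python
import errno
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length strips byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(d0: bytes, d1: bytes) -> dict:
    """Toy stripe: two data strips, XOR parity, per-strip checksums."""
    return {
        "data": [d0, d1],
        "parity": xor_bytes(d0, d1),
        # crc32 is an assumption here, a stand-in for the fs checksum.
        "csums": [zlib.crc32(d0), zlib.crc32(d1)],
    }


def read_strip(stripe: dict, index: int, missing: bool = False,
               verify: bool = True) -> bytes:
    """Read one data strip.

    missing=True simulates a degraded stripe: the strip is rebuilt
    from the other data strip and parity. verify=False simulates a
    RAID layer with no data checksums.
    """
    if missing:
        other = stripe["data"][1 - index]
        block = xor_bytes(other, stripe["parity"])
    else:
        # Normal read: parity is never consulted.
        block = stripe["data"][index]
    if verify and zlib.crc32(block) != stripe["csums"][index]:
        raise OSError(errno.EIO, "checksum mismatch after read")
    return block


stripe = make_stripe(b"AAAA", b"BBBB")
stripe["parity"] = b"\x00\x00\x00\x00"  # simulate wrongly recomputed parity

# Normal read is unaffected by the bad parity.
normal = read_strip(stripe, 0)

# Degraded read without checksums silently hands back wrong data.
silent = read_strip(stripe, 0, missing=True, verify=False)

# Degraded read with checksums fails loudly instead.
try:
    read_strip(stripe, 0, missing=True)
    caught = None
except OSError as e:
    caught = e.errno
```

In this sketch the normal read still returns the good data, the
checksum-less degraded read returns garbage without complaint, and
the checksummed degraded read raises EIO. That last difference is
the whole point.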

OK this is probably long enough now.

-- 
Chris Murphy