Re: Fedora 33 System-Wide Change proposal: Make btrfs the default file system for desktop variants

Josef Bacik <josef@xxxxxxxxxxxxxx> · Fri, 26 Jun 2020 15:22:07 -0400

On 6/26/20 2:58 PM, James Szinger wrote:
> On Fri, 26 Jun 2020 12:30:02 -0500
> Chris Adams <linux@xxxxxxxxxxx> wrote:
>> So... I freely admit I have not looked closely at btrfs in some time,
>> so I could be out of date (and my apologies if so).  One issue that I
>> have seen mentioned within the last week is still the
>> problem of running out of space when it still looks like there's
>> space free.  I didn't read the responses, so not sure of the
>> resolution, but I remember that being a "thing" with btrfs.  Is that
>> still the case?  What are the causes, and if so, how can we keep from
>> getting a lot of the same question on mailing lists/forums/etc.?
>
> Yes, it happened to me last week.  The workstation has been upgraded
> since F25 and is now at F31.  A yum update last week ran a restorecon
> -r / which filled up the filesystem and RAM and swap.  The 460 GB
> filesystem had about 140 GB of real data, 100 GB of data bloat from
> underfull blocks, and the rest (200 GB) was metadata.  I had to boot
> from a live USB and run btrfs balance to free up the bloat.  I expect
> to reformat it to ext4 when the quarantine is over.
>
> This is my last BTRFS filesystem.  One was on a laptop hard disk that
> was painfully slow, especially when compared with its ext4 twin
> sitting next to it.  It was reformatted to ext4.  I also had a BTRFS
> RAID 0 hard disk array.  It was also slow and also ended up needing
> rescue.  I converted it over to xfs on MD raid and it's been faster
> and perfectly reliable ever since.
>
> While I like subvolumes and snapshots, I find the maintenance,
> reliability, and performance overhead to be not worth it.
>
> Not recommended.

Generally speaking, btrfs performance has been the same if not better for our
workloads.  That's across millions of boxes with thousands of different
workloads and performance requirements.

That being said, I can make btrfs look really stupid on some workloads.  There
are going to be cases where Btrfs isn't awesome.  We still use xfs for all our
storage-related tiers (think databases).  Performance is always going to be
workload-dependent, and Btrfs has built-in overhead out of the gate because of
checksumming and the fact that we generate far more metadata.

As for your ENOSPC issue, I've made improvements in that area.  I see this in
production as well, and I have monitoring in place to deal with the machine
before it gets to this point.  That being said, if you run the box out of
metadata space, things get tricky to fix.  I've been working my way down the
list of issues in this area for years, and the last round of patches I sent
addressed these corner cases.
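
To make the monitoring idea concrete, here is a minimal sketch of one way to watch metadata usage.  It is not the production monitoring mentioned above.  It assumes the raw-bytes output of `btrfs filesystem df -b <path>`, and the function names and the 0.9 threshold are hypothetical choices for illustration.

```python
import re
import subprocess

# Matches raw-byte lines such as:
#   Metadata, DUP: total=1073741824, used=966367641
_LINE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): total=(?P<total>\d+), used=(?P<used>\d+)$"
)


def parse_df(text):
    """Map each block group type to (total_bytes, used_bytes).

    Lines of the same type with different profiles (seen during a
    profile conversion) are summed together.
    """
    usage = {}
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if not m:
            continue
        kind = m.group("kind")
        prev_total, prev_used = usage.get(kind, (0, 0))
        usage[kind] = (
            prev_total + int(m.group("total")),
            prev_used + int(m.group("used")),
        )
    return usage


def metadata_fraction(usage):
    """Fraction of allocated metadata chunks that is in use."""
    total, used = usage["Metadata"]
    return used / total if total else 0.0


def metadata_nearly_full(path, threshold=0.9):
    """Hypothetical check: True when allocated metadata is mostly used."""
    out = subprocess.run(
        ["btrfs", "filesystem", "df", "-b", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return metadata_fraction(parse_df(out)) >= threshold
```

This only looks at metadata chunks that are already allocated.  A high fraction is harmless while the device still has unallocated space for new metadata chunks, so a real check would also need to look at unallocated space.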

I described this case to the working group last week, because it hit us in 
production this winter.  Somebody screwed up and suddenly pushed 2 extra copies 
of the whole website to everybody's VM.  The website is mostly metadata, because 
of the inline extents, so it exhausted everybody's metadata space.  Tens of 
thousands of machines were affected.  I had to hand-boot and run balance on
~20 of those machines to get them back.  The rest could run balance from the
automation and recover cleanly.
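
For readers wondering what "run balance from the automation" might look like, here is a rough sketch.  It is not the automation described above.  It assumes a common approach: run `btrfs balance start` with increasing `-dusage=N` filters, so that the emptiest data chunks are compacted first and their space returns to the unallocated pool.  The step values and function names are hypothetical.

```python
import subprocess


def balance_commands(path, usage_steps=(0, 5, 10, 25, 50)):
    """Build one `btrfs balance start` command per usage step.

    -dusage=N restricts the balance to data chunks that are at most
    N percent full, so low values are cheap and high values do more work.
    """
    cmds = []
    for pct in usage_steps:
        if not 0 <= pct <= 100:
            raise ValueError(f"usage filter must be 0..100, got {pct}")
        cmds.append(["btrfs", "balance", "start", f"-dusage={pct}", path])
    return cmds


def run_balance(path, usage_steps=(0, 5, 10, 25, 50)):
    """Run each step in order, stopping at the first failure."""
    for cmd in balance_commands(path, usage_steps):
        # check=True raises CalledProcessError if btrfs exits non-zero,
        # for example when the balance itself hits ENOSPC.
        subprocess.run(cmd, check=True)
```

Starting with the lowest usage filter matters because a balance needs some free space to work with.  When a filesystem is completely out of metadata space, even the first step can fail, which is the kind of case that needs a hand rescue.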

It's a shit user experience, and it's a shitty corner case that still needs
work.  It's a top priority of mine.  Thanks,

Josef