Re: Fedora 34 Change: Enable btrfs transparent zstd compression by default (System-Wide Change proposal)

On Tue, Feb 16, 2021 at 4:10 PM Jeremy Linton <jeremy.linton@xxxxxxx> wrote:

> On 2/14/21 2:20 PM, Chris Murphy wrote:

> > This isn't sufficiently qualified. It does work to reduce space
> > consumption and write amplification. It's just that there's a tradeoff
> > that you dislike, which is IO reduction. And it's completely
> > reasonable to have a subjective position on this tradeoff. But no
> > matter what there is a consequence to the choice.
>
> IO reduction in some cases (see below), in exchange for additional read
> latency and an increase in CPU utilization.
>
> For a desktop workload the former (read latency) is likely the larger
> problem. But as we all know, sluggishness is a hard thing to measure on
> a desktop. QD1 pointer chasing on disk is a good approximation though,
> and sometimes boot times are too.

What is your counter proposal?


> > A larger file might have a mix of compressed and non-compressed
> > extents, based on this "is it worth it" estimate. This is the
> > difference between the compress and compress-force options, where
> > force drops this estimator and depends on the compression algorithm to
> > do that work. I sometimes call that estimator the "early bailout"
> > check.
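
The compress vs compress-force distinction maps directly onto the mount
options. Illustrative fstab entries (device UUID and subvolume name are
placeholders, and zstd:1 is the level proposed in this Change):

```
# compress=zstd:1 keeps the "early bailout" estimator:
UUID=<fs-uuid>  /  btrfs  subvol=root,compress=zstd:1        0 0

# compress-force=zstd:1 drops the estimator and hands every extent
# to the compressor, keeping the result only when it's smaller:
UUID=<fs-uuid>  /  btrfs  subvol=root,compress-force=zstd:1  0 0
```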
>
> Compression estimation is its own ugly ball of wax. But ignoring that
> for the moment, consider what happens if you have a bunch of 2G database
> files with a reasonable compression ratio. Let's assume for a moment the
> database attempts to update records in the middle of the files. What
> happens when the compression ratio gets slightly worse? (It's likely you
> already have nodatacow.)

What percentage of Fedora desktop users do you estimate have a bunch
of 2G database files?

I don't assume datacow or nodatacow for databases, because some
databases and their workloads do OK on COW filesystems and others
don't.

Also, nodatacow disables compression. i.e. files having file attribute
'C' (nodatacow) with mount option compress(-force) remain
uncompressed.
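
For example (illustrative commands; these assume a Btrfs filesystem
mounted with compress=zstd:1, and the directory name is made up):

```
mkdir dbdir
chattr +C dbdir     # new files created in dbdir inherit the 'C' attribute
touch dbdir/data.db
lsattr dbdir/data.db   # 'C' (No_COW) is set, so data.db is never compressed
```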

> Although this becomes a case of
> seeing if the "compression estimation" logic is smart enough to detect
> that it's causing poor IO patterns (while still having a reasonably good
> compression ratio).

The "early bail" heuristic just tries to estimate whether the effort of
compression is worth it. If it is, the data extent is submitted for
compression; if not, it's written uncompressed. The max compressed
extent size is 128KiB. There's no IO pattern detection. Once the
compression has happened, the write allocator works the same as
without compression.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/compression.c?h=v5.11#n1314
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/compression.c?h=v5.11#n1609
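
The flavor of that heuristic can be sketched outside the kernel. This is
not btrfs's actual code, just an illustrative approximation using gzip:
compress a small sample of the file and bail out of compressing entirely
if the sample doesn't shrink enough.

```shell
#!/bin/sh
# Illustrative sketch only -- NOT btrfs's real heuristic. It mimics the
# "early bailout" idea: compress a 4 KiB sample and skip compression
# entirely if the sample doesn't shrink meaningfully.
sample_compresses() {
    f="$1"
    orig=$(( $(head -c 4096 "$f" | wc -c) ))
    comp=$(( $(head -c 4096 "$f" | gzip -c | wc -c) ))
    # bail out unless the sample shrank by at least ~12%
    [ "$comp" -le $(( orig * 7 / 8 )) ]
}

yes "highly compressible text" | head -c 4096 > /tmp/text.dat
head -c 4096 /dev/urandom > /tmp/rand.dat

sample_compresses /tmp/text.dat && echo "text.dat: compress"
sample_compresses /tmp/rand.dat || echo "rand.dat: bail out"
```

The real check is per-extent and far cheaper than this, but the shape is
the same: a quick estimate, then commit or bail.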



> In a past life, I spent a not-insignificant part of a decade
> engineering compressed ram+storage systems (similar to what has been
> getting merged to mainline over the past few years). It's really hard to
> make one that is performant across a wide range of workloads. What you
> get are areas where it can help, but if you average those cases with the
> ones where it hurts, the overwhelming conclusion is you shouldn't be
> compressing unless you want the capacity. The worst part is that most
> synthetic file IO benchmarks tend to land on the "it helps" side of the
> equation, and the real applications on the other.

This is why I tend to pooh-pooh benchmarks. They're useful for the
narrow purpose they're intended to measure. Synthetic benchmarks are
good at exposing problems, but won't tell you their significance, so
what they expose is the need for better testing. A database benchmark
will do a good job showing performance issues with workloads that act
like the database the benchmark is mimicking. Not all databases
have the same behavior.


> IMHO if fedora wanted to take a hit on the IO perf side, a much better
> place to focus would be flipping encryption on. The perf profile is
> flatter (aes-ni & the arm crypto extensions are common) with fewer evil
> edge cases. Or a more controlled method might to be picking a couple
> fairly atomic directories and enabling compression there (say /usr).

Workstation WG has been tracking these:
https://pagure.io/fedora-workstation/issue/136
https://pagure.io/fedora-workstation/issue/82

A significant impediment to enabling the "Encrypt my data" checkbox by
default in automatic partitioning is the UI/UX. The current evaluation
centers on using systemd-homed to encrypt user data by default, and
optionally enabling system encryption with the key sealed in the TPM
or protected by something like a YubiKey. There's still some work to
do to get this integrated.

-- 
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure



