On Fri, Jul 10, 2020 at 1:45 PM Tomasz Torcz <tomek@xxxxxxxxxxxxxx> wrote:
>
> On Fri, Jul 10, 2020 at 07:14:09PM +0200, Vitaly Zaitsev via devel wrote:
> > On 26.06.2020 16:42, Ben Cotton wrote:
> > > ** transparent compression: significantly reduces write amplification,
> > > improves lifespan of storage hardware
> >
> > What can you say about this? https://arxiv.org/pdf/1707.08514.pdf
>
> Also funny note: when compression was introduced in ZFS, circa 2007,
> it was mainly promoted as a _performance_ win, not a space saving measure.
> This was still 5 years before NVMe, so all we had was SATA, SAS and FC
> drives, yet the CPUs were already multi-core and multi-gigahertz.
> Transferring uncompressed data was _slower_ than compressing/decompressing
> and having to transfer less data. For a bit higher CPU usage we got
> noticeable bandwidth wins.
> The tradeoff is no longer there, as single drives reach 7GiB/s
> transfer speed.

It would need to be benchmarked. The CPU in these cases has also
improved dramatically, perhaps more so than storage performance, in
which case compression may still not be the limiting factor.

lzbench is useful for this. Compiling it on Fedora is straightforward,
but needs this hint or a better understanding of the problem:
https://github.com/inikep/lzbench/issues/69
Note that you should use -b 128K, since the Btrfs compression block
size is 128KiB.

There are a variety of corpora available; I use silesia.tar:
http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
But you can also just tar up /usr or /home.

This benchmark introduces some error. Btrfs compression is per file,
and files smaller than 128K tend to have a lower compression ratio, so
in that regard lzbench overestimates the compression; on the other
hand, Btrfs can use inline extents, and in that regard the compression
(or more correctly, the actual cost of the write) is underestimated.
Another source of error is single-threaded vs. multi-threaded
compression, and single-queue vs. multi-queue block devices. Yet
another is that lzbench has essentially no latency, since it's just one
file being tested, whereas in real-world usage there are many files
being read and written, each with its own latency, during which
compression can happen at essentially no additional latency cost. But
not always at no cost. So it's actually really complicated, which is
probably why no one really wants to do this kind of detailed
benchmarking analysis.
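For anyone who wants to eyeball the per-file effect without setting up
lzbench, here's a rough sketch of the idea in Python (my own, nothing
to do with btrfs-progs or lzbench; it uses stdlib zlib as a stand-in
for the kernel compressors and reads whole files into memory, so point
it at something small first). It compresses the same data twice in
128KiB blocks: once per file, which is closer to what Btrfs does, and
once as a single concatenated stream, which is roughly what a tarball
benchmark sees. The gap between the two ratios is roughly the
overestimate described above.

import os
import sys
import zlib

BLOCK = 128 * 1024   # Btrfs compresses data in 128 KiB chunks
LEVEL = 3            # zlib level for this estimate; adjust to taste

def compress_blocks(data):
    """Sum of compressed sizes when data is cut into 128 KiB blocks."""
    return sum(len(zlib.compress(data[o:o + BLOCK], LEVEL))
               for o in range(0, len(data), BLOCK))

def main(root):
    raw = per_file = stream_total = 0
    stream = bytearray()   # files concatenated, ignoring boundaries (tar-like)
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    data = f.read()
            except OSError:
                continue
            raw += len(data)
            per_file += compress_blocks(data)   # "Btrfs-like": each file on its own
            stream += data                      # "lzbench-like": one big stream
            while len(stream) >= BLOCK:
                stream_total += len(zlib.compress(stream[:BLOCK], LEVEL))
                del stream[:BLOCK]
    if stream:
        stream_total += len(zlib.compress(stream, LEVEL))
    if raw == 0:
        print("no readable files found under", root)
        return
    print(f"raw bytes:               {raw}")
    print(f"per-file 128K blocks:    {per_file}  (ratio {raw / per_file:.2f})")
    print(f"one-stream 128K blocks:  {stream_total}  (ratio {raw / stream_total:.2f})")

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "/usr/share/doc")

The absolute numbers won't match Btrfs (different compressors, no
inline extents, no metadata overhead), but it's enough to see how much
small files drag the ratio down.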
We're probably better off making a new benchmark based on ordinary
things: compiling the kernel, launching applications, doing updates,
updating git repositories and searching git logs, etc. But even that is
just a guess.

That reminds me: there's a git-based approach for aging a file system.
https://www.usenix.org/system/files/hotstorage19-paper-conway.pdf
https://github.com/saurabhkadekodi/geriatrix
I haven't messed around with it, but maybe someone wants to turn it
into a how-to. I'll do the testing if no one else wants to burn their
SSD with writes; I've got a Samsung 840 EVO in an old laptop that I'm
actively trying to kill off.

Something that can't be accounted for without blind studies involving
users is that users are hypersensitive to some latencies and not at all
sensitive to others. I haven't dug up any research on this, but I
imagine it has been studied. Apple made a bunch of UI changes early in
the Mac OS X development cycle, and while overall latencies were lower
as a result of having an (almost) preemptive multitasking OS instead of
the former cooperative multitasking OS, the GUI had so many "eye candy"
special effects that users got pissed at how slow the OS seemed.

--
Chris Murphy