Re: Fedora 33 System-Wide Change proposal: Make btrfs the default file system for desktop variants

Chris Murphy <lists@xxxxxxxxxxxxxxxxx> · Sat, 11 Jul 2020 09:15:53 -0600

On Fri, Jul 10, 2020 at 11:14 AM Vitaly Zaitsev via devel
<devel@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On 26.06.2020 16:42, Ben Cotton wrote:
> > ** transparent compression: significantly reduces write amplification,
> > improves lifespan of storage hardware
>
> What can you say about this? https://arxiv.org/pdf/1707.08514.pdf

The paper states its bias in the conclusion. It is a conjecture.
The authors use worst-case testing of file systems that are in real
use (and the file systems do in fact behave this way under that test)
to argue that a new file system needs to be developed. For the use
case they have in mind, all of the evaluated general purpose file
systems are disqualified. If you aren't looking to disqualify all
general purpose file systems for your use case, this is not the paper
for you.

Intentionally not explored are the various file system optimizations
that mitigate this problem, and real world general purpose workloads.
In the case of Btrfs, those optimizations include delayed allocation,
treelog, inline extents, and the default 16KiB leaf size.

The paper entirely discounts workloads where fsync() isn't used, and
it admits this: "We should note that write amplification is high in
our workloads because we do small writes followed by a fsync()." Many
small file writes on a general purpose file system incur quite a lot
less write amplification than this, and on Btrfs many of those writes
will be inline extents, i.e. they are stored inside the 16KiB leaf
along with their inode entry. In the case of many recurring writes,
the actual write pattern coalesces many file changes into the same
leaf that's going to be written anyway. Yes, there is a big hit for
that first write, but all the other writes are cheaper, maybe even
free, if they happen inside the commit window. It's also a good reason
not to fsync() the heck out of everything needlessly.
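
To illustrate the fsync() point, here is a minimal Python sketch. It is
not taken from the paper or from Btrfs; the function name write_records
and the record sizes are hypothetical. It contrasts an fsync() after
every small write with a single fsync() at the end, and only counts the
sync calls. The actual write amplification depends on the file system
and the device.

```python
import os


def write_records(path, records, fsync_each):
    """Write records to path and return the number of fsync() calls made.

    fsync_each=True forces every small write to stable storage on its
    own (the pattern the paper measures). fsync_each=False lets the
    writes accumulate and syncs once at the end.
    """
    syncs = 0
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)
            if fsync_each:
                f.flush()              # push Python's buffer to the kernel
                os.fsync(f.fileno())   # ask the kernel to persist it now
                syncs += 1
        if not fsync_each:
            f.flush()
            os.fsync(f.fileno())       # one sync covers all the writes
            syncs += 1
    return syncs
```

For 1000 records of 100 bytes each, the first mode issues 1000 syncs
and the second issues 1, while both files end up with identical
contents.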

Finally, they are only looking at metadata writes. This is a tiny
amount of writes compared to the data payload. Any compression of data
will produce an overwhelming reduction in net write amplification.
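
As a rough back-of-the-envelope illustration of that last point, the
sketch below uses a toy model that is purely an assumption: data
shrinks by a given compression ratio, metadata is written as-is, and
net write amplification is the bytes reaching storage divided by the
bytes the application wrote. The function name and all numbers are
hypothetical, not measurements.

```python
def net_write_amplification(logical_bytes, metadata_bytes, compression_ratio):
    """Bytes reaching storage divided by bytes the application wrote.

    Toy model (an assumption, not a measurement): the data payload
    shrinks by compression_ratio, metadata is written uncompressed.
    """
    if logical_bytes <= 0 or compression_ratio <= 0:
        raise ValueError("logical_bytes and compression_ratio must be positive")
    physical_bytes = logical_bytes / compression_ratio + metadata_bytes
    return physical_bytes / logical_bytes
```

With a hypothetical 1,000,000 bytes of data and 50,000 bytes of
metadata, a ratio of 1.0 (no compression) gives 1.05 and a ratio of
2.0 gives 0.55. In this model the metadata term barely matters once
the data payload dominates.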

If we look at another paper with a different bias, one that's already
been cited in devel@ discussions, "Evaluating File System Reliability
on Solid State Drives" by Jaffer et al., it says: "Most notably Btrfs
[46], a copy-on-write file system which is more suitable for SSDs
with no in-place writes, has garnered wide adoption. The design of
Btrfs is particularly interesting as it has fewer total writes than
ext4’s journaling mechanism." How do we square this statement with the
previous paper? The two papers are looking at different workloads.

-- 
Chris Murphy