> On Mon, Dec 21, 2020, at 1:07 PM, Neal Gompa wrote:
> > Yes it does. It avoids writing the compressed data and then copying it
> > back out uncompressed, which is the same amount of savings as the
> > reflink approach.
> >
> > (It's also equally incompatible with deltarpm)
>
> No - static deltas exist, plus layered RPMs work on the wire the same.
> But this isn't really relevant here.
>
> Adding a hardlink indeed requires updating inodes proportional to the
> number of files, but that's more an implementation of the transactional
> update approach, not of the "download and unpack in parallel" part which
> is more what we're discussing here. (Though they are entangled a bit)
>
> Anyways, I'd still stand by my summary that the much lower tech "files
> in a temporary directory that get rename()d" approach would be all of
> *more* efficient on disk, simpler to implement, and much less disruptive
> than an RPM format change. (The main cost would be a new temporary
> directory path that would need cleanup as part of e.g. `yum clean` etc.)

I'm replying to a bunch of topics in the same thread (via the web UI, because I wasn't subscribed to the mailing list until today - yikes).

On editions: I wrote fedora-workstation because that's the edition that has btrfs as root by default.

On zero-byte files: I think reflinking is specifically fine here, because reflinking is about contents, not inodes. A zero-byte reflink should be a no-op at the filesystem level (I should check; if it's not, I can special-case it easily enough). The process of installing files based on reflinks involves actually opening new files, then reflinking content into them.

On small files and alignment/waste: I believe most mutable filesystems do "waste some space" this way. I call it out here because it's explicitly in the file format, the same as in .tar (without compression), and it's because FICLONERANGE and the filesystems demand block alignment. I account for it as (number of files) x (native block size) / 2 - i.e.
assume 50% usage of the tail block of every file. The block size on ppc64 is unfortunate, but I expect the same level of waste whether you're using reflinking or not.

Talking about the topic more broadly: the hardlinking approach in rpm-ostree depends on either a completely read-only system or the use of a layered filesystem like overlayfs. I think it's a completely valid approach, and to my understanding it's the technology that underpins Fedora CoreOS and Project Atomic. Those are different distro builds with specific use cases in mind. As I understand it, they also have very different management policies: they are intended to be managed in a specific way, and updates seem to require a reboot.

My hope for CoW for RPM is to bring a similar set of capabilities and benefits to Fedora, and eventually CentOS and RHEL, without requiring any changes to how the system works or is managed. The new requirements are fairly simple: a single filesystem for both the rootfs and the dnf cache, and that this filesystem supports reflinking.

Today, deduplication happens within a given RPM. Looking forward, I would like to extend the rpm2extents processor to read and re-use blocks already present in the dnf/rpm cache, which would give us full system-level deduplication.

I am really grateful for all this feedback; hopefully what I write makes sense.

- Matthew
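P.S. The "open a new file, then reflink content into it" step can be sketched with the whole-file FICLONE ioctl (the simpler cousin of FICLONERANGE). This is a minimal illustration, not the rpm2extents code: the function name is made up, and both paths must live on the same reflink-capable filesystem (btrfs, or XFS formatted with reflink=1).

```python
import fcntl

# FICLONE from <linux/fs.h>: clone the entire source file into the
# destination by sharing its extents.
FICLONE = 0x40049409

def install_via_reflink(src_path: str, dst_path: str) -> None:
    """Create dst_path and reflink the contents of src_path into it.

    Hypothetical helper for illustration. Cloning a zero-byte source
    is a data-level no-op: the new inode exists, but there are no
    extents to share.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
```

FICLONERANGE works the same way but takes a range argument, which is what forces the block alignment (and the tail padding) discussed above.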
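The tail-waste estimate above is simple arithmetic. The numbers below are purely illustrative (4 KiB is a common x86_64 filesystem block size; the file count is a made-up figure for a sizeable install), not measurements:

```python
# Illustrative inputs only - not measured values.
BLOCK_SIZE = 4096      # typical x86_64 filesystem block size, in bytes
NUM_FILES = 150_000    # hypothetical file count for a large install

# Assume the final (tail) block of each file is half full on average,
# so expected padding overhead = files * block_size / 2.
expected_waste = NUM_FILES * BLOCK_SIZE // 2
print(f"{expected_waste} bytes (~{expected_waste / 1024**2:.0f} MiB)")
```

On ppc64, where the native block size is larger, the same formula scales the estimate up proportionally.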
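For context, the "files in a temporary directory that get rename()d" approach from the quoted message can be sketched as below. This is my reading of that proposal, not anyone's actual implementation: stage each file on the same filesystem as its destination, then rely on rename() being atomic within a filesystem on POSIX.

```python
import os
import tempfile

def install_atomically(content: bytes, dest: str) -> None:
    # Stage in the destination directory so the temp file is on the
    # same filesystem, making the final rename() atomic.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())  # ensure data hits disk before rename
        os.rename(tmp, dest)      # atomic within one filesystem
    except BaseException:
        os.unlink(tmp)            # clean up the staged file on failure
        raise
```

The "main cost" mentioned in the quote is visible here: the staged temp files need a well-known location that tools like `yum clean` can purge.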