Re: Fedora 34 Change: DNF/RPM Copy on Write enablement for all variants (System-Wide Change)


 



On Mon, Dec 21, 2020 at 10:49 AM Colin Walters <walters@xxxxxxxxxx> wrote:
>
>
>
> On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
> > ## Regular RPMs use a compressed .cpio based payload. In contrast,
> > extent based RPMs contain uncompressed data aligned to the fundamental
> > page size of the architecture, e.g. 4KiB on x86_64. This alignment is
> > required for <code>FICLONERANGE</code> to work. Only files are
> > represented in the payload, other directory entries like symlinks,
> > device nodes etc are constructed entirely from rpm header information.
>
> This is the core change; some interesting tradeoffs here.  Python projects in particular ship a lot of files smaller than 4k (classic example is `__init__.py` which is zero sized).  And ppc64le is 64KiB pages right?  So there will be "zero space" to align, right?  Would need some math to see how much this would add up to, although I guess the implementation could instead use holes?

I'm not sure about XFS or ext4 zero length file handling.

On Btrfs, it's a few hundred bytes. The file has no EXTENT_DATA item,
therefore it's the same whether you write a new zero length file or
reflink copy it.

Files bigger than 0 bytes but less than 2KiB will tend to result in
inline extents, i.e. EXTENT_DATA item contains the data in the same
metadata leaf as the inode rather than referencing some 4KiB data
block elsewhere.
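
To put rough numbers on the alignment question raised above, here is a
minimal sketch of the padding arithmetic. It assumes each file's payload is
rounded up to the next page boundary and that zero length files take no
payload space at all; the proposal text quoted here doesn't spell out either
detail, so treat both as assumptions. The file sizes are made up for
illustration.

```python
def aligned_size(size: int, page: int) -> int:
    """Round size up to the next multiple of page; 0 stays 0 (assumption)."""
    return (size + page - 1) // page * page


def padding(sizes, page: int) -> int:
    """Total bytes of alignment padding for a list of file sizes."""
    return sum(aligned_size(s, page) - s for s in sizes)


# Hypothetical file sizes in bytes, e.g. an empty __init__.py and a few
# small modules. These are not measurements from any real package.
sizes = [0, 120, 1800, 4096, 5000]

for page in (4096, 65536):  # 4KiB (x86_64) and 64KiB pages
    print(f"page={page}: padding={padding(sizes, page)} bytes")
```

Under these assumptions the padding grows quickly with page size when most
files are small, which is why the real distribution of file sizes in a
package set would need to be measured before drawing conclusions.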

Hardlinks take around 100 bytes, so they are slightly more space
efficient. But they can't have separate SELinux labels, ACLs, or
permissions, can't be located in different subvolumes, and are limited
to 65536 hardlinks per file. Reflinks don't have those limitations.
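
A small sketch of the permissions difference, assuming Linux and Python 3.
It uses the FICLONE ioctl (the whole-file variant of FICLONERANGE); the
constant below is the value on x86_64 and most other architectures, and is
an assumption elsewhere. Where the filesystem doesn't support cloning
(tmpfs, ext4), the helper falls back to a plain copy, which behaves the same
as far as metadata goes. The names clone_or_copy and demo are just for this
example.

```python
import fcntl
import os
import shutil
import tempfile

FICLONE = 0x40049409  # _IOW(0x94, 9, int); assumed x86_64-style encoding


def clone_or_copy(src: str, dst: str) -> bool:
    """Return True if dst was reflinked from src, False after a plain copy."""
    try:
        with open(src, "rb") as s, open(dst, "wb") as d:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
        return True
    except OSError:
        shutil.copyfile(src, dst)  # no reflink support here
        return False


def demo(workdir: str) -> dict:
    src = os.path.join(workdir, "a")
    hard = os.path.join(workdir, "hard")
    clone = os.path.join(workdir, "clone")
    with open(src, "wb") as f:
        f.write(b"x" * 8192)
    os.chmod(src, 0o644)

    os.link(src, hard)                    # second name for the same inode
    reflinked = clone_or_copy(src, clone)  # separate inode either way
    os.chmod(clone, 0o644)

    os.chmod(hard, 0o600)  # also changes "a": same inode, same mode
    return {
        "reflinked": reflinked,
        "hard_same_inode": os.stat(src).st_ino == os.stat(hard).st_ino,
        "clone_same_inode": os.stat(src).st_ino == os.stat(clone).st_ino,
        "src_mode": os.stat(src).st_mode & 0o777,
        "clone_mode": os.stat(clone).st_mode & 0o777,
    }


with tempfile.TemporaryDirectory() as tmp:
    result = demo(tmp)
    print(result)
```

The chmod through the hardlink shows up on the original name, while the
clone keeps its own mode because it has its own inode.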

>
> > Files are referenced by their digest, so identical files are
> > de-duplicated.
>
> But just inside a single RPM, right?  It's interesting to compare with ostree which does this by default; conceptually this is using reflinks inside a single RPM to do what ostree does system wide with hardlinks.
>
> BTW we learned a few things, notably zero sized files are tricky because there can be a *lot* of them - see e.g. https://github.com/ostreedev/ostree/pull/2197
> That one was too many hardlinks, but how well do filesystems like btrfs/xfs handle thousands of reflinks instead?  The Python __init__.py thing is such a pathological case...

Thousands aren't a problem, nor are tens of thousands. A reflink is a
normal file that just so happens to have extents shared with another
file. It's the shared extent part that makes them sorta special, but
there's nothing in the structure of the file that says it's a reflink.
Whereas for a symlink or hard link, there is.
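
This can be seen from userspace with lstat. The sketch below assumes Linux
and Python 3, and again uses the FICLONE ioctl with the x86_64-style
constant (an assumption on other architectures), falling back to a plain
copy where cloning isn't supported. Either way the result looks the same to
lstat, which is the point. The helper names are made up for this example.

```python
import fcntl
import os
import shutil
import stat
import tempfile

FICLONE = 0x40049409  # _IOW(0x94, 9, int); assumed x86_64-style encoding


def describe(path: str) -> dict:
    """Report what lstat can tell about a directory entry."""
    st = os.lstat(path)
    return {
        "symlink": stat.S_ISLNK(st.st_mode),
        "regular": stat.S_ISREG(st.st_mode),
        "nlink": st.st_nlink,
    }


def build(workdir: str) -> dict:
    p = {n: os.path.join(workdir, n)
         for n in ("orig", "plain", "sym", "hard", "clone")}
    for name in ("orig", "plain"):
        with open(p[name], "wb") as f:
            f.write(b"data" * 2048)

    os.symlink("orig", p["sym"])   # file type says "symlink"
    os.link(p["orig"], p["hard"])  # link count on the inode goes to 2
    try:
        with open(p["orig"], "rb") as s, open(p["clone"], "wb") as d:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
    except OSError:
        shutil.copyfile(p["orig"], p["clone"])  # no reflink support here

    return {name: describe(path) for name, path in p.items()}


with tempfile.TemporaryDirectory() as tmp:
    info = build(tmp)
    for name, d in info.items():
        print(name, d)
```

The symlink is flagged by its file type and the hardlink by the link count,
but the clone reports exactly what an unrelated plain file does. Finding
shared extents takes an extent-level query such as FIEMAP, not a look at the
inode.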

Shared extents are also produced by snapshots and dedup. It's the same
on-disk manifestation in all three cases. And at least on Btrfs there
are examples of millions of shared extents. But the workload will
dictate the extent layout, to what degree extents are shared, become
unshared, result in COW for modifications, and how much file and free
space fragmentation ensues. Those can be much bigger issues than the
number of reflinks.


--
Chris Murphy



