On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/RPMCoW
>
>
> == Summary ==
>
> RPM Copy on Write provides a better experience for Fedora Users as it
> reduces the amount of I/O and offsets CPU cost of package
> decompression. RPM Copy on Write uses reflinking capabilities in
> btrfs, which is the default filesystem in Fedora 33.
>
> == Owners ==
>
> * Name: [[User:malmond|Matthew Almond]], [[User:dcavalca|Davide Cavalca]]
> * Email: malmond@xxxxxx, dcavalca@xxxxxx
>
>
> == Detailed description ==
>
> Installing and upgrading software packages is a standard part of
> managing the lifecycle of any operating system. For the entire
> lifecycle of Fedora, all software is packaged and distributed using
> the RPM file format. This proposal changes how software is downloaded
> and installed, leaving the distribution process unmodified.
>
> === Current process ===
>
> # Resolve packaging request into a list of packages and operations
> # Download and verify new packages
> # Install and/or upgrade packages sequentially using RPM files,
> decompressing, and writing a copy of the new files to storage.
>
> === New process ===
>
> # Resolve packaging request into a list of packages and operations
> # Download and '''decompress''' packages into a '''locally optimized''' rpm file

Please verify the signature on the downloaded RPM before decompressing it. (Do we do this already?)

> # Install and/or upgrade packages sequentially using RPM files, using
> '''reference linking''' (reflinking) to reuse data already on disk.

Sounds like a great improvement! Is there any real-world data on how much time it saves, how much it changes disk usage, or how many SSD writes it saves?

>
> The outcome is intended to be the same, but the order of operations is
> different.
>
> # Decompression happens inline with download. This has a positive
> effect on resource usage: downloads are typically limited by
> bandwidth.
> Decompression and writing the full data into a single file
> per rpm is essentially free. Additionally: if there is more than one
> download at a time, a multi-CPU system can be better utilized. All
> compression types supported in RPM work because this uses the rpm I/O
> functions.

As I mentioned above, I think each chunk should also be verified before decompressing.

> # RPMs are cached on local storage between downloading and
> installation time as normal. This allows DNF to defer actual RPM
> installation to when all the RPMs are available. This is unchanged.
> # The file format for RPMs is different with Copy on Write. The
> headers are identical, but the payload is different. There is also a
> footer.
> ## Files are converted (“transcoded”) locally during download using
> <code>/usr/bin/rpm2extents</code> (part of rpm codebase). The format
> is not intended to be “portable” - i.e. copying the files from the
> cache is not supported.

I think these should be made portable. How many variants of these are there? Would it be difficult to make the transcoder also understand RPMs transcoded for a different platform/setup?

Eventually, I'd like to see additional signatures added to the RPM for each of the variants so RPM itself can do the verification at install time, avoiding a transcode to the "canonical" format. (I suppose this might require a build-time or sign-time transcode to each of the other variants.) Until then, I'd like to ensure that the package signatures are being verified in a secure manner, which would be necessary for the plugin to be able to install packages not built with multiple signatures/digests.

Would it be practical to just have a single format aligned to the largest page size known, leaving fs holes as necessary on systems with smaller page sizes?

> ## Regular RPMs use a compressed .cpio based payload. In contrast,
> extent based RPMs contain uncompressed data aligned to the fundamental
> page size of the architecture, e.g. 4KiB on x86_64.
> This alignment is
> required for <code>FICLONERANGE</code> to work. Only files are
> represented in the payload, other directory entries like symlinks,
> device nodes etc are constructed entirely from rpm header information.
> Files are referenced by their digest, so identical files are
> de-duplicated.

How are hardlinks in an RPM handled? Do they stay as hardlinks or become reflinks only, losing the hardlink status? They should stay hardlinks, in my opinion.

> ## The footer currently has three sections
> ### Table of original (rpm) file digests, used to validate the
> integrity of the download in dnf.
> ### Table of digest → offset used when actually installing files.
> ### Signature 8 bytes at the end of the file, used to differentiate
> between traditional RPMs and extent based.

I think this magic number "signature" should vary based on the items that cause the format to change. What happens if you try to use a transcoded RPM on a non-compatible system?

>
> === Notes ===
>
> # The headers are preserved bit for bit during transcoding. This
> preserves signatures. The signatures cover the main header blob, and
> the main header blob ensures the integrity of data in two ways:
> ## Each file with content has a digest. Originally this was md5, but
> today it’s usually sha256. In normal RPM this is only used to verify
> the integrity of files, e.g. <code>rpm -V</code>. With CoW we use this
> as a content key.
> ## There is/are one or two digests (<code>PAYLOADDIGEST</code> and
> <code>PAYLOADDIGESTALT</code>) covering the payload archive
> (compressed cpio). The header value is preserved, but transcoded RPMs
> do not preserve the original structure so RPM’s pre-installation
> verification (controlled by <code>%_pkgverify_level</code>) will fail.
> <code>dnf-plugin-cow</code> disables this check in dnf because it
> verifies the whole file digest which is captured during
> download/transcoding.

The second one is likely used for delta rpm.
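
For illustration only, the digest-keyed, page-aligned payload layout described in the quoted text above might be sketched like this. It is a toy model under stated assumptions, not the actual rpm2extents on-disk format: `PAGE_SIZE`, `align_up`, and `build_payload` are hypothetical names, and the 4 KiB page size and sha256 digests are taken from the examples in the proposal.

```python
from __future__ import annotations

import hashlib

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as in the x86_64 example


def align_up(offset: int, alignment: int = PAGE_SIZE) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def build_payload(files: dict[str, bytes]) -> tuple[bytes, dict[str, int]]:
    """Lay out file contents page-aligned, storing each distinct content once.

    Returns the payload bytes and a digest -> offset table. Hypothetical
    layout for illustration; the real format may differ in every detail.
    """
    payload = bytearray()
    offsets: dict[str, int] = {}
    for _path, data in sorted(files.items()):
        digest = hashlib.sha256(data).hexdigest()
        if digest in offsets:
            continue  # identical content is already stored once
        start = align_up(len(payload))
        payload.extend(b"\0" * (start - len(payload)))  # pad to a page boundary
        offsets[digest] = start
        payload.extend(data)
    return bytes(payload), offsets


# Two paths share the same content, so only two extents are stored.
example_files = {
    "/etc/c": b"x" * 5000,
    "/usr/bin/a": b"hello",
    "/usr/bin/b": b"hello",
}
example_payload, example_offsets = build_payload(example_files)
```

In this model every stored extent starts on a page boundary, which is the property the proposal says <code>FICLONERANGE</code> needs, and two paths with the same digest resolve to the same offset.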
> # This is untested, and possibly incompatible with delta RPM (drpm).
> The process for reconstructing an rpm to install from a delta is
> expensive from both a CPU and I/O perspective, while only providing
> marginal benefits on download size. It is expected that having delta
> rpm enabled (which is the default) will be handled gracefully.

https://github.com/rpm-software-management/rpm/pull/880 added DIGESTALT, apparently to help reduce this CPU usage problem. I don't know if it's actually used by anything, but it is much newer than I'd have guessed (2019 October).

> # Disk space requirements are expected to be marginally higher than
> before: all new packages or updates will consume their installed size
> before installation instead of about half their size (regular rpms
> with payloads still cost space).
> # <code>rpm-plugin-reflink</code> will fall back to simple file
> copying when the destination path is not on the same
> filesystem/subvolume. A common example is <code>/boot</code> and/or
> <code>/boot/efi</code>.
> # The system will still work on other filesystem types, but will
> ''always'' fall back to simple copying. This is expected to be
> slightly slower than not enabling CoW because the source for copying
> will be the decompressed data.

Any testing to see the speed impact?

> # For systems that enable transparent filesystem compression: every
> file will continue to be decompressed from the original rpm, and then
> transparently re-compressed by the filesystem. There is no effective
> change here. There is a future project to investigate alternate
> distribution mechanics to provide parallel versions of file content
> pre-compressed in a filesystem specific format, reducing both CPU
> costs and I/O. It is expected that this will result in slightly higher
> network utilization because filesystem compression is purposely
> restricted to allow random I/O.
> # Current implementation of <code>dnf-plugin-cow</code> is in Python,
> but it looks possible to implement this in <code>libdnf</code> instead
> which would make it work in <code>packagekit</code>.
>
> === Performance Metrics ===
>
> Ballpark performance difference is about half the duration for file
> download+install time. A lot of rpms are very small, so it’s difficult
> to see/measure. Larger RPMs give much clearer signal.
>
> (Actual numbers/charts will be supplied in Jan 2021)

Seems like a very nice optimization! Thanks for working on it!

V/r,
James Cassell
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx