Re: Proposal: Faster composes by eliminating deltarpms and using zchunked rpms instead

On Mon, 2018-11-19 at 11:41 +0100, Sheogorath wrote:
> On 11/18/18 7:19 PM, Stephen John Smoogen wrote:
> > On Sun, 18 Nov 2018 at 12:49, Neal Gompa <ngompa13@xxxxxxxxx> wrote:
> > > On Sun, Nov 18, 2018 at 11:54 AM Jonathan Dieter <jdieter@xxxxxxxxx> wrote:
> > > > On Sat, 2018-11-17 at 22:30 +0100, Kevin Kofler wrote:
> > > > > Jonathan Dieter wrote:
> > > > > > My proposal would be to make zchunk the rpm compression format for
> > > > > > Fedora.
> > > > > 
> > > > > Given that:
> > > > > 1. zchunk is based on zstd, which is typically less efficient in terms of
> > > > >    compression ratio than xz, depending on settings
> > > > >    (see, e.g., https://github.com/inikep/lzbench), and
> > > > > 2. zchunk can by design only compress chunks individually and not benefit
> > > > >    from the space savings of a solid archive with a global dictionary,
> > > > > I fear that this is going to significantly increase the size of the RPMs,
> > > > > which matters:
> > > > > * for the initial downloads,
> > > > > * for storage (e.g., keepcache=1, local mirrors, etc.), and
> > > > > * for the people not using deltas for whatever reason.
> > > > > 
> > > > > I think zchunk makes a lot of sense for the metadata, but I am not convinced
> > > > > that it is the right choice for the RPMs themselves.
> > > > 
> > > > I suspect the first is true, but zchunk does actually allow for a
> > > > global (per-file) dictionary that can be used to compress each chunk.
> > > > The difficulty is that the dictionary has to stay the same between file
> > > > versions, or the chunk checksums won't match.  There would have to be
> > > > some thought put into how to generate and store the dictionaries.
> > > > 
> > > > As for how much bigger a zchunked rpm will be compared to an xz rpm, at
> > > > the moment it's a bit hand-wavy.  Based on zchunked repodata work I've
> > > > done, I think we might be looking at a size that's slightly smaller
> > > > than a gzipped rpm.  I won't know for sure until I put together a
> > > > proof-of-concept, but I want to make sure that there aren't any gaping
> > > > holes in the proposal before I do that.
> > > > 
> > > 
> > > I did some work several months ago to evaluate zstd compression for
> > > RPMs for Fedora, because of the lower memory and CPU usage for
> > > (de)compression. However, the average size increase from xz was pretty
> > > large (~20% or more on average, and nothing ever was either the same
> > > or smaller), even with heavier compression settings. That might have
> > > changed a bit with newer zstd releases that offer some more tunables,
> > > but I think it'll remain a tough sell on disk space.
> > 
> > So there are at least 4 legs here:
> > CPU usage (in both uncompression install and deltarpm)
> > Memory usage per transaction
> > Network amount
> > Disk amount
> > 
> > I expect that the best we are going to get in any 'improvement' is
> > going to be 3 out of the 4. The xz compression and delta-rpm has a
> > cpu/memory tradeoff for disk and network in comparison to gzip but it
> > is mostly acceptable if you have fairly modern desktops. However for
> > older hardware or lower power systems that tradeoff may not be good.
> > 
> 
> Good point. Given that we are starting to have Fedora IoT, we have to look at
> those creatures. IoT devices hate heavy RAM usage, hate disk usage, are
> halfway okay with CPU usage (but keep in mind it may take an hour to
> decompress) and, depending on the upstream, either use mobile data for
> networking or, when you're lucky, some WiFi/Bluetooth/… thing.
> 
> Means:
> CPU usage: Getting worse here, doesn't hurt too much
> Memory usage: Don't! Get! Worse!
> Network amount: Well, people wouldn't be happy when it gets worse, but
> mobile data gets cheaper every day.
> Disk amount: People won't be happy with an increase here, but as long as
> it stays somewhere within 10% it's fine, more than 20% would already
> hurt a lot.
> 
I'd like to add another perspective: network installations.

During a network installation of Fedora, the installer has only the available RAM
in which to both run and store any data it needs before starting the installation.
Only after the installation has started and storage is partitioned can it
(if the system has a swap partition) relieve some of the memory pressure by
swapping to persistent storage.

Many people might think RAM would not be an issue in 2018, but in practice there are,
and likely always will be, memory-constrained installation targets, such as massive
deployments of "small" VMs or the IoT use cases mentioned above.

So even though a network installation is unlikely to actually do
delta package reconstruction against an older version, it would be good
if the memory requirements for a simple non-delta download, verification and
package installation remained sane.
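
To make the memory concern concrete, here is a minimal sketch using Python's
standard lzma module. It is not how rpm or the installer handles payloads; the
payload, the dictionary sizes and the 8 MiB budget are all made-up values for
illustration. It relies on the fact that the memory an xz decoder needs is
dominated by the dictionary size chosen at compression time, so a payload
compressed with a large dictionary can be impossible to unpack within a small
memory budget.

```python
import lzma

MIB = 1024 * 1024


def compress_xz(data: bytes, dict_size: int) -> bytes:
    """Compress to .xz with an explicit LZMA2 dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 1, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def decompress_within(blob: bytes, memlimit: int):
    """Return the payload, or None if decoding fails under the given limit.

    LZMAError is also raised for corrupt input; this sketch does not
    distinguish the two cases.
    """
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ, memlimit=memlimit)
    try:
        return decoder.decompress(blob)
    except lzma.LZMAError:
        return None


# Hypothetical stand-in for a package payload (repetitive text).
payload = b"".join(
    b"file-%06d: some repetitive package content\n" % i for i in range(20000)
)

small_dict = compress_xz(payload, 1 * MIB)   # decoder needs roughly 1-2 MiB
large_dict = compress_xz(payload, 16 * MIB)  # decoder needs roughly 16-17 MiB

budget = 8 * MIB  # assumed decompression budget on a constrained target

for name, blob in (("1 MiB dict", small_dict), ("16 MiB dict", large_dict)):
    ok = decompress_within(blob, budget) is not None
    print(f"{name}: {len(blob)} bytes compressed, fits in budget: {ok}")
```

The same reasoning applies to any replacement format: whatever window or
dictionary size is picked when the packages are built sets a floor on the RAM
needed at install time, independent of whether deltas are used.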


> So when we want to revisit RPM, we should keep our new fellows in mind.
> Maybe we get some OSTree magic going? There we already see deltas
> between versions and we get chunks.
> 
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx