On Monday 20 April 2009, James Antill wrote:
> Ville Skyttä <ville.skytta@xxxxxx> writes:
>
> > Regarding CPU requirements, xz/lzma should be much better on metadata
> > consumer boxes than bzip2, and somewhat more memory intensive but I doubt
> > this would matter much if any at all as long as lzma compression levels
> > are kept at sane values.
>
>  The 25-35% savings were on .sqlite is at -9 ... what do you mean by
> "sane values" here.

That would need to be tested with typical sqlite repodata, but based on the
lzma benchmarks URL I posted (http://tukaani.org/lzma/benchmarks) I'd guess
levels <= 7.

>  Sure, people like choice in lots of things, but those choices have to
> be paid for. For instance some people like to choose to access rawhide
> from apt, or a random RHEL-5 version of yum.
>  So do we now keep N versions of all the .sqlite files, for each
> compression flavor and allow people to choose how many N versions
> (forwards and backwards) to generate?

IMHO it is fair enough to let people with such corner cases know that they'll
be served with XML metadata or need to upgrade their depsolver (or install
another copy on the side of the distro one for this purpose).

>  -- Content-Encoding didn't work so well with that much choice.

Just out of interest, why was that?  Due to it requiring special web server
config and not being available for other protocols than HTTP, or something
else?  (I'm assuming you don't mean compressing the database on the fly - I
can see why that wouldn't be feasible.)

> > e.g. even if the CPU/memory requirements would be a problem
> > for boxes composing something large like Fedora Rawhide all the time, at
> > least for immutable final release repos it should be doable, ditto for
> > many scenarios between these extremes.
>
>  Exactly the opposite, IMNSHO. I download rawhide metadata a couple of
> times a week ...

Yes, of course that would be the scenario benefiting most from the improved
compression.
But as I tried to explain above, if you can't have that, there are still
other use cases that could benefit.

> I download "fedora" metadata somewhere between 0 and
> 1 times. I'd be happy with no compression at all there, I think.

Yes, there are quite a few different scenarios.  But as far as representing
a general use case goes, I think you've been spoiled by fast network
connectivity if, given the choice, you'd be happy with no compression even
for those infrequently downloaded files.  The space savings would be useful
on DVD images etc. as well.

> > Regarding code requirements, if yum devs don't feel like implementing it,
> > I'm sure the code will just magically appear somewhere if there's a clear
> > green light given by the yum devs and when xz and its python bindings
> > reaches a stable release.
>
>  It's not like we know what the code will look like, although we can
> imagine. For instance if you think it's adding an import or two and
> doing some code in yum like:
>
> if url.endswith(".lz"): uncompress_lzip()
> if url.endswith(".bz2"): uncompress_bzip2()
> if url.endswith(".gz"): uncompress_gzip()

I haven't really even thought about it and it's pretty unlikely that I will
spend time on doing that if there are no stronger hints of a buy-in from the
yum devs (and I don't promise anything anyway at this point), but:

> ...then it's unlikely I'd commit it, because that's just the tip of
> the iceberg.

I think it would be useful for interested parties if you could elaborate on
that iceberg in a couple more lines, and/or with URLs pointing to
documentation that explains it.
_______________________________________________
Yum mailing list
Yum@xxxxxxxxxxxxxxxxx
http://lists.baseurl.org/mailman/listinfo/yum
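
For reference, the suffix-based dispatch quoted above could be sketched
roughly as follows.  This is only an illustration, not yum code: the names
DECOMPRESSORS and decompress_by_suffix are made up for this example, and it
assumes a Python whose standard library ships the gzip, bz2 and lzma modules.
It also swaps the quoted ".lz" (lzip) case for ".xz" and ".lzma", since the
standard lzma module reads the xz and legacy lzma containers but not lzip.

```python
import bz2
import gzip
import lzma

# Hypothetical suffix -> decompressor table (an assumption for this sketch,
# not anything taken from yum itself).
DECOMPRESSORS = {
    ".gz": gzip.decompress,
    ".bz2": bz2.decompress,
    ".xz": lzma.decompress,    # lzma.decompress auto-detects the xz container
    ".lzma": lzma.decompress,  # ... and the legacy .lzma ("alone") container
}


def decompress_by_suffix(name, data):
    """Return decompressed bytes, picking the codec from the file name suffix."""
    for suffix, func in DECOMPRESSORS.items():
        if name.endswith(suffix):
            return func(data)
    # Unknown suffix: treat the payload as uncompressed and pass it through.
    return data
```

On the compressing side, lzma.compress() takes a preset argument (0 to 9),
which is where a level such as 7 rather than 9 would be chosen.  None of this
touches the "iceberg" mentioned above; it covers only the dispatch step.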