Re: metadata compression

I don't know the roadmap for yum, so I didn't realize that sqlite files were the way to go. It does make sense, though.

I won't go into details about the RHEL 3 and RHEL 4 yum setup I have...

I have posted 3 compressed versions of other.xml from rhel-i386-server-5 here:

http://thejoshwa.com/upload/other.xml.7z
http://thejoshwa.com/upload/other.xml.bz2
http://thejoshwa.com/upload/other.xml.gz

All were compressed using the maximum compression available for each (gzip --best, bzip2 --best, 7z a -t7z -mx=9 -m0=lzma).
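For anyone who wants to repeat the comparison without the command-line tools, here is a rough Python sketch using only the standard library. It is not what produced the files above: the function name `compressed_sizes` is made up for illustration, and Python's `lzma` module writes an .xz stream rather than a .7z archive, so the LZMA figure will be close to, but not the same as, what 7z reports.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` under each codec at its highest level."""
    return {
        "uncompressed": len(data),
        # compresslevel=9 corresponds to `gzip --best`
        "gzip": len(gzip.compress(data, compresslevel=9)),
        # compresslevel=9 corresponds to `bzip2 --best`
        "bzip2": len(bz2.compress(data, compresslevel=9)),
        # preset=9 is the highest standard LZMA preset; the output is an .xz
        # stream, not a .7z container, so this only approximates `7z -mx=9`.
        "lzma": len(lzma.compress(data, preset=9)),
    }
```

Reading other.xml in binary mode and passing its contents to `compressed_sizes` gives one number per codec. Note that this holds the whole file in memory, which is fine at around 130 MB but may want a streaming approach for much larger metadata.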

You can see the difference for yourself.

Format         Size (bytes)
Uncompressed      128946449
7zip                2014112
bzip2              22233028
gzip               32705803
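To put those numbers on a common scale, the short sketch below turns them into compression ratios (uncompressed size divided by compressed size). The byte counts are copied from the table above; the names `SIZES` and `ratios` are only illustrative.

```python
# Byte counts copied from the table above (other.xml, rhel-i386-server-5).
SIZES = {
    "uncompressed": 128946449,
    "7zip": 2014112,
    "bzip2": 22233028,
    "gzip": 32705803,
}


def ratios(sizes: dict) -> dict:
    """Map each compressed format to uncompressed_size / compressed_size."""
    base = sizes["uncompressed"]
    return {
        name: base / size
        for name, size in sizes.items()
        if name != "uncompressed"
    }


for name, ratio in ratios(SIZES).items():
    print(f"{name:<8}{ratio:6.1f}x")
```

With these figures the ratios work out to roughly 64x for 7zip, 5.8x for bzip2 and 3.9x for gzip, so the 7zip file is about 11 times smaller than the bzip2 one and about 16 times smaller than the gzip one.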


On Sun, Apr 19, 2009 at 11:03 PM, James Antill <james-yum@xxxxxxx> wrote:
Joshua Bahnsen <archrival@xxxxxxxxx> writes:

> I am creating repository data based on ALL rpms available to a specific Red
> Hat channel (6000 or so per channel)
>
> rhel-i386-as-3
> rhel-i386-es-3
> rhel-i386-ws-3
> rhel-i386-as-4
> rhel-i386-es-4
> rhel-i386-ws-4

 There's little or nothing yum can do for these.

> rhel-i386-client-5
> rhel-i386-server-5

 These we can probably try and help with, but we've been asking and
waiting for 12+ months for RHN and CentOS to move to generating
.sqlite files server side. So I wouldn't bet that we can help in the
general case, quickly. Plus any client side support for lzma probably
wouldn't get into 5.x until at least 5.5 (more likely 5.6 or 5.7).
 So realistically you are targeting Fedora and 6.x for a change like
this.

[...]

> With rhel-i386-as-4, other.xml is nearly 300 MB uncompressed, with gzip it
> is 66 MB, with lzma on max compression is 2.4 MB.

 Trying to do a mental s/4/5/

 Ok, what is the .sqlite size ... what is bzip2 vs. lzma on that?

 Can you post your *.xml files somewhere, so we can all see the same
data? ... I picked some random pieces because I assumed it'd scale
close to linear. I'm still pretty surprised by 20x differences.

> I'm personally not even concerned with storing the data in sqlite

 Then we probably have little to discuss as downloading .sqlite
instead of .xml is a major win, and moving to generating it is the
plan for everyone AFAIK (and yum always prefers it). So anything that
doesn't help .sqlite transfer isn't worth much.

> I will state I have been using 7z for the compression and not lzma from the
> SDK, 7z has much better results.

 Fair enough, I just used lzip on CentOS-5, as that was all that came
up for "yum search lzma" there. I'm by no means a compression expert,
just trying to get some usable real world data.

--
James Antill -- james@xxxxxxx
_______________________________________________
Yum mailing list
Yum@xxxxxxxxxxxxxxxxx
http://lists.baseurl.org/mailman/listinfo/yum
