On Wed, 10 Apr 2013 10:33:43 -0500
Chris Adams <cmadams@xxxxxxxxxx> wrote:

> The metadata starts in XML before being loaded into an SQLite DB file,
> and the XML is in the repodata directory with the DB. However, both
> are compressed, as they are large. For example, the current
> updates/18/x86_64 XML is over 34M (5M gzip compressed), and the DB is
> 41M (9M bzip2 compressed). I'm guessing there are historical reasons
> why different compression is used; both could be made noticeably
> smaller with xz (XML to just over 3M, DB to 7M), but that's still a
> lot of data to download (and there are also other metadata files that
> have to be downloaded sometimes, especially the filelists.xml.gz,
> which is 10M gzip compressed).
>
> I'm not sure when the XML is downloaded instead of (or in addition to)
> the DB, but it does appear to happen (I see one example in my mirror
> server web logs this morning for example).
>

Here's how it works.

The XML metadata was put together over a decade ago. It is the
canonical representation of the metadata. The sqlite DB was added maybe
8ish years ago as a way of reading the same data more quickly without
eating up so much memory. At the time bzip2 was the new hotness, so we
used it instead of gzip.

The primary, filelists and other XML should not ever be downloaded at
this point unless you hit a mirror which is badly out of sync.

The only XML files that should be getting downloaded:

1. repomd.xml - it's fairly small and is the index for everything else
2. comps.xml (or groups.xml) - which is where comps is stored per-repo
3. updatemd.xml - which is just the security/update info for
   describing updates

yum will grab repomd.xml and check whether it is newer than what it
already has, then go from there to update the rest of the metadata.

Hope that helps explain it a bit more.

-sv

--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
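
To illustrate the repomd.xml check described above, here is a rough Python sketch of one way a client could compare a cached repomd.xml against a freshly downloaded one and work out which metadata files need refetching. This is not yum's actual code. The function names `parse_repomd` and `stale_entries` are made up for the example, and comparing per-entry checksums is an assumption about how "newer" could be decided.

```python
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml documents.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}


def parse_repomd(xml_text):
    """Map each <data type="..."> entry to its checksum, timestamp and href."""
    root = ET.fromstring(xml_text)
    entries = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        entries[data.get("type")] = {
            "checksum": data.findtext("repo:checksum", namespaces=REPO_NS),
            "timestamp": float(
                data.findtext("repo:timestamp", default="0", namespaces=REPO_NS)
            ),
            "href": location.get("href") if location is not None else None,
        }
    return entries


def stale_entries(cached, fresh):
    """Return the metadata types whose checksum is new or has changed.

    Hypothetical policy: anything listed in the fresh index that is missing
    from the cache, or whose checksum differs, gets downloaded again.
    """
    return sorted(
        mdtype
        for mdtype, info in fresh.items()
        if mdtype not in cached or cached[mdtype]["checksum"] != info["checksum"]
    )
```

With a policy like this, only the small index is fetched on every check, and the large primary or filelists files are downloaded only when their entry in the index actually changes.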