sorting yum/dnf metadata and metadata diffs

How feasible would it be to keep the listings in primary.xml and filelists.xml sorted by package name and arch? Doing so could open the door to simple and efficient diffs of repository metadata.

I recently ran some quick tests using Python and ElementTree. The F21 primary.xml files from 2/7 and 2/9 are each about 2.6M compressed and ~18M uncompressed. Sorting both and running a simple line-by-line comparison produced a diff of ~500K, which compressed down to ~70K. A similar procedure on the 8M filelists.xml yielded a diff that compressed to ~200K.
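
For reference, a minimal sketch of that kind of sort-and-diff test might look like the following. It is not the exact script used for the numbers above. It assumes the package entries are <package> elements in the http://linux.duke.edu/metadata/common namespace with <name> and <arch> children, and it uses gzip only as a stand-in for whatever compression the repository uses. The function names are made up for illustration.

```python
import difflib
import gzip
import xml.etree.ElementTree as ET

# Assumed namespace of primary.xml; adjust if the metadata differs.
NS = "{http://linux.duke.edu/metadata/common}"


def sorted_package_lines(xml_text):
    """Return the serialized <package> entries, sorted by (name, arch), as lines."""
    root = ET.fromstring(xml_text)
    packages = root.findall(NS + "package")
    packages.sort(key=lambda p: (p.findtext(NS + "name", default=""),
                                 p.findtext(NS + "arch", default="")))
    lines = []
    for pkg in packages:
        for line in ET.tostring(pkg, encoding="unicode").splitlines():
            line = line.strip()
            if line:  # skip whitespace-only lines so layout changes do not show up
                lines.append(line)
    return lines


def metadata_diff(old_xml, new_xml):
    """Line-by-line diff of two metadata documents after sorting their packages."""
    return list(difflib.unified_diff(sorted_package_lines(old_xml),
                                     sorted_package_lines(new_xml),
                                     lineterm="", n=0))


def compressed_size(lines):
    """Size in bytes of the gzip-compressed diff (gzip is an assumption here)."""
    return len(gzip.compress("\n".join(lines).encode("utf-8")))
```

Calling metadata_diff() on the uncompressed text of two primary.xml snapshots and passing the result to compressed_size() gives a rough idea of how much a client would need to fetch. Sorting on (name, arch) alone leaves the relative order of multiple versions of the same package up to the input order, so a real implementation would probably want a fuller key.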

Those two are by far the largest metadata files. If the observed improvements are typical, then keeping those files sorted, hosting diffs between the current metadata and that of the previous few days, and modifying dnf to look for those diffs could substantially reduce the amount of data users must download every time a repository is updated. For a fast-moving OS like Fedora, that could be nearly every day.
-- 
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct