Sam Varshavchik wrote:
> I'd break it down as about 70% yum vs. 30% rpm. Yum is really
> taking its sweet time figuring out what it needs to do. But even
> after it's done that, and downloaded everything, rpm still tends to
> spin its wheels, if it has a large list of packages to chew through.

Okay. That sounds reasonable. It was my (armchair) impression that it
was mostly yum. So when I say it sounds reasonable, what I really mean
is that it agrees with my own bias. ;)

> You do /not/ need that much info in the first step. All you need is
> just a list of names of packages available on the remote
> repository. You reconcile that against the list of packages you
> already have downloaded the metadata for, and you then know what's
> new.

I agree that the amount of data downloaded could be less. What I meant
was that at some point you have to chew on a large chunk of data to do
the depsolving. I'm surprised that it was done in python for as long
as it was. That seems like a task much more suited to a compiled
language. I didn't do any speed tests to compare yum before and after
the metadata parser was rewritten in C.

> Meanwhile, primary.xml.gz is actually a voluminous XML file that
> contains not just each package's name and version, but also all
> sorts of extra info. And you have to download the whole thing every
> time. And, the current version of yum, sqlite-based, does not help.
> I see that primary.sqlite.bz2 is about twice as large as
> primary.xml.gz.
>
> So, all this talk of a database-based yum, and it turns out that you
> end up having to download /twice/ as much data as you used to
> before? Someone explain to me what we're supposed to be doing here.

Yeah, that's why there's interest in Presto. I *think* that it uses
deltas for the metadata as well as the packages, but I'm really not
sure of that. With the size of the metadata, it would be quite an
improvement if it did.
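For what it's worth, the "reconcile the remote name list against what you already have" idea quoted above can be sketched in a few lines of Python. This is only an illustration under assumptions: it supposes the repository could publish a plain list of name-version-release strings, and the function name and sample package strings are made up. It is not how yum actually fetches or stores metadata.

```python
def reconcile(remote_names, cached_names):
    """Compare the remote package list with the locally cached one.

    Returns (new, gone): entries whose metadata still has to be fetched,
    and entries whose cached metadata can be dropped.
    """
    remote = set(remote_names)
    cached = set(cached_names)
    new = sorted(remote - cached)   # on the server, not yet cached locally
    gone = sorted(cached - remote)  # cached locally, no longer on the server
    return new, gone


# Hypothetical name-version-release strings, for illustration only.
cached = ["bash-3.2-20", "yum-3.2.8-2", "rpm-4.4.2.2-7"]
remote = ["bash-3.2-20", "yum-3.2.10-1", "rpm-4.4.2.2-7", "foo-1.0-1"]

new, gone = reconcile(remote, cached)
print("fetch metadata for:", new)
print("drop cached metadata for:", gone)
```

Only the entries in `new` would need their full metadata downloaded, which is the saving being argued for. The depsolving itself still has to chew on the complete local data set afterwards.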
> From what I see yum is doing, it downloads the primary, the other
> file, and possibly filelists, /every/ time a single package gets
> added to the repository. Even though 99% of the content is the same
> as before.
>
> This, in my opinion, does not really seem like such an optimum
> design to me. You should /not/ have to download /everything/ every
> time a single package changes.

Agreed. I rsync things nightly, so it's always local for me and I
don't spend much time worrying about it. But there is a lot of room
for improvement. I'm sure there aren't enough people interested in
doing the hard work to make it happen though, so improvement will be
slow.

> Ditto for the epoch hack -- my solution fixes the original
> underlying reason for having an epoch in the first place.

Eh? So how do you handle the sometimes retarded versioning schemes of
upstream sources? Or the occasional need to push an older version of
something as an update?

> Well, I can point them to how HTTP 1.1 chunking works, and how to
> gracefully autodetect if the HTTP server supports HTTP 1.1 chunking,
> and the logic to gracefully fall back to "Plan B", if the
> repository's HTTP server is running old Apache without HTTP 1.1
> support, and what to do next. That's about all I can do. I won't
> write the code, I have plenty of other coding work that keeps me
> busy.

If you have some time, writing up some of your ideas and how they
could be worked into the current infrastructure would seem like a nice
way to help out. Any sort of volunteer effort usually suffers from
lack of manpower more than anything else.

> It's not that trademarked logos must be kept in one package. It's
> just that the package, for some reason that I still can't fathom,
> must depend on gtk2 code libraries. Why would a package that
> supposedly contains nothing more than a bunch of logo image files,
> have a needed dependency on a package that contains system
> libraries? That just does not compute.
It was due to the package guideline of not having unowned directories.
The dep chain needed to pull in the packages that owned the
directories it was adding files to. The fix in rawhide was to simply
have fedora-logos own those directories as well as the redhat-artwork
package.

> Although this does not have any direct relevance to the overall
> issue of rpm's design, it is demonstrative, though, of the same kind
> of inefficient non-attention to details.

It wasn't that this wasn't known. It was that there are different
goals and policies at work. Sometimes that causes a conflict. Is it
more important to not have unowned directories on the system or to
have a super small install? Different people have different answers to
that question.

A lot of people moaned about this one, but not so many proposed an
acceptable solution. Sadly, that's what happens far too often.

--
Todd        OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I got stopped by a cop the other day. He said, "Why'd you run that
stop sign?" I said, "Because I don't believe everything I read."
    -- Stephen Wright
-- fedora-list mailing list fedora-list@xxxxxxxxxx To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list