[Yum] Idea about lessening the footprint.

skvidal@xxxxxxxxxxxx (seth vidal) · 03 Mar 2003 21:21:11 -0500

> How about we provide a way to limit the scope of which packages out of a
> repository get considered. Most of the work is already done for us,
> since comps.xml has pretty self-contained groups with all dependencies
> already resolved for us (more or less). If there is a way to tell yum
> "only consider packages from certain groups and ignore the rest," the
> footprint will go down quite significantly, since fewer packages will
> need to be loaded into memory for dependency resolution.

Memory use is a concern of mine, as well.

ok so here are the problems I'd foresee.

1. this doesn't ultimately help the memory footprint on, say, a yum
update - especially when going from one distro version to the next

2. comps.xml doesn't HAVE to have that dependency information for pkgs
on the tail end

3. it might not have those in the future - this according to jeremy

4. it's not terribly trustworthy, considering that info can be written by
anything. We would limit the sets in use, but we'd be limiting them in an
arbitrary way rather than a smarter one.

5. The long term plan for memory-use reduction is probably what Jeff
Johnson has recommended. Specifically, rpm is going to lose the
rpm.addInstall(hdr, 'a') mode. This is what yum and anaconda use to load
a transaction set with information about what things are available. So
when I do a ts.check() on the transaction set it hands me back this list
of things it needs and what is in the transaction set in mode 'a' that
provides it. Fairly spiffy. Except you have to load all those headers.
That's heavy. So Jeff wants to kill addInstall in mode 'a'.

Instead, we build up our own test-rpmdb out of the available headers. We build
our transaction set, adding stuff to be installed, updated and erased.
Then we ts.check() and we get a list of stuff we need.

we then bind into our test-rpmdb of available pkgs and run ts.dbMatch()
until we turn blue and/or resolve all the deps. We then repopulate the
transaction set and start over until we're done.
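
A rough sketch of that loop, in plain Python with no rpm bindings at all.
Everything here is made up for illustration: the AVAILABLE dict stands in
for the test-rpmdb, check() stands in for ts.check(), and the index lookup
stands in for the ts.dbMatch() queries. It also ignores versions and
conflicts entirely, so treat it as the shape of the idea, not the real thing.

```python
# Hypothetical stand-in for the test-rpmdb of available packages.
AVAILABLE = {
    "foo": {"provides": {"foo"}, "requires": {"libbar", "baz"}},
    "bar-libs": {"provides": {"bar-libs", "libbar"}, "requires": {"glibc"}},
    "baz": {"provides": {"baz"}, "requires": set()},
    "glibc": {"provides": {"glibc"}, "requires": set()},
    "broken": {"provides": {"broken"}, "requires": {"missing"}},
    "unrelated": {"provides": {"unrelated"}, "requires": set()},
}


def build_provides_index(available):
    """Map each capability to the names of the packages providing it."""
    index = {}
    for name, pkg in available.items():
        for cap in pkg["provides"]:
            index.setdefault(cap, []).append(name)
    return index


def check(transaction, available, installed_provides):
    """Stand-in for ts.check(): requirements nothing in the set satisfies."""
    provided = set(installed_provides)
    for name in transaction:
        provided |= available[name]["provides"]
    unmet = set()
    for name in transaction:
        unmet |= available[name]["requires"] - provided
    return unmet


def resolve(wanted, available, installed_provides=()):
    """Repeat check -> look up providers -> repopulate until nothing changes."""
    index = build_provides_index(available)
    transaction = set(wanted)
    unresolved = set()
    while True:
        unmet = check(transaction, available, installed_provides) - unresolved
        if not unmet:
            return transaction, unresolved
        for cap in sorted(unmet):
            providers = index.get(cap)
            if providers:
                # Arbitrary choice: first provider by name.
                transaction.add(sorted(providers)[0])
            else:
                unresolved.add(cap)


if __name__ == "__main__":
    print(resolve(["foo"], AVAILABLE))
```

Only the headers that end up in the transaction ever get touched; "unrelated"
is never pulled in. Picking the first provider by name is a placeholder for
whatever policy we'd actually want there.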

Now. This means no more loading all the headers into memory - yay. It
also means providing our own method for matching those requirements up
with what provides them.

Not as _yay_ but not too heinous either.
It will just require some work.

The trick is that if you're updating the world then all those headers
will be in the transaction set in mode 'u' or 'i' which means that the
size of yum will be the size of those headers in memory. There is no way
to beat _that_ part afaict. 

My goal is to do two things.

1. port yum to this new model as a testcase before Jeff pulls the
addInstall mode 'a' functionality out of rpm. I'd like to be ahead of
the curve at some point, not behind it :)

2. make it possible for a person to run something like "yum listdeps
foo" and have it list all the pkgs (installed or available) required by
the pkg foo. In other words, if you run rpm -qp --requires pkgfilename,
I want each of those requirements resolved to an rpm pkgname. That function
is the same one needed for #1, but it's easiest to optimize in the case of
#2, I think.
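
Something like the following is what I have in mind for the listdeps
lookup. It is only a sketch: the dicts are invented sample data, listdeps()
is not an existing yum function, and requirements are matched as bare
strings with no version comparison.

```python
# Invented sample data: package name -> set of capabilities it provides.
INSTALLED = {"glibc": {"glibc", "libc.so.6"}}
AVAILABLE = {"bar-libs": {"bar-libs", "libbar.so.1"}, "baz": {"baz"}}

# Invented stand-in for the output of "rpm -qp --requires" on each package.
REQUIRES = {"foo": ["libc.so.6", "libbar.so.1", "baz", "libmissing.so.2"]}


def listdeps(name, requires, installed, available):
    """Resolve each requirement of `name` to (pkgname, origin) pairs."""
    result = {}
    for req in requires[name]:
        matches = []
        for origin, pkgs in (("installed", installed), ("available", available)):
            for pkgname in sorted(pkgs):
                if req in pkgs[pkgname]:
                    matches.append((pkgname, origin))
        # An empty list means nothing we know about provides this.
        result[req] = matches
    return result


if __name__ == "__main__":
    for req, matches in listdeps("foo", REQUIRES, INSTALLED, AVAILABLE).items():
        print(req, "->", matches or "nothing provides this")
```

The scan over every package per requirement is the slow part; an index from
capability to package name would be the obvious optimization.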

> For example, let's say I have a beowulf node, that only has base,
> beowulf-linux, and X. This allows me to ignore about 2000 extra
> packages that are devel, gnome, kde, multimedia, etc, limiting the scope
> of the dependency resolution to about 400 packages altogether. I imagine
> that the memory footprint on such operation is going to be much, much
> lower.
> 
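
For reference, the group-limiting idea could be sketched roughly like
this. The group contents and package names are invented, a real version
would have to parse comps.xml instead of using a dict, and requirements
are simplified to plain package names. The second function shows the
catch: a package inside the chosen groups can still require something
outside them.

```python
# Invented group membership; a real version would read this from comps.xml.
GROUPS = {
    "base": ["glibc", "bash"],
    "beowulf-linux": ["mpi-tools"],
    "gnome": ["gnome-panel"],
    "devel": ["gcc", "libfoo-devel"],
}

# Invented requirements, simplified to package names.
REQUIRES = {
    "glibc": [],
    "bash": ["glibc"],
    "mpi-tools": ["glibc", "gcc"],
    "gnome-panel": ["glibc"],
    "gcc": ["glibc"],
    "libfoo-devel": ["glibc"],
}


def limit_scope(groups, wanted_groups):
    """Return the set of package names belonging to the wanted groups."""
    scope = set()
    for group in wanted_groups:
        scope.update(groups.get(group, ()))
    return scope


def deps_outside_scope(scope, requires):
    """List requirements of in-scope packages that fall outside the scope."""
    missing = {}
    for name in sorted(scope):
        outside = [r for r in requires.get(name, ()) if r not in scope]
        if outside:
            missing[name] = outside
    return missing


if __name__ == "__main__":
    scope = limit_scope(GROUPS, ["base", "beowulf-linux"])
    print(sorted(scope))
    print(deps_outside_scope(scope, REQUIRES))
```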

I think the time would be better spent implementing the mechanism for
handling our own dependency completion, so we move away from the
addInstall 'a' model early.
The xml mechanism is cool - but I'd like to stick closer to how rpm does
it, if only b/c it will keep me from having to duplicate effort in code.

> What do you think? Is this possible, and how much pain would that be?

possible, sure. pain wise - fairly enormous I think - or at the very
least not the most painfully obvious implementation.

having said that there is no reason we can't do both ;) But since I have
plans in mind (as you well know) for the other implementation I
described I'm going to focus my attentions on that one first.

if someone would like to provide a working model for the one Icon
described, I would love to take a look at it.

-sv