On Wed, 2007-06-13 at 15:56 -0400, Jeremy Katz wrote:
> On Wed, 2007-06-13 at 13:40 -0600, Jeffrey Law wrote:
> > Using cProfile, it appears that we're calling buildPkgRefDict for each
> > explicitly listed package -- at a cost of nearly a half-second per call
> > (3 GHz P4). Clearly this gets to be rather expensive when the package
> > list is long -- a typical install is over 600 packages. We're burning
> > an absurd amount of time here.
> >
> > Using @group syntax does not suffer from this problem. So clearly
> > there's a path through anaconda which does not need to call
> > buildPkgRefDict so often.
>
> The difference is that the comps file isn't allowed to do anything more
> than list an exact package name. Listing packages in %packages is
> allowed to be globs, specify version, specify arch, etc.

Understood -- and I certainly need to fully specify versioning
information and the like (though I don't use globbing).

>
> Although I'm not quite sure why we're not using the matchPackageNames()
> bits in yum's install() method... it should provide the same sort of
> results but also be able to do some of the querying using sql queries
> against the sqlite db (and thus probably be faster)

Good question (it took me a few minutes to find a user of
matchPackageNames, but eventually I found one). Yes, it looks like the
two provide the same sort of information. Something like this?

*** __init__.py	2007-06-13 14:28:52.000000000 -0600
--- __init__.py.NEW	2007-06-13 14:26:19.000000000 -0600
*************** class YumBase(depsolve.Depsolve):
*** 1756,1762 ****
          if kwargs.has_key('pattern'):
              exactmatch, matched, unmatched = \
!                 parsePackages(self.pkgSack.returnPackages(),[kwargs['pattern']] , casematch=1)
              pkgs.extend(exactmatch)
              pkgs.extend(matched)
--- 1756,1762 ----
          if kwargs.has_key('pattern'):
              exactmatch, matched, unmatched = \
!                 self.pkgSack.matchPackageNames([kwargs['pattern']])
              pkgs.extend(exactmatch)
              pkgs.extend(matched)

I'm not sure what to do with the casematch argument, or whether the other
calls to parsePackages ought to be changed too. And I'm certainly not
well-versed enough in Python, anaconda, or yum to know if there are other
issues I'm not dealing with.

For a small package set (~350 packages), that little tweak takes us from
9:20 to 7:34 (wall clock, start to finish). For reference, using @groups
to install the exact same packages takes 6:52. So the change makes
explicit packages almost competitive with @groups.

Jeff
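
For anyone who wants to check the arithmetic behind those timings, here is a small sketch. It only converts the wall-clock figures quoted above into seconds and compares them; the helper name to_seconds is made up for illustration and is not part of anaconda or yum. The last figure is a rough estimate from the "nearly a half-second per call" and "over 600 packages" numbers in the original report, not a measurement.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' wall-clock string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Wall-clock timings quoted in the message (~350 packages).
before = to_seconds("9:20")  # explicit packages, parsePackages()
after = to_seconds("7:34")   # explicit packages, matchPackageNames()
groups = to_seconds("6:52")  # same packages installed via @groups

saved = before - after       # time recovered by the tweak
gap = after - groups         # remaining distance to @groups

pct_saved = 100.0 * saved / before
pct_gap = 100.0 * gap / groups

# Rough estimate only: ~0.5 s per buildPkgRefDict call times ~600 packages.
estimated_overhead = 0.5 * 600

print("saved %d s (%.1f%% of the original run)" % (saved, pct_saved))
print("still %d s (%.1f%%) slower than @groups" % (gap, pct_gap))
print("estimated overhead for a 600-package install: ~%d s" % estimated_overhead)
```

So the tweak recovers a bit under two minutes on this package set, and what remains is well under a minute of difference from the @groups path.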