On Thu, Dec 30, 2004 at 10:50:26PM -0800, Jamie Zawinski wrote:
Sean Middleditch wrote:
The problem you're perceiving (slow operation as yum starts up) isn't at all due to lack of caching, but perhaps very inefficient handling of the cache - a lot of data has to be parsed and such, when it could perhaps be stored in a more ready-to-process format.
It takes *nearly a minute* to do that! I'm on a 2GHz machine. If it's not hitting the net, what's it doing, raytracing?
Parsing the XML file and building the associated Python objects.
And before bashing XML and the cost of parsing: that is only a very small
fraction of the time spent. Building the Python strings and objects is
the really costly part, as we found with Seth when doing basic tests.
My own tests led me to believe that Python string interning (taking a string from the C layer or XML and getting the copy from Python's own string
implementation) is extremely costly, and of course we are manipulating
a very large number of strings when collecting the repodata.
Have you already made some real measurements? If so, wouldn't it be useful to implement that small portion in C? Or is it not such a small part?
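
For what it's worth, a rough way to get such a measurement is to time expat on its own (no Python callbacks, so the work stays in C) against the same parse with handlers that build Python objects. This is only a sketch under assumptions: the XML is synthetic and is not the real repodata schema, the function names are made up for illustration, it assumes a current Python 3, and it is not how yum itself does things.

```python
import time
import xml.parsers.expat


def make_synthetic_metadata(n_packages):
    """Build a made-up XML document (hypothetical, NOT the real repodata schema)."""
    parts = ["<metadata>"]
    for i in range(n_packages):
        parts.append(
            '<package name="pkg%d" arch="i386" version="1.%d">'
            "<summary>Synthetic package %d</summary></package>" % (i, i, i)
        )
    parts.append("</metadata>")
    return "".join(parts).encode("utf-8")


def parse_only(data):
    """Run expat with no handlers set, so no Python objects are built."""
    parser = xml.parsers.expat.ParserCreate()
    parser.Parse(data, True)


def parse_and_build(data):
    """Parse and build one Python dict per <package> element."""
    packages = []
    text = []

    def start(name, attrs):
        if name == "package":
            packages.append(dict(attrs))
        elif name == "summary":
            del text[:]  # start collecting a fresh summary

    def chars(chunk):
        text.append(chunk)

    def end(name):
        if name == "summary":
            packages[-1]["summary"] = "".join(text)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(data, True)
    return packages


def timed(func, data):
    """Return (elapsed seconds, result) for a single call of func(data)."""
    t0 = time.perf_counter()
    result = func(data)
    return time.perf_counter() - t0, result


if __name__ == "__main__":
    data = make_synthetic_metadata(20000)
    t_parse, _ = timed(parse_only, data)
    t_build, pkgs = timed(parse_and_build, data)
    print("parse only:        %.3fs" % t_parse)
    print("parse + build:     %.3fs (%d packages)" % (t_build, len(pkgs)))
```

The gap between the two timings gives a rough idea of how much is spent creating Python strings and objects rather than parsing. The numbers will vary by machine and Python version, so it would need to be run against real repodata before drawing conclusions.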
-- Levente "Si vis pacem para bellum!"