On Apr 5, 2005 10:17 AM, Daniel Veillard <veillard@xxxxxxxxxx> wrote:
> Is that worth adding yet another XML Parser package to the distribution
> used by a single tool ?

Yes, I believe it is. 2-3 times faster in my book is definitely "worth it."

> Is there a compatibility layer to still use libxml2 ?

I don't mean to come across as rude, but libxml2 has a very clunky
non-pythonic API. I'd choose cElementTree if only because I don't have
to use MethodNamesThatStartWithCapsAgainstAllConventions(). It also has
sensible error reporting (i.e. not just segfault, which is not useful
with python). In other words, cElementTree feels like a Python library,
as opposed to libxml2, which is very obviously a set of bindings to a C
API done as an afterthought.

> If I remember correctly, the performance problem wasn't libxml2 itself
> but the specific usage within yum, i.e. collecting the data, libxml2 by
> itself is parsing the megabyte sized file in less than a tenth of a second.

I believe it wasn't "within yum" it was "within python," specifically
going from C strings to python strings, which took a lot of resources.
That's all that matters to yum, since, well, it's written in python, and
cElementTree outperformed libxml2 in our tests and resulted in much
nicer code. I'm the one who did the testing and convincing, so all blame
and hatemail should be aimed at me.

> I'm surprized the solution ends up going to use a python specific library
> instead of trying to find why the interface between libxml2 and yum generated
> that problem. I don't remember you saying you would switch library as a result.

cElementTree (part of python-elementtree) is not a python-specific
library. It's a python interface to expat, and a very well-designed one.
It has fewer features than libxml2, for sure, but it's far more pleasant
to use in python.

Kind regards,
--
Konstantin Ryabitsev
Zlotniks, INC