On Jan 31, 2005, Jeff Pitman <symbiont@xxxxxxxxxx> wrote:

> This could be driven by an optional parameter to createrepo, which
> provides a list of packages to create a delta with.

Err...  Why?  We already have repodata/, and we're creating the new
version in .repodata.  We can use repodata/ however we like, I think.

> If it were fully automatic, it would only be a download win for the
> user.

And for the servers.

> I would rather not utilize xdelta, because you're still regenerating
> the entire thing.  Having xmlets that virtually add/subtract as a
> delta against primary.xml.gz would be optimal for both sides of the
> equation.

But then Seth rejects the idea because it makes for unmaintainable
code, and I mostly agree with him now that I see a simpler way to
accomplish the same bandwidth savings.

> Another advantage of the delta method is that the on-disk pickled
> objects (or whatever back-end store is used) could be updated
> incrementally based on xml snippets coming in, instead of
> regenerating the whole thing over again.

This is certainly a good point, but it is also trickier to get right,
and the result might even turn out to be bigger: if you have to list
what went away, you're probably emitting more information than
xdelta's `skip these many bytes'.

It's like comparing diff with xdelta: diff is reversible because it
contains both what was removed and what was added (plus optional
context), whereas xdelta only contains what was inserted and which
portions of the original remained.  Getting inserts small is trivial;
getting removals small might be trickier, and to take advantage of
pickling we need the latter.

Unless...  Does anyone feel like implementing an xml-aware
xdelta-like program in Python? :-)

-- 
Alexandre Oliva             http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist    oliva@{lsd.ic.unicamp.br, gnu.org}
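
P.S. To make the "xml-aware delta" idea concrete, here is a minimal
sketch in Python.  It assumes a hypothetical, much-simplified
primary.xml schema (bare <package name= version=> entries, which is
not createrepo's real format) and computes a package-level
added/removed/changed delta keyed on package name, i.e. exactly the
removal information that a byte-level xdelta would not carry
explicitly:

```python
# Minimal sketch of an xml-aware delta between two metadata documents.
# The <package name="..." version="..."/> schema is a hypothetical
# simplification for illustration, not createrepo's actual primary.xml.
import xml.etree.ElementTree as ET

OLD = """<metadata>
  <package name="foo" version="1.0"/>
  <package name="bar" version="2.1"/>
</metadata>"""

NEW = """<metadata>
  <package name="bar" version="2.2"/>
  <package name="baz" version="0.9"/>
</metadata>"""

def index(xml_text):
    # Map package name -> version for every <package> element.
    root = ET.fromstring(xml_text)
    return {p.get("name"): p.get("version") for p in root.iter("package")}

def delta(old_text, new_text):
    # Compare the two indexes and classify each package name.
    old, new = index(old_text), index(new_text)
    added = sorted(n for n in new if n not in old)
    removed = sorted(n for n in old if n not in new)
    changed = sorted(n for n in old if n in new and old[n] != new[n])
    return added, removed, changed

added, removed, changed = delta(OLD, NEW)
print(added, removed, changed)  # -> ['baz'] ['foo'] ['bar']
```

Unlike a byte-level xdelta, such a delta names the removed packages
outright, so a pickled store could apply it incrementally; the open
question from the thread is whether listing removals stays smaller
than just shipping a binary delta.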