On Wed, 18 Oct 2006, Junio C Hamano wrote:
>
> Linus Torvalds <torvalds@xxxxxxxx> writes:
>
> > Actually, I've hit an impasse.
> >
> > So there's _another_ way of fixing a thin pack: it's to expand the objects
> > without a base into non-delta objects, and keeping the number of objects
> > in the pack the same. But _again_, we don't actually know which ones to
> > expand until it's too late.
>
> pack-objects.c::write_one() makes sure that we write out base
> immediately after delta if we haven't written out its base yet,
> so I suspect if you buffer one delta you should be Ok, no?

It doesn't matter. I realized that my bogus patch to unpack-objects was
more seriously broken anyway: even the "un-deltify every single object"
was broken.

And that's despite the fact that I _tested_ it, and verified the end
result by hand.

Why? Because I tested it within one repo, by just piping the output of
git-pack-objects --stdout directly to the repacker. That seemed to be a
good way to test it without setting up anything bigger.

But it turns out that it misses one of the big problems: if you don't
unpack the objects in a way that later phases can read, none of the
streaming code works at all, and you have to buffer up _everything_ in
memory just to be able to read any previous _non_delta objects too.

So my patch-series works - but it only works in a repo that already has
all the objects in question, because then it can look up the objects in
the original database. Which makes it useless.

Duh.

So forget about unpack-objects. It's designed to be streaming (and it's a
_good_ design for what it does), but repacking really cannot be done that
way. Repacking needs to be done by saving the thin pack to disk, and then
doing a multi-pass over it (like git-index-pack does, for example).

Just throw my patch away.
It's not even useful as a basis for anything else, unless you want to use
it as a way to keep all the objects in memory and use the "unpack-objects"
logic to just _parse_ the incoming pack.

I suspect using "index-pack" is saner (since it already has the multi-pass
logic), or just doing something that maps all the objects in memory, and
then calls builtin-pack-objects once it has set up the new thin pack so
that others can see/use the new objects without realizing that they aren't
in the canonical pack-format.

		Linus
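
A toy illustration of the streaming-versus-multi-pass point made in this thread, offered as a sketch only. It is not git's code and does not model the real pack format: a "pack" here is just a list of (object_id, base_id) pairs, and both function names are made up for the example. It shows why a single streaming pass cannot tell which deltas truly lack a base (the base may simply arrive later, as write_one() arranges), while a second pass over a saved pack can.

```python
# Toy model, NOT git's real pack format: each entry is (object_id, base_id),
# where base_id is None for a full (non-delta) object.
# Both function names below are hypothetical, chosen for this illustration.


def find_baseless_deltas(entries):
    """Two passes over a saved pack: index everything, then check delta bases."""
    # Pass 1: record every object id present in the pack.
    present = {obj_id for obj_id, _ in entries}
    # Pass 2: a delta whose base is absent from the pack is what makes it "thin".
    return [obj_id for obj_id, base in entries
            if base is not None and base not in present]


def streaming_guess(entries):
    """Single pass: flags a delta if its base has not been seen *yet*."""
    seen, flagged = set(), []
    for obj_id, base in entries:
        if base is not None and base not in seen:
            flagged.append(obj_id)  # may be wrong: the base could come later
        seen.add(obj_id)
    return flagged


if __name__ == "__main__":
    # "b" is a delta against "a", which is written right after it;
    # "c" is a delta against "x", which is not in the pack at all.
    pack = [("b", "a"), ("a", None), ("c", "x")]
    print(find_baseless_deltas(pack))  # ['c']
    print(streaming_guess(pack))       # ['b', 'c']
```

In this toy setup the streaming pass wrongly flags "b", whose base shows up one entry later; only the pass that has already seen the whole pack identifies "c" as the object that needs expanding or a base added.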