On Sat, 29 Jul 2006, Linus Torvalds wrote: > > It's also very dangerous. > > If you have partial packing (which you can get if you fetch data using > rsync or http, for example), not havign the "--full" means that > git-fsck-objects will report on objects being "unreachable" if they are > only reachable from another object that is packed. > > Now, in practice, if you only use the git native protocol, this should > never happen, and you're fine. Side note: in _practice_, it probably doesn't happen even with rsync and http, so in that sense, it's true that "--full" is almost always likely to just be a waste of time, and I can't come up with a schenario where you really need "--full" for pruning unless you did something strange. All the normal workflows means that if you have an object that is in a pack, everything it points to will _also_ be in a pack, and as such, "git prune" would never remove anything that wasn't safe to remove, even without the "--full". But just to get an example of how a _strange_ schenario could happen, let's say that - you're tracking a upstreams repository using rsync or http (ie you will get the objects in the same format that upstream tracks them, either as individual objects, or as "packs") - that upstreams repository does _incremental_ repacks every once in a while. - the last time you fetched was _just_ before upstream did an incremental pack, we call this "State A". As a result, you now have his old state A all as individual objects in your object database. - you fetch again, now after upstream has done _two_ incremntal packs (one to pack all the loose objects that you already had, and one to pack the new state). Upstream is now at "State B" As a result, you get all of his _new_ objects as one nice pack: you do not get his other pack, because you already have all _those_ objects (which are "state A") as individual objects. - so now, since you're only tracking the other ends state, and have no objects of your own (in particular, the last fetch/pull did _not_ generate a merge object of your own to connect the new pack with the old objects), what has happened is that all your heads point into the new incremental pack you just fetched, and that pack itself will have pointers to the individual objects that you fetched last time, because it was an incremental pack to "state A". - what happens now is that if you run "git-fsck-objects" without the "--full", it will claim that _all_ of your unpacked objects are unreachable, because they really are reachable only though that new pack. So in this (very very unusual) circumstance, "git prune" without the "--full" would literally prune away objects that you very much need. I hope this explains why that "unnecessary" (and admittedly much more expensive) --full is there. It really is unnecessary in practice: partly because Junio has made "git repack -a -d" so efficient that doing incremental packs isn't even worth it for most people, and partly because you probably use the native git protocol and repack yourself, and thus never use another persons pack directly (which also avoids this problem). But yeah, the olf "git prune" was really very expensive. It's much better in the current git branch, although it's still not _cheap_ (because it does do the whole reachability analysis, though all pack-files, because it wants to get the above special case right). If we really wanted to, we could add a "core.fullpacks" flag that you could set, and that would cause the non-native protocols to not work (or alternatively force a re-pack after they have fetched a pack), and that would disallow incremental repacking locally, and then we could optimize the hell out of "git prune" and say that it never needs to look at any reachability for an object that is already packed. That would make "git prune" basically instantaneous, the way "git fsck-objects" is by default. But to be safe, it really needs to have some per-repository flag that is honored by the other git commands. Linus - : send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html