On Sat, 26 May 2007, Dana How wrote:

> I think there are two interesting strategies compatible
> with maximally-informative timestamps:
>
> (1) git-repack -a -d repacks everything on each call. You would need:
> (1a) Rewrite builtin-pack-objects.c so only the object_ix hash
>      accesses the "objects" array directly, everything else
>      goes through a pointer table.
> (1b) Sort the new pointer table by object type, in order
>      tag -> commit -> tree -> nice blob -> naughty blob.
>      The sort is stable so the order within each group is unchanged.

This is not a good idea in general for runtime access to the pack.

If you consider a checkout, the commit object is looked up, then the
root tree object, then each tree entry is recursively looked up.  Right
now, the way the objects are laid out, the most recent commit will have
all its objects found contiguously in the pack and in the right order
(that means trees and blobs intermixed).  This gets less and less true
as you go back into history, but at least the recent stuff has a really
nice access pattern.

Because commit objects are so fundamental to many graph operations, they
are already all packed together.  But tree and blob objects are
intermixed for the reason stated above.

Naughty blobs are a really special category and I think they should be
treated as such.  Therefore I don't think the common/normal case should
be impacted by a generic change for something that is still a special
case.

In other words, I think a naughty blob could simply be recognized as
such and be referenced in a special list instead of being written out
initially.  Then, when everything else is believed to be written, the
special list can be walked to force-write those naughty blobs last.  No
need to modify the current object order.


Nicolas
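
A minimal sketch of the "special list" idea above, in C. Everything in it is
hypothetical and for illustration only: struct entry, its naughty flag, and
plan_write_order() are assumed names, not taken from builtin-pack-objects.c,
and the real code writes objects to the pack rather than filling an array.
It only shows that deferring naughty blobs leaves the order of all other
objects untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry in the packer's object list. */
enum obj_kind { OBJ_KIND_COMMIT, OBJ_KIND_TREE, OBJ_KIND_BLOB };

struct entry {
	const char *name;   /* label used only for illustration */
	enum obj_kind kind;
	int naughty;        /* assumed flag: nonzero means "write this blob last" */
};

/*
 * Compute the write order for n entries.
 *
 * Normal entries go to `out` in their original order, so the existing
 * layout (trees and blobs intermixed per commit) is preserved.  Naughty
 * blobs are only remembered in `deferred` on the first pass and appended
 * at the end, also in their original relative order.
 *
 * `out` and `deferred` must each have room for n pointers.
 * Returns the number of entries placed in `out` (always n).
 */
size_t plan_write_order(const struct entry *in, size_t n,
			const struct entry **out,
			const struct entry **deferred)
{
	size_t nr_out = 0, nr_deferred = 0, i;

	for (i = 0; i < n; i++) {
		if (in[i].kind == OBJ_KIND_BLOB && in[i].naughty)
			deferred[nr_deferred++] = &in[i]; /* skip for now */
		else
			out[nr_out++] = &in[i];           /* normal path */
	}

	/* Everything else is placed; now walk the special list. */
	for (i = 0; i < nr_deferred; i++)
		out[nr_out++] = deferred[i];

	assert(nr_out == n);
	return nr_out;
}
```

With an input of commit, tree, naughty blob, blob, tree, the resulting order
is commit, tree, blob, tree, naughty blob: only the naughty blob moves.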