Nicolas Pitre <nico@xxxxxxx> wrote:
> On Sat, 8 Sep 2007, Junio C Hamano wrote:
> > This actually was meant to be used to sort object entries from
> > multiple packs together.  The update to pack-objects you are
> > commenting on deals with one packfile at a time, but I think we
> > probably should collect from all packs and then sort (which was
> > how merge-pack used this function).
>
> I'm not sure sorting objects from multiple packs together like that is
> going to help deltification.  It is unlikely that related objects (e.g.
> objects having the same path) will be located at the same offset in
> different packs.

Yes.  But when you are merging several packfiles together and you
don't supply `--no-delta-reuse` then we're really just going to copy
the data from the sources to the output.  There is not a lot of
deltification to be performed; maybe only a handful of loose objects
will need to locate deltas.  So helping deltification is not really
of concern here.

What Junio is trying to do here is at least preserve their order
within the packfile as that should help to preserve their locality
of access.  Only I'm not sure that's the best merging strategy
available to us.  What about something like this:

  1) Read all packfile indexes, sort by offset.

  2) Locate first commit object within each packfile.

  3) Get that commit's commit date; if no commit is in the packfile
     at all use the modification date of the packfile.

  4) Sort the packfiles by their chosen date descending (more recent
     items are closer to the front of the list).

  5) Add objects:

	foreach type in commit tree blob tag
		foreach packfile in sorted_packs_from_4
			while current_object->type == $type
				if (current_object->flags & ADDED) == 0
					add current_object
				current_object++

This way data is still organized by the original order that rev-list
gave us when we created the small packfiles, but we also try to place
data from more recent packfiles into the front of the new packfile.
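
To make the five steps above concrete, here is a minimal Python sketch.
It is not the pack-objects code, only an illustration under some
assumptions: the names `Pack`, `PackObject`, `pack_date` and
`merge_order` are made up, commit dates are handed in through a dict
instead of being parsed from the commit object, and each pack is
filtered by type rather than walked with a cursor, so the sketch does
not depend on objects being grouped by type within a pack.

```python
from dataclasses import dataclass, field

# Type order taken from step 5 of the proposal.
TYPE_ORDER = ("commit", "tree", "blob", "tag")


@dataclass
class PackObject:          # hypothetical stand-in for an index entry
    sha: str
    type: str
    offset: int


@dataclass
class Pack:                # hypothetical stand-in for a packfile
    name: str
    mtime: int             # modification date of the packfile
    objects: list          # list of PackObject
    commit_dates: dict = field(default_factory=dict)  # sha -> commit date


def by_offset(pack):
    """Step 1: the pack's objects sorted by offset."""
    return sorted(pack.objects, key=lambda o: o.offset)


def pack_date(pack):
    """Steps 2 and 3: date of the first commit, else the pack's mtime."""
    for obj in by_offset(pack):
        if obj.type == "commit":
            return pack.commit_dates[obj.sha]
    return pack.mtime


def merge_order(packs):
    """Steps 4 and 5: most recent pack first, one object type at a time."""
    ordered_packs = sorted(packs, key=pack_date, reverse=True)
    added = set()          # plays the role of the ADDED flag
    result = []
    for wanted in TYPE_ORDER:
        for pack in ordered_packs:
            for obj in by_offset(pack):
                if obj.type == wanted and obj.sha not in added:
                    added.add(obj.sha)
                    result.append(obj.sha)
    return result
```

An object that appears in more than one pack is emitted only once, taken
from the most recent pack that contains it.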

This ordering is a rough approximation of what rev-list would have
given us for object ordering when it performed a traversal.  It's also
a whole lot cheaper than rev-list and lets us continue to include
unreachable objects, which was the point of this patch.

-- 
Shawn.