Shawn Pearce <spearce@xxxxxxxxxxx> writes: > Junio C Hamano <junkio@xxxxxxx> wrote: > >> The list is sorted by type then hash then size (type_size_sort), >> so if you have t/Makefile that are big medium small too-small >> and then doc/Makefile that are big medium, once you see the >> too-small t/Makefile it would not consider the big doc/Makefile >> as a candidate X-<. >> >> This _might_ improve things: >>... > > Holy cow, it did: > > Total 46391, written 46391 (delta 8391), reused 37774 (delta 0) > 46M pack-7f766f5af5547554bacb28c0294bd562589dc5e7.pack > > That's the smallest packing I've seen yet. And it doesn't have a > negative affect on repacking GIT either. Thanks for trying. Mind trying one more? I suspect the test patch makes pack-objects a lot more expensive. The code before the test patch said "if the size is very small or size difference is too great, do not consider this, and do not consider any more objects in the delta window, because we know they are either even smaller of the same path, they have different names, or they are of different type". The test patch you tried was a quick and dirty hack that said "under the too-small condition, skip this one, but keep trying the rest of the delta window". Here is a cleaned up patch. What it does is "under the too-small condition, see if the object has the same basename, and if so keep going, but otherwise skip the rest as before". If you have objects like this and are trying to pack the first object (this list is sorted in the order pack-object tries): (size) (path) 1000 t/0-11-AdjLite.deg 10 t/0-11-AdjLite.deg 800 s/0-11-AdjLite.deg 20 t/0-12-AdjLite.deg the current code stops after checking t/0-11-AdjLite.deg. The test patch tries all of them. This patch skips that file, but tries "s/0-11-AdjLite.deg", and then stops at the next one. -- >8 -- diff --git a/pack-objects.c b/pack-objects.c index 09f4f2c..2173709 100644 --- a/pack-objects.c +++ b/pack-objects.c @@ -1036,8 +1036,6 @@ static int try_delta(struct unpacked *cu oldsize = old_entry->size; sizediff = oldsize > size ? oldsize - size : size - oldsize; - if (size < 50) - return -1; if (old_entry->depth >= max_depth) return 0; @@ -1048,20 +1046,27 @@ static int try_delta(struct unpacked *cu * more space-efficient (deletes don't have to say _what_ they * delete). */ - max_size = size / 2 - 20; - if (cur_entry->delta) - max_size = cur_entry->delta_size-1; - if (sizediff >= max_size) - return -1; - delta_buf = diff_delta(old->data, oldsize, - cur->data, size, &delta_size, max_size); - if (!delta_buf) + if (50 <= size) { + max_size = size / 2 - 20; + if (cur_entry->delta) + max_size = cur_entry->delta_size-1; + if (sizediff < max_size) { + delta_buf = diff_delta(old->data, oldsize, + cur->data, size, + &delta_size, max_size); + if (!delta_buf) + return 0; + cur_entry->delta = old_entry; + cur_entry->delta_size = delta_size; + cur_entry->depth = old_entry->depth + 1; + free(delta_buf); + return 0; + } + } + /* Keep going as long as the basename matches */ + if (((cur_entry->hash ^ old_entry->hash) >>DIRBITS) == 0) return 0; - cur_entry->delta = old_entry; - cur_entry->delta_size = delta_size; - cur_entry->depth = old_entry->depth + 1; - free(delta_buf); - return 0; + return -1; } static void progress_interval(int signum) - : send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html