[PATCH] pack-objects: never deltify objects bigger than window_memory_limit.

With very large objects, just loading them into the delta window wastes a
huge amount of memory.  In one repo, I have some objects around 1GB in size,
and git-pack-objects seems to require about 8x that in order to deltify them,
even when the window memory limit is small (e.g. --window-memory=100M).  With
this patch, the maximum memory usage is roughly halved.

Perhaps more importantly, however, disabling deltification for large objects
seems to reduce memory thrashing when you can't fit multiple large objects
into physical RAM at once.  It seems to be the difference between "never
finishes" and "finishes eventually" for me.

Test:

I created a test repo with 10 sequential commits containing a bunch of
nearly-identical 110MB files (just appending a line each time).

Without this patch:

    $ /usr/bin/time git repack -a --window-memory=100M

    Counting objects: 43, done.
    warning: suboptimal pack - out of memory
    Compressing objects: 100% (43/43), done.
    Writing objects: 100% (43/43), done.
    Total 43 (delta 14), reused 0 (delta 0)
    42.79user 1.07system 0:44.53elapsed 98%CPU (0avgtext+0avgdata 866736maxresident)k
    0inputs+2752outputs (0major+718471minor)pagefaults 0swaps

With this patch:

    $ /usr/bin/time git repack -a --window-memory=100M

    Counting objects: 43, done.
    Compressing objects: 100% (30/30), done.
    Writing objects: 100% (43/43), done.
    Total 43 (delta 14), reused 0 (delta 0)
    35.86user 0.65system 0:36.30elapsed 100%CPU (0avgtext+0avgdata 437568maxresident)k
    0inputs+2768outputs (0major+366137minor)pagefaults 0swaps

It apparently still uses about 4x the memory of the largest object, which is
about twice as good as before, though still kind of awful.  (Ideally, we
wouldn't even load the entire large object into memory even once.)

Signed-off-by: Avery Pennarun <apenwarr@xxxxxxxxx>
---
 builtin/pack-objects.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0e81673..9f1a289 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1791,6 +1791,9 @@ static void prepare_pack(int window, int depth)
 		if (entry->size < 50)
 			continue;
 
+		if (window_memory_limit && entry->size > window_memory_limit)
+			continue;
+
 		if (entry->no_try_delta)
 			continue;
 
-- 
1.7.3.1.gca9d1
