Re: git pull is slow

Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
> On Sat, 12 Jul 2008, Stephan Hennig wrote:
> > 
> > Thanks for having a look at this!  What does "problem with the pack" 
> > mean?  Do you think it is a Git problem (client or server side?) or just 
> > a misconfiguration?
> 
> I thought that the blobs in the pack are just too similar.  That makes for 
> a good compression, since you get many relatively small deltas.  But it 
> also makes for a lot of work to reconstruct the blobs.
> 
> I suspected that you run out of space for the cache holding some 
> reconstructed blobs (to prevent reconstructing all of them from scratch).
...
> Whoa. As you can see, your puny little 3.3 megabyte pack is blown to a 
> full 555 megabyte in RAM.
...
> I expect this to touch the resolve_delta() function of index-pack.c in a 
> major way, though.

Yea, that's going to be ugly.  The "cache" you speak of above is held
on the call stack as resolve_delta() recurses through the delta chain
to reconstruct objects and generate their SHA-1s.  There isn't a way to
release these objects when memory gets low, so your worst case scenario
is a 100M+ blob with a delta chain of 50 or more.  Every level of the
recursion holds its own full copy, so that will take you 5G of memory
to pass through index-pack.

jgit isn't any better here.

What we need to do is maintain a list of the objects we are holding
on the call stack, and release the ones up near the top when memory
gets low.  Then upon recursing back up we can just recreate the
object if we had to throw it out.  The higher up on the stack the
object is, the less likely we are to need it in the near future.
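
To make the idea concrete, here is a rough Python model of it, not
index-pack code.  Everything in it is made up for illustration: the
Entry and Resolver classes, the toy "delta" that just appends bytes to
its base, and the byte limit.  It assumes "near the top" means the
outermost frames, the ones closest to the root of the delta chain.

```python
import hashlib

class Entry:
    """Toy pack entry: a root holding full data, or a delta on a parent."""
    def __init__(self, name, payload, parent=None):
        self.name, self.payload, self.parent = name, payload, parent
        self.children = []
        self.data = None                # reconstructed bytes, None if dropped
        if parent is not None:
            parent.children.append(self)

class Resolver:
    def __init__(self, limit):
        self.limit = limit              # soft cap on bytes held by frames
        self.stack = []                 # entries held by live frames, root first
        self.held = self.peak = self.rebuilds = 0
        self.digests = {}

    def _load(self, entry):
        # Reconstruct entry.data if it is missing, walking up as far as needed.
        if entry.data is None:
            if entry.parent is None:
                entry.data = entry.payload
            else:
                # Toy delta: "apply" it by appending the payload to the base.
                entry.data = self._load(entry.parent) + entry.payload
            self.held += len(entry.data)
            self.peak = max(self.peak, self.held)
        return entry.data

    def _prune(self, keep):
        # Drop buffers starting at the outermost frame, never the current one.
        for e in self.stack:
            if self.held <= self.limit:
                break
            if e is not keep and e.data is not None:
                self.held -= len(e.data)
                e.data = None

    def resolve(self, entry):
        self.stack.append(entry)
        # Toy object ID: SHA-1 of the raw data only.
        self.digests[entry.name] = hashlib.sha1(self._load(entry)).hexdigest()
        for child in entry.children:
            if entry.data is None:      # thrown out while we were deeper
                self.rebuilds += 1
                self._load(entry)
            self._prune(keep=entry)
            self.resolve(child)
        if entry.data is not None:      # frame is done, give the memory back
            self.held -= len(entry.data)
            entry.data = None
        self.stack.pop()

def build_pack(depth=9, size=1000):
    """Hypothetical pack: root -> d1 -> ... -> d9, plus a late sibling on d1."""
    root = Entry("root", b"r" * size)
    chain = [root]
    for i in range(1, depth + 1):
        chain.append(Entry("d%d" % i, b"x", chain[-1]))
    Entry("side", b"y", chain[1])       # second child of d1, visited last
    return root
```

Running Resolver(limit=3000).resolve(build_pack()) gives the same
digests as an unlimited run, with a much lower peak.  The price is that
d1 has to be recreated once, when the recursion comes back up to handle
its second child.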

The more that I think about this, the easier it sounds to implement.
I may try to look at it a little later this evening.
 
> P.S.: It seems that "git verify-pack -v" only shows the sizes of the 
> deltas.  Might be interesting to some to show the unpacked _full_ size, 
> too.

It wouldn't be very difficult to get that unpacked size.  We just have
to inflate enough of the delta to see the delta header and obtain the
inflated object size from that.  Unfortunately there is not an API in
sha1_file.c to offer that information to callers.
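
A rough sketch of what such a helper could look like, in Python rather
than C and not tied to any existing Git API.  It relies on the delta
data starting with two variable-length sizes (base size, then result
size), 7 bits per byte, least significant group first, with the high
bit meaning "more bytes follow".  The function names and the 32-byte
peek are my own choices, and the caller is assumed to have already
located the zlib stream of the delta inside the pack.

```python
import zlib

def read_size(buf, pos):
    """Decode one delta-header size starting at buf[pos]."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:             # high bit clear: last byte of the size
            return value, pos

def encode_size(n):
    """Inverse of read_size, only needed here to build test input."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def delta_sizes(compressed_delta, peek=32):
    """Return (base_size, result_size) without inflating the whole delta."""
    inflater = zlib.decompressobj()
    # max_length stops inflation after `peek` bytes of output.
    head = inflater.decompress(compressed_delta, peek)
    base_size, pos = read_size(head, 0)
    result_size, _ = read_size(head, pos)
    return base_size, result_size
```

For example, a fake delta built as zlib.compress(encode_size(1000) +
encode_size(100 * 1024 * 1024) + body) reports a result size of 100M
after inflating only the first 32 bytes, however large body is.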

-- 
Shawn.