Re: git pull & git gc

On Thu, Mar 19, 2015 at 12:14:53AM -0400, Jeff King wrote:
> On Thu, Mar 19, 2015 at 11:01:17AM +0900, Mike Hommey wrote:
> 
> > > I don't think packing the unreachables is a good plan. They just end up
> > > accumulating then, and they never expire, because we keep refreshing
> > > their mtime at each pack (unless you pack them once and then leave them
> > > to expire, but then you end up with a large number of packs).
> > 
> > Note, sometimes I wish unreachables were packed. Recently, I ended up in
> > a situation where running gc created something like 3GB of data as per
> > du, because I suddenly had something like 600K unreachable objects, each
> > of them, as a loose object, taking at least 4K on disk. This made my
> > .git take 5GB instead of 2GB. That surely didn't feel like garbage
> > collection.
> 
> That's definitely a thing that happens, but it is a bit of a corner
> case. It's unusual to have such a large number of unreferenced objects
> all at once.
> 
> I don't suppose you happen to remember the details, but would a lower
> expiration time (e.g., 1 day or 1 hour) have made all of those objects
> go away? Or were they really from some extremely recent event (of
> course, "event" here might just have been "I did a full repack right
> before rewriting history" which would freshen the mtimes on everything
> in the pack).

Unfortunately, I don't know the exact details. But yes, I guess a lower
expiration time might have helped.
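For reference, a minimal sketch of what a one-off run with a shorter expiration could look like. The helper name `gc_prune_command` and the `1.hour.ago` default are only illustrative assumptions; `git gc --prune=<date>` is the existing option, and `gc.pruneExpire` is the corresponding config setting. The function only builds the command line, so nothing is pruned until it is actually run.

```python
import subprocess


def gc_prune_command(repo_path, expire="1.hour.ago"):
    """Build a `git gc` command line that prunes unreachable loose
    objects older than `expire` (hypothetical helper, for illustration)."""
    return ["git", "-C", repo_path, "gc", f"--prune={expire}"]


def run_gc_prune(repo_path, expire="1.hour.ago"):
    """Run the command built above; raises CalledProcessError on failure."""
    subprocess.run(gc_prune_command(repo_path, expire), check=True)
```

Whether this would have removed the objects in my case depends on how recent their mtimes were, which I can't verify anymore.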

> Certainly the "loosening" behavior for unreachable objects has corner
> cases like this, and they suck when you hit one. Leaving the objects
> packed would be better, but IMHO is not a viable alternative unless
> somebody comes up with a plan for segregating the "old" objects in a way
> that they actually expire eventually, and don't just keep getting
> repacked and freshened over and over.

It sure is a corner case. OTOH, when it happens, every single git
operation calls git gc --auto, which happily spends 5 minutes burning
CPU only to end up doing nothing in practice. That adds salt to the
wound if you are on battery.

The threshold of 6700 loose objects (the gc.auto default) seems easy to
reach on a repo with 6M objects...
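To make the threshold concrete, here is a rough sketch of how I understand the gc --auto check to work: rather than counting every loose object, it samples a single fan-out directory (objects/17) and compares that count against the threshold divided by 256, rounded up. This is based on my reading of the gc code, not a statement of its exact behavior; the function name `too_many_loose_objects` is borrowed for illustration, and the 38-character name check assumes SHA-1 object names.

```python
import os
import re

# A loose object file name is the object id minus its 2-character
# directory prefix: 38 hex characters for SHA-1 (assumed here).
_LOOSE_NAME = re.compile(r"^[0-9a-f]{38}$")


def too_many_loose_objects(git_dir, gc_auto=6700):
    """Estimate whether the loose object count exceeds `gc_auto` by
    sampling only the objects/17 fan-out directory."""
    if gc_auto <= 0:
        return False  # a non-positive threshold disables the check
    bucket = os.path.join(git_dir, "objects", "17")
    try:
        names = os.listdir(bucket)
    except FileNotFoundError:
        return False  # no such bucket, so nothing to count
    count = sum(1 for name in names if _LOOSE_NAME.match(name))
    # Objects spread roughly evenly over 256 buckets, so compare one
    # bucket against ceil(gc_auto / 256).
    per_bucket_limit = -(-gc_auto // 256)
    return count > per_bucket_limit
```

With the default of 6700, the per-bucket limit is 27, so 28 loose objects in objects/17 are enough to trigger the check. If those objects are unreachable but too recent to prune, gc cannot remove them, and the check fires again on the next operation.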

Mike