Re: People unaware of the importance of "git gc"?

On Thu, 6 Sep 2007, Junio C Hamano wrote:
> 
> I thought the whole point of "gc --auto" was to have something
> that does not lose/prune any objects, even the ones that do not
> seem to be referenced from anywhere.  That is why invocations of
> "git gc --auto" do not say --prune as you saw in the second patch,
> and the repack command "gc --auto" runs is "repack -d -l"
> instead of "repack -a -d -l", which means that it does run
> git-prune-packed after repacking but not git-prune.

I think "repack -d -l" should be ok from a safety perspective, but I'd 
also like to say that always running it incrementally is going to largely 
suck after a time.

IOW, if you get lots of small incremental packs, after a while you really 
*do* need to do "git gc" to get the real pack generated.

In the case I saw, James really had hundreds of pack-files. That makes all 
our object lookups suck. Yes, not having loose objects at all is a big 
deal too, and yes, we try to start from the last pack-file we found (for 
the locality that we hope is there), but it's still pretty bad from a 
cache usage standpoint, and when we create a new object, we'll first 
search (in vain) in all the hundreds of pack-files.
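
To check whether a repository is in that state, one rough approach is to
count the pack-files it holds. The sketch below is only an illustration: it
assumes the usual layout where packs live in "objects/pack" under the git
directory, and the helper names (count_pack_files, packs_in_repo) are made
up for this example and are not part of git.

```python
from pathlib import Path


def count_pack_files(filenames):
    """Count names ending in '.pack' (each pack also has a '.idx' companion)."""
    return sum(1 for name in filenames if name.endswith(".pack"))


def packs_in_repo(git_dir=".git"):
    """Return the number of pack-files under <git_dir>/objects/pack.

    Assumes the conventional layout; returns 0 if the directory is missing.
    """
    pack_dir = Path(git_dir) / "objects" / "pack"
    if not pack_dir.is_dir():
        return 0
    return count_pack_files(entry.name for entry in pack_dir.iterdir())


if __name__ == "__main__":
    # Run from the top of a working tree to see how many packs it has.
    print(packs_in_repo(".git"))
```

A count in the hundreds would match the situation described above.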

So would "git gc --auto" have helped James? I'm sure it would have. But he 
already had lots of pack-files from doing "git fetch/pull", and while 
doing the "git gc --auto" will likely *delay* the point where you need to 
do a full repack, it doesn't make it go away.

We still need to tell people to do a full git gc at some point, or do it 
for them. And the longer you delay doing it, the more expensive it's going 
to get to do and/or the worse the final packing is going to be (especially 
if it ends up reusing non-optimal packing decisions from the smaller 
packs).
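
"Doing it for them" could be sketched as a small wrapper that picks
between the incremental and the full gc based on how many packs have
piled up. This is a hypothetical helper, not something git ships: the name
choose_gc_command and the threshold of 50 are assumptions chosen purely
for illustration, and the right cutoff would need measuring.

```python
def choose_gc_command(pack_count, full_repack_threshold=50):
    """Pick a gc invocation from the current number of pack-files.

    full_repack_threshold is an arbitrary illustrative cutoff, not a
    value taken from git itself.
    """
    if pack_count < 0:
        raise ValueError("pack_count must be non-negative")
    if pack_count >= full_repack_threshold:
        # Too many small packs: do the real, full "git gc".
        return ["git", "gc"]
    # Otherwise the cheap incremental pass is enough for now.
    return ["git", "gc", "--auto"]


if __name__ == "__main__":
    print(choose_gc_command(300))  # a repository with hundreds of packs
    print(choose_gc_command(3))    # a freshly repacked repository
```

The returned list can be handed to subprocess.run(cmd, check=True) from
inside the repository; the sketch itself only decides which command to use.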

So I think the --auto stuff is still worth it, but it's really just 
pushing the pain somewhat further out.

(In the kernel community, if you fetch my tree daily, you really *are* 
going to have hundreds and hundreds of packfiles just from doing that).

So I'd really like us to also remind people to do a *real* and full "git 
gc", not just the incremental ones.

		Linus