Re: People unaware of the importance of "git gc"?

Govind Salinas wrote:
This is one reason why I really think that gc should be *plumbing*
and *not* porcelain.

That's a good way to think of it IMO. It's a low-level operation (albeit one that encapsulates other, lower-level ones) that tells git to rearrange its internal data structures. It has no user-visible effect, whereas every other porcelain-level git command *does something* from the user's point of view. From that point of view, running git-gc is basically a no-op: a waste of keystrokes and an annoying distraction from the stuff they're using git to help them build.

The user should never have to trigger a gc, they should even be
discouraged from doing so.  That is how other gc systems are.  Can you
imagine if you had a Java app that had a button on it to do a gc?
When should I push it?  Should I wait till the system is getting slow
or just start spamming the button whenever I'm bored?  I know that
Java/c#/py GC are different than git gc, but they fulfill the same
basic purpose as git gc.  IE to clean up unused items and free up
resources.  Git additionally may do some re-optimization, but that is
not relevant to a user.

I'll play devil's advocate for a moment here, though, and say that, as others have suggested in this thread, git could be made to tell you when it's appropriate to run gc. So the "I don't know when to run it" argument isn't a hard one to address.

With that in mind, here's what the message should look like IMO:

---
Your repository can be optimized for better performance and lower disk usage. Please run "git gc" to optimize it now, or run "git config gc.auto true" to tell git to automatically optimize it in the future (this will launch processes in the background). For more information, see "man git-gc".
---

And that "gc.auto" config option (just an arbitrary name, call it something else if that's no good) actually has four settings:

warn (the default) - prints the warning message, at most once every N minutes (we can determine a good value for N)
true - launches git-gc in the background as needed
false - suppresses the warning and the check that triggers the warning
foreground - launches git-gc in the foreground as needed (to make it easier to abort)
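
To make the four settings concrete, here is a rough sketch of how the dispatch might look, written in Python purely for brevity. Everything in it is hypothetical: the function name, the `Action` values, the `needs_gc` flag (standing in for whatever check decides the repository would benefit from a gc), and the 60-minute default for N are illustrative assumptions, not existing git behavior.

```python
from enum import Enum


class Action(Enum):
    """What the calling command would do next (hypothetical names)."""
    NOTHING = "nothing"
    WARN = "warn"
    GC_BACKGROUND = "gc-background"
    GC_FOREGROUND = "gc-foreground"


def decide(setting, needs_gc, last_warned, now, warn_interval_min=60):
    """Map the proposed gc.auto setting to an action.

    setting           -- "warn", "true", "false" or "foreground"
    needs_gc          -- result of the (unspecified) "is gc worthwhile?" check
    last_warned       -- timestamp in seconds of the last warning, or None
    now               -- current timestamp in seconds
    warn_interval_min -- the N from the proposal; 60 is an arbitrary guess
    """
    if setting not in ("warn", "true", "false", "foreground"):
        raise ValueError(f"unknown gc.auto setting: {setting!r}")
    if setting == "false":
        # A real implementation would skip the check entirely here;
        # in this sketch we simply ignore its result.
        return Action.NOTHING
    if not needs_gc:
        return Action.NOTHING
    if setting == "true":
        return Action.GC_BACKGROUND
    if setting == "foreground":
        return Action.GC_FOREGROUND
    # setting == "warn": print the message at most once every N minutes.
    if last_warned is None or now - last_warned >= warn_interval_min * 60:
        return Action.WARN
    return Action.NOTHING
```

The caller would be responsible for remembering when it last warned, for example by recording the timestamp after `decide` returns `Action.WARN`.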


I don't buy the "git gc takes too much memory to run in the background" argument as a reason automatic git-gc is a bad idea. Many of us (me included) work on machines with plenty of memory to launch a background git-gc without hampering our development work, and/or on repositories small enough that it doesn't eat that much memory in the first place. And if you make it an option that the user has to enable, people on low-memory machines can simply not enable it, end of problem.

One big problem with git-gc now is that it's not discoverable. Or rather, the need for it isn't discoverable. So at the very least we should print the warning, IMO -- and if we're already going to all the trouble to determine whether or not git-gc needs to be run, it will reduce the "why are you telling me to run something when you could just do it for me, you stupid machine?" factor if there's an easily discoverable way to just do it as needed.

-Steve