Re: [PATCH] gc: do not warn about too many loose objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ævar,

Ævar Arnfjörð Bjarmason wrote:
> On Tue, Jul 17 2018, Jonathan Nieder wrote:

>> That suggests a possible improvement.  If all callers should be
>> ignoring the exit code, can we make the exit code in daemonize mode
>> unconditionally zero in this kind of case?
>
> That doesn't make sense to me. Just because git itself is happily
> ignoring the exit code I don't think that should mean there shouldn't be
> a meaningful exit code. Why don't you just ignore the exit code in the
> repo tool as well?

I don't maintain the repo tool.  I am just trying to improve git to
make some users' lives better.

repo worked fine for years, until gc's daemonized mode broke it.  I
don't think it makes sense for us to declare that it has been holding
git gc wrong for all that time before then.

> Now e.g. automation I have to see if git-gc ---auto is having issues
> can't just be 'git gc --auto || echo fail @ {hostname}' across a fleet
> of servers, but will need to start caring if stderr was emitted to.

Thanks for bringing this up.  The thing is, the current exit code is
not useful for that purpose.  It doesn't report a failure until the
*next* `git gc --auto` run, when it is too late to be useful.

See the commit message on the proposed patch
https://public-inbox.org/git/20180717065740.GD177907@xxxxxxxxxxxxxxxxxxxxxxxxx/
for more on that subject.

> I don't care if we e.g. have a 'git gc --auto --exit-code' similar to
> what git-diff does, but it doesn't make sense to me that we *know* we
> can't background the gc due to a previous error and then always return
> 0. Having to parse STDERR to see if it *really* failed is un-unixy,
> let's use exit codes. That's what they're for.

Do you know of anyone trying to use the exit code from gc --auto in
this way?

It sounds to me like for what you're proposing, a separate

	git gc --auto --last-run-failed

command that a tool could use to check the outcome from the previous
gc --auto run without triggering a new one would be best.

[...]
> I think you're conflating two things here in a way that (if I'm reading
> this correctly) produces a pretty bad regression for users.

The patch doesn't touch the "ratelimiting" behavior at all, so I'm not
sure what I'm doing to regress users.

[...]
>> To solve (3), we could
>> introduce a gc.lastrun file that is touched whenever "gc --auto" runs
>> successfully and make "gc --auto" a no-op for a while after each run.
>
> This would work around my concern with b) above in most cases by
> introducing a purely time-based rate limit, but I'm uneasy about this
> change in git-gc semantics.
[..]
> With these proposed semantics we'd skip a needed GC (potentially for
> days, depending on the default) just because we recently ran one.
>
> In many environments, such as on busy servers, we could have tens of
> thousands of packs by this point, since this facility would (presumably)
> bypass both gc.autoPackLimit and gc.auto in favor of only running gc at
> a maximum of every N minutes, similarly in a local checkout I could have
> a crapload of loose objects because I ran a big rebase or a
> filter-branch on one of my branches.

I think we all agree that getting rid of the hack that 'explodes'
unreachable objects into loose objects is the best way forward, long
term.

Even in that future, some ratelimiting may be useful, though.  I also
suspect that we're never going to achieve a perfect set of defaults
that works well for both humans and servers, though we can try.

Thanks and hope that helps,
Jonathan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux