Re: RFC: error codes on exit

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 20 May 2021 09:49:20 +0900

Jonathan Nieder <jrnieder@xxxxxxxxx> writes:

> One kind of signal we haven't been able to make good use of is error
> rates.  The problem is that a die() call can be an indication of
>
>  a. the user asked to do something that isn't sensible, and we kindly
>     rebuked the user
> ...
>  e. we encountered an internal error in handling the user's
>     legitimate request
>
> and these different cases do not all motivate the same response.
> ...
> In order to do this, I would like to annotate "exit" events with a
> classification of the error.

We already have BUG() for e. and die() for everything else, and
"everything else" may be overly broad for your purpose.

I am sympathetic to the cause and I agree that introducing a
finer-grained classification might be a solution.  I am not sure,
however, how we can get developers to apply such a manually assigned
"error code" consistently.

Just to throw in a totally different alternative to see if it works
better, I wonder if you can teach die() to report to the trace2
stream where in the code it was called from and which vintage of Git
it is running.
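
A rough sketch of the call-site capture, not actual Git code:
format_die_site() and DIE_SITE() are hypothetical names, and the
version string is passed in by the caller here instead of coming from
Git's build.  How the resulting tag would be written to the trace2
stream is left out on purpose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Git: format a call-site tag of the
 * form "<filename>:<lineno>@<version>" into buf.  Returns what
 * snprintf() returns, i.e. the length the full tag needs.
 */
static int format_die_site(char *buf, size_t len, const char *file,
			   int line, const char *version)
{
	assert(buf && file && version);
	return snprintf(buf, len, "%s:%d@%s", file, line, version);
}

/*
 * Hypothetical macro: __FILE__ and __LINE__ expand where the macro is
 * used, so the tag names the caller and not this helper.
 */
#define DIE_SITE(buf, len, version) \
	format_die_site((buf), (len), __FILE__, __LINE__, (version))
```

A die()-like wrapper could call DIE_SITE() on a local buffer and hand
the result to whatever emits the "exit" event, so individual callers
would not need to change.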

The stat collection side that cares about a certain class of failures
can have a function that maps "die() at <filename>:<lineno>@<version>"
to "what kind of die() it is".

E.g. "blame.c:50@v2.32.0-rc0-184-gbbde7e6616" may be BUG(), while
"blame.c:2740@v2.32.0-rc0-184-gbbde7e6616" may be a user error.

That way, our developers do not have to do anything special and
cannot do anything to screw up the classification.
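
On the collection side, the mapping could be as simple as a lookup
table.  The sketch below is only an illustration: the enum, the table,
and classify_die_site() are hypothetical names, and the two entries
repeat the made-up examples above rather than any real classification
of those lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used by the stat collector. */
enum die_kind { DIE_UNKNOWN, DIE_BUG, DIE_USER_ERROR };

struct die_site_entry {
	const char *site;	/* "<filename>:<lineno>@<version>" */
	enum die_kind kind;
};

/* Illustrative entries only; maintained by whoever collects the stats. */
static const struct die_site_entry die_site_table[] = {
	{ "blame.c:50@v2.32.0-rc0-184-gbbde7e6616", DIE_BUG },
	{ "blame.c:2740@v2.32.0-rc0-184-gbbde7e6616", DIE_USER_ERROR },
};

/* Linear scan over the table; unlisted sites come back as DIE_UNKNOWN. */
static enum die_kind classify_die_site(const char *site)
{
	size_t i;

	assert(site);
	for (i = 0; i < sizeof(die_site_table) / sizeof(die_site_table[0]); i++)
		if (!strcmp(die_site_table[i].site, site))
			return die_site_table[i].kind;
	return DIE_UNKNOWN;
}
```

Because the version is part of the key, a line number that moves
between releases simply shows up as DIE_UNKNOWN until the table is
updated for that release.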