Re: Proposal/Discussion: Turning parts of Git into libraries

demerphq <demerphq@xxxxxxxxx> · Sat, 18 Feb 2023 02:59:51 +0100

On Sat, 18 Feb 2023 at 00:24, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Emily Shaffer <nasamuffin@xxxxxxxxxx> writes:
>
> > Basically, if this effort turns out not to be fruitful as a whole, I'd
> > like for us to still have left a positive impact on the codebase.
> > ...
> > So what's next? Naturally, I'm looking forward to a spirited
> > discussion about this topic - I'd like to know which concerns haven't
> > been addressed and figure out whether we can find a way around them,
> > and generally build awareness of this effort with the community.
>
> On of the gravest concerns is that the devil is in the details.
>
> For example, "die() is inconvenient to callers, let's propagate
> errors up the callchain" is an easy thing to say, but it would take
> much more than "let's propagate errors up" to libify something like
> check_connected() to do the same thing without spawning a separate
> process that is expected to exit with failure.

What does "propagate errors up the callchain" mean?  One
interpretation I can think of seems quite horrible, but another seems
quite doable and reasonable and likely not even very invasive of the
existing code:

You can use setjmp/longjmp to implement a form of "try", so that
errors dont have to be *explicitly* returned *in* the call chain. And
you could probably do so without changing very much of the existing
code at all, and maintain a high level of conceptual alignment with
the current code strategy.

To do this you need to set up a globally available linked list of
jmp_env data (see `man setjmp` for jmp_env), and a global error
object, and make the existing "die" functions populate the global
error object, and then pop the most recent jmp_env data and longjmp to
it.

At the top of any git invocation you would set up the topmost jmp_env
"frame". Any code that wants to "try" existing logic pushes a new
jmp_env (using a wrapper around setjmp), and prepares to be longjmp'ed
to. If the code does not die then it pops the jmp_env it just pushed
and returns as normal, if it is longjmp'ed to you can detect this and
do some other behavior to handle the exception (by reading the global
error object). If the code that died *really* wants to exit, then it
returns the appropriate code as part of the longjmp, and the try
handler longjmps again propagating up the chain. Eventually you either
have an error that "propagates to the top" which results in an exit
with an appropriate error message, or you have an error that is
trapped and the code does something else, and then eventually returns
normally.

FWIW, this is essentially a loose description of how Perl handles the
execution part of "eval" and supports exception handling internally.
Most of the perl internals do not know anything about exceptions, they
just call functions similar to gits die functions if they need to,
which then call into Perl_die_unwind(). which then calls the
JUMPENV_JUMP() macro which does the "pop and longjmp" dance.

Seems to me that it wouldn't be very difficult nor particularly
invasive to implement this in git. Much of the logic in the perl
project to do this is at the top of cop.h,  see the macros
JMPENV_PUSH(), JMPENV_POP(), JMPENV_JUMP(). Obviously this code
contains a bunch of perl specific logic, but the general gist of it
should be easily understood and easily converted to a more git like
context:

struct jmpenv: https://github.com/Perl/perl5/blob/blead/cop.h#L32
JMPENV_BOOTSTRAP: https://github.com/Perl/perl5/blob/blead/cop.h#L66
JMPENV_PUSH: https://github.com/Perl/perl5/blob/blead/cop.h#L113
JMPENV_POP: https://github.com/Perl/perl5/blob/blead/cop.h#L147
JMPENV_JUMP: https://github.com/Perl/perl5/blob/blead/cop.h#L159

Perl_die_unwind: https://github.com/Perl/perl5/blob/blead/pp_ctl.c#L1741
Where Perl_die_unwind() calls JMPENV_JUMP:
https://github.com/Perl/perl5/blob/blead/pp_ctl.c#L1865

You can also grep for functions of the form S_try_ in the perl code
base to find examples where the C code explicitly sets up an "eval
frame" to interoperate with the functionality above.

git grep -nP '^S_try_'
pp_ctl.c:3548:S_try_yyparse(pTHX_ int gramtype, OP *caller_op)
pp_ctl.c:3604:S_try_run_unitcheck(pTHX_ OP* caller_op)
pp_sys.c:3120:S_try_amagic_ftest(pTHX_ char chr) {

Seems to me that this gives enough prior art to convert git to use the
same strategy, and that doing so would not actually be that big a
change to the existing code.  Both environments are fairly similar if
you look at them from the right perspective. Both are C, and both have
a lot of global state, and both have lots of functions which you
really dont want to have to change to understand about exception
objects..

Here is an example of how a C function might be written to use this
kind of infrastructure to "try" functionality that might call die. In
this case there is no need for the code to inspect the global error
object, but the basic pattern is consistent. The "default" case below
handles the situation where the "tried" function is signalling an
"untrappable error" that needs to be rethrown to ultimately unwind the
entire try/catch chain and exit the program. It is derived and
simplified from S_try_yyparse mentioned above. This function handles
the "compile the code" part of an `eval EXPR`, and traps exceptions
from the parser so that they can be handled properly and distinctly
from errors trapped during execution of the compiled code. [ I am
assuming that given the historical relationship between git and perl
these concepts are not alien to everybody on this list. ]

/* S_try_yyparse():
 *
 * Run yyparse() in a setjmp wrapper. Returns:
 *   0: yyparse() successful
 *   1: yyparse() failed
 *   3: yyparse() died
 *
 * ...
 */
STATIC int
S_try_yyparse(pTHX_ int gramtype, ...)
{
    dJMPENV;

    JMPENV_PUSH(ret);
    switch (ret) {
    case 0:
        ret = yyparse(gramtype) ? 1 : 0;
        break;
    case 3:
        /* yyparse() died and we trapped the error. */
        ....
        break;
    default:
        JMPENV_POP;          /* remove our own setjmp data */
        JMPENV_JUMP(ret); /* RETHROW */
    }
    JMPENV_POP;
    return ret;
}

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"