On Tue, Oct 05 2021, Mike Hommey wrote: > Hi, > > Sorry for the delay, I managed to miss this reply. > > On Tue, Sep 28, 2021 at 04:56:37AM +0200, Ævar Arnfjörð Bjarmason wrote: >> >> On Tue, Sep 28 2021, Mike Hommey wrote: >> >> > When a remote-helper fails in a way that is not directly visible in the >> > remote-helper protocol, the helper failure is ignored by git during >> > fetch or clone. >> > >> > For example, a helper cannot directly report an error during an `import` >> > command (short of sending `feature done` to the fast-import file >> > descriptor and not sending a `done` later on). >> > >> > Or if the helper crashes at the wrong moment, git doesn't notice and >> > thinks everything went well. >> > >> > Signed-off-by: Mike Hommey <mh@xxxxxxxxxxxx> >> > --- >> > builtin/clone.c | 5 +++-- >> > builtin/fetch.c | 6 +++--- >> > 2 files changed, 6 insertions(+), 5 deletions(-) >> > >> > What I'm not sure about is whether a message should be explicitly >> > printed by git itself in those cases. >> > >> > diff --git a/builtin/clone.c b/builtin/clone.c >> > index 66fe66679c..f26fa027c5 100644 >> > --- a/builtin/clone.c >> > +++ b/builtin/clone.c >> > @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix) >> > submodule_progress = transport->progress; >> > >> > transport_unlock_pack(transport); >> > - transport_disconnect(transport); >> > + err = transport_disconnect(transport); >> > >> > if (option_dissociate) { >> > close_object_store(the_repository->objects); >> > @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix) >> > } >> > >> > junk_mode = JUNK_LEAVE_REPO; >> > - err = checkout(submodule_progress); >> > + if (!err) >> > + err = checkout(submodule_progress); >> > >> > free(remote_name); >> > strbuf_release(&reflog_msg); >> >> This seems buggy in some cases, e.g. just because we couldn't close() >> some final socket we should just not run the checkout at all? Shouldn't >> we note the disconnect status, but still see if we can do the checkout >> etc? > > I guess it could be seen both ways. In my particular use case, I do want > automated testing to be able to catch Leak Sanitizer failures from a > remote-helper. I'd like that too, but I'm pointing out something unrelated (or at least I think I am), which is that this is changing the equivalent of this code: int fd = open(...); close(fd); do_stuff(); To be: int err; int fd = open(...); /* check err here too, but whatever, pseuodocode() */ if (!close(fd)) do_stuff(); But just with sockets, i.e. just because transport_disconnect() had *some* error are we really confident that checkout() should not be run? *Maybe*, but at the very least shouldn't we distinguish some of those error states? Isn't any serious protocol error going to be caught earlier transport_disconnect() is just going to give us the equivalent of not being able to cleanly close the socket? Except as noted in my second comment it doesn't, since we ignore disconnect_git(). At this point I can't recall what we do check, and I intentionally haven't re-looked, but that's also a suggestion for your commit message :) I.e. what error states are going to be different now, when don't we proceed because of a disconnect_git() failure? >> > diff --git a/builtin/fetch.c b/builtin/fetch.c >> > index 25740c13df..66bccf6f50 100644 >> > --- a/builtin/fetch.c >> > +++ b/builtin/fetch.c >> > @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, >> > { >> > struct refspec rs = REFSPEC_INIT_FETCH; >> > int i; >> > - int exit_code; >> > + int exit_code, disconnect_code; >> > int maybe_prune_tags; >> > int remote_via_config = remote_is_configured(remote, 0); >> > >> >> >> > @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv, >> > exit_code = do_fetch(gtransport, &rs); >> > sigchain_pop(SIGPIPE); >> > refspec_clear(&rs); >> > - transport_disconnect(gtransport); >> > + disconnect_code = transport_disconnect(gtransport); >> > gtransport = NULL; >> > - return exit_code; >> > + return exit_code || disconnect_code; >> > } >> > >> > int cmd_fetch(int argc, const char **argv, const char *prefix) >> >> This seems like it really needs fixes in other areas, >> i.e. disconnect_git() returns 0 unconditionally, but should check at >> least finish_connect(), no? >> >> Also once that's done you'll have a logic error here where you're >> conflating exit and error codes, we should not return -1 from main() (we >> do in a few cases, but it's nasty)> > > transport_disconnect returns the remote-helper exit code for external > helpers, though, but I guess we don't necessarily need to return that > particular exit code. Right, some of these functions are quite naughty about that, and as noted we get it wrong in a lot of existing cases. But it is unportable (non-standard C, but POSIX plasters over it), i.e. a return of -1 from main() is interpreted as a wait() status in POSIX, and results in a return of -1 & 0xff == 255. But for new code let's try to get it right, which in this case is going to be transport_disconnect() being being a syscall error code, and the "exit code" being a return code, which should not be mixed, no?