On Sun, Apr 07, 2013 at 01:45:06AM -0600, Felipe Contreras wrote: > diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh > index f387027..efe67e2 100755 > --- a/t/t5801-remote-helpers.sh > +++ b/t/t5801-remote-helpers.sh > @@ -166,4 +166,23 @@ test_expect_success 'push ref with existing object' ' > compare_refs local dup server dup > ' > > +test_expect_success 'proper failure checks for fetching' ' > + (GIT_REMOTE_TESTGIT_FAILURE=1 && > + export GIT_REMOTE_TESTGIT_FAILURE && > + cd local && > + test_must_fail git fetch 2> error && > + grep "Error while running helper" error > + ) > +' Hmm. If you drop the final "grep" (which is looking for the error message you add elsewhere in this patch), this test passes even without the addition of the check_command calls added by your patch. Which made me wonder if we should be checking something else (repo state, error messages, etc), since we seem to otherwise be detecting the error. More analysis below on exactly what is going on there. > +# We sleep to give fast-export a chance to catch the SIGPIPE I'm not sure what this means; I don't see a sleep anywhere. > +test_expect_failure 'proper failure checks for pushing' ' > + (GIT_REMOTE_TESTGIT_FAILURE=1 && > + export GIT_REMOTE_TESTGIT_FAILURE && > + cd local && > + test_must_fail git push --all 2> error && > + grep "Error while running helper" error > + ) > +' I wondered why this one is marked for failure. Even without check_command, the push produces: error: fast-export died of signal 13 fatal: Error while running fast-export which explains why the test fails: it does not even make it to the check_command call at all. Why do we need check_command here, then? Is it racy/non-deterministic for fast-export to die due to the pipe? > diff --git a/transport-helper.c b/transport-helper.c > index cb3ef7d..dfdfa7a 100644 > --- a/transport-helper.c > +++ b/transport-helper.c > @@ -460,6 +460,10 @@ static int fetch_with_import(struct transport *transport, > > if (finish_command(&fastimport)) > die("Error while running fast-import"); > + > + if (!check_command(data->helper)) > + die("Error while running helper"); > + > argv_array_free_detached(fastimport.argv); I'm still not very excited about this approach, as it is racy to detect errors. E.g., there is nothing to say that the helper does not close the pipe to fast-import prematurely and then die afterwards, leaving the refs unwritten, fast-import happy, but a failed exit code from the helper. The import test still passes even without the check_command part of your patch because some of the refs in refs/testgit do not exist prior to the failed fetch, and therefore also do not exist afterwards. When fetch tries to feed their sha1s to check_everything_connected, you get the funny: fatal: bad object 0000000000000000000000000000000000000000 error: testgit::/home/peff/compile/git/t/trash directory.t5801-remote-helpers/server did not send all necessary objects But if we change the test like this: diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh index efe67e2..e968ea9 100755 --- a/t/t5801-remote-helpers.sh +++ b/t/t5801-remote-helpers.sh @@ -167,11 +167,11 @@ test_expect_success 'proper failure checks for fetching' ' ' test_expect_success 'proper failure checks for fetching' ' - (GIT_REMOTE_TESTGIT_FAILURE=1 && + (cd local && + git fetch && + GIT_REMOTE_TESTGIT_FAILURE=1 && export GIT_REMOTE_TESTGIT_FAILURE && - cd local && - test_must_fail git fetch 2> error && - grep "Error while running helper" error + test_must_fail git fetch 2> error ) ' we can see that without your check_command, the failure in the second fetch is not noticed. Adding in your patch does detect this. _But_ it is only checking a specific failure mode of the remote-helper: process death that results in closing the fast-import pipe, which is how your GIT_REMOTE_TESTGIT_FAILURE is implemented (closing the pipe first and then dying is racy). If we then add this on top of your series (fixing the "status" bug in patch 1 that Junio mentioned), the test will start failing (both the original, and the more robust one I showed above): diff --git a/git-remote-testgit b/git-remote-testgit index ca0cf09..41b0780 100755 --- a/git-remote-testgit +++ b/git-remote-testgit @@ -64,6 +64,11 @@ do if test -n "$GIT_REMOTE_TESTGIT_FAILURE" then + # close stdout + exec >/dev/null + # now sleep to make sure fast-import + # has time to die before we exit + sleep 1 exit -1 fi I agree that the failure mode from your patch is probably the most common order for helpers to fail in (i.e., they just die and that's what kills the pipe). But I wonder if we can just cover all cases. Something like: diff --git a/transport-helper.c b/transport-helper.c index dfdfa7a..8562df0 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -461,7 +461,31 @@ static int fetch_with_import(struct transport *transport, if (finish_command(&fastimport)) die("Error while running fast-import"); - if (!check_command(data->helper)) + /* + * We must disconnect from the helper at this point, because even + * though fast-import may have succeeded, it may only be because the + * helper was not able to feed fast-import all of the data, and what + * fast-import got looked OK (e.g., it may have got nothing if the + * helper died early). We still need to check the return code of the + * helper to make sure it is happy with what it sent. + * + * Since the import command does not require the helper to ever report + * success/failure of the import, we have no mechanism to check for + * problems except to check its exit status. + * + * Callers of the transport code are allowed to make more requests + * of our helper, so we may be disconnecting before they expect in that + * case. However: + * + * 1. Current callers don't do that; after fetching refs, there + * is nothing left for the helper to do. + * + * 2. We transparently start the helper as necessary, so if we were + * to get another request (e.g., to import more refs), we would + * simply start a new instantiation of the helper. + * + */ + if (disconnect_helper(transport) != 0) die("Error while running helper"); argv_array_free_detached(fastimport.argv); which passes both your original test and the more strict one above. Of course adding a done-import capability would be nice to fix the protocol deficiency, but it is more code. > @@ -818,6 +822,10 @@ static int push_refs_with_export(struct transport *transport, > > if (finish_command(&exporter)) > die("Error while running fast-export"); > + > + if (!check_command(data->helper)) > + die("Error while running helper"); > + > push_update_refs_status(data, remote_refs); > return 0; > } And this one is even more likely to race than the import case, I think, as the exporter may send all of its data, exit, and then the helper chugs on it for a bit (converting, sending across the network, etc). If it is still chugging when we run check_command, we will not notice anything here. E.g., if we instrument the failure like this: diff --git a/git-remote-testgit b/git-remote-testgit index ca0cf09..a912bc1 100755 --- a/git-remote-testgit +++ b/git-remote-testgit @@ -75,6 +75,12 @@ do export) if test -n "$GIT_REMOTE_TESTGIT_FAILURE" then + # imagine we read all of fast-export's data + # first, and then die while trying to convert + # it + while read line; do + test "$line" = "done" && break + done exit -1 fi we do not get the sigpipe from fast-export, and depending on the timing, your check_command may or may not do anything (in my tests, it did not). But unlike the import side, the export command _does_ give us a status report back from the helper: it prints an ok/error line for each ref that was exported, followed by a blank line. So we should be able to confirm that we get the blank line at all, and then that each ref was present, which would determine whether the export failed or not without being subject to race conditions. And we seem to do those checks already; the only problem I see is that recvline does not print a message before dying. Would the patch below be sufficient? diff --git a/transport-helper.c b/transport-helper.c index dfdfa7a..cce2062 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -54,7 +54,7 @@ static int recvline_fh(FILE *helper, struct strbuf *buffer) if (strbuf_getline(buffer, helper, '\n') == EOF) { if (debug) fprintf(stderr, "Debug: Remote helper quit.\n"); - exit(128); + die("remote helper died unexpectedly"); } if (debug) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html