On Fri, Feb 9, 2018 at 3:21 PM, Jeff King <peff@xxxxxxxx> wrote: > On Fri, Feb 09, 2018 at 03:42:34AM +0100, SZEDER Gábor wrote: > >> To check that a git command fails and to inspect its error message we >> usually execute a command like this throughout our test suite: >> >> test_must_fail git command --option 2>output.err >> >> Note that this command doesn't limit the redirection to the git >> command, but it redirects the standard error of the 'test_must_fail' >> helper function as well. This is wrong for several reasons: >> >> - If that git command were to succeed or die for an unexpected >> reason e.g. signal, then 'test_must_fail's helpful error message >> would end up in the given file instead of on the terminal or in >> 't/test-results/$T.out', when the test is run with '-v' or >> '--verbose-log', respectively. >> >> - If the test is run with '-x', the trace of executed commands in >> 'test_must_fail' will go to stderr as well (except for more recent >> Bash versions supporting $BASH_XTRACEFD), and then in turn will >> end up in the given file. > > I have to admit that I'm slightly negative on this approach Well, now I'm not just slightly negative, see the patch series I will send out in a minute :) > just > because: > > 1. It requires every caller to do something special, rather than just > using normal redirection. And the feature isn't as powerful as > redirection. E.g., some callers do: > > test_must_fail foo >output 2>&1 Yeah, they do, but most of the time they shouldn't. I've looked at uses of 'test_must_fail cmd... 2>&1', and also uses of other test helper functions with stderr duplicated from stdout (and a couple of cases of "plain" 'git cmd ... 2>&1' as well; that's where the t6300-for-each-ref fix came from some days ago). I think all but one of those cases would benefit from handling stdout and stderr separately, and that one exception shouldn't use 'test_must_fail' for a platform command in the first place[1]. - Many of these tests check error messages after they mingled stderr and stdout like this: test_must_fail git cmd >out 2>&1 && (test_i18n)grep "relevant part of the error msg" out In these cases there is no need for 2>&1 as they don't care about stdout at all. They could just grep the failed command's stderr, with the added benefit that they would ensure that the error message indeed goes to stderr. - A couple of these tests check the output more strictly: test_must_fail git cmd >actual 2>&1 && echo "error message" >expect && test_(i18n)cmp expect actual which also checks that nothing was printed on stdout, but does so implicitly. Considering all the sloppy uses of 2>&1, it's hard to tell whether it was deliberate to check the empty stdout, or accidental, because the writer of the test used 'test_cmp' instead of 'grep' for its verbosity on failure, or simply mixed up the two. Therefore, I think it's better when it's written more explicitly: test_must_fail git cmd >out 2>err && echo "exact error message" >expect && test_cmp expect err && test_must_be_empty out ... both when reading such a test and when looking at it's output on failure. - Then there are some cases that, for some reason, completely silence a git command with: test_must_fail git cmd >/dev/null 2>&1 The output of a command could turn out to be useful when debugging a failing tests, so tests shouldn't do this, unless there is a very good reason (e.g. the command produces a lot of output). - Finally, a few tests check that something is surely missing from a git command's output, both from stdout and stderr: test_must_fail git cmd >out 2>&1 && ! grep "should not be there" out Or that a command is completely silent: test_must_fail git cmd --quiet >out 2>&1 && test_must_be_empty out Checking the emptiness of stdout and stderr separately would give more info on failure, as it would tell us right away where the unexpected output is coming from (though arguably the presence of words like "error" or "fatal" in the output of 'test_must_be_empty' would be a dead giveaway). BTW, we already have a couple of careful tests that do: git cmd >out 2>err && test_must_be_empty out && test_must_be_empty err (Which makes me think that extending 'test_must_be_empty' to check the emptiness of multiple files at once would be worth it: fewer lines, fewer characters to type, and it could report all non-empty files, not just the first.) > But: > > test_must_fail stderr=output foo >output > > is not quite right (stdout and stderr outputs may clobber each > other, because they have independent position pointers). Relying on the combination of the unbuffered stderr and the block-buffered stdout could be fragile to begin with, though in general it doesn't really cause troubles if the data on stdout is less than the buffer size or if there's no output on stderr after the stdout buffer is flushed. The stdout buffer size is pagesize on Linux (I don't know offhand about others), which is large enough compared to the typical amount of output git commands produce in our test scripts, so we could get away with it. > 2. The "-x" problems aren't specific to test_must_fail at all. They're > a general issue with shell functions. Sure. I tried to address 'test_must_fail' first, because, with about 90% of the cases, it is by far the worst offender among all test helper functions with redirected stderr. 'test_expect_code' comes in second far behind, and there are a mere handful of the combination of 'test_terminal' and 'perl'. The 'test_expect_code 129 ... 2>&1' in t0012-help.sh's row of '$builtin can handle -h' tests is particularly interesting, because it hides the inconsistency that 76 commands send their help/usage text to stdout, while 45 to stderr. > I'm not entirely happy with saying "if you want to use -x, please use > bash". But given that it actually solves the problems everywhere with no > further effort, is it really that bad a solution? Well, not that bad a workaround, but in my opinion the real solution is to make '-x' tracing work reliably with /bin/sh. Since we strive for POSIX shell compatibility in our test scripts, I dutifully run the test suite with /bin/sh. Switching shells for '-x' to get more info about what went wrong (and then switching back) is a hassle even when it happens to work, and rather annoying when it doesn't. I've lost count of the various ways of running tests with '-x' and Bash that I naively thought would Just Work, but didn't because of the subtleties of GIT-BUILD-OPTIONS and/or re-exec in case of '--verbose-log'. I find I still reached for various ad-hoc debugging/tracing or 'test_pause' first, and kept '-x' only as last resort. If we start running tests with '-x' and Bash on Travis CI, then it won't be able to (semi-)automatically catch non-POSIX constructs entering the test scripts. And I do want to run all tests with '-x', to get the most information about rarely occurring transient errors. [1] - 't/lib-git-p4.sh' contains the following command: test_must_fail kill $pid >/dev/null 2>&1