Add an alternative to the "test_expect_failure" function. Like "test_expect_failure" it will marks a test as "not ok ... TODO" in the TAP output, and thus document that it's a "TODO" test that fails currently. Unlike "test_expect_failure" it asserts that the tested code in exactly one manner, and not any other. We'll thus stop conflating e.g. segfaults with other sorts of errors, and generally be able to tell if our "expected to fail" tests start failing in new and unexpected ways. In more detail, the these problems of "test_expect_failure" are solved by this new function. * When "test_expect_failure" is used in combination with "test_{must,might}_fail" it will hide segfaults or abort() failures, such as failures due to LSAN. This is because we do do the right thing in those helpers, but they themselves are expected to be used within "test_expect_success" and return 1. The "test_expect_failure" can't differentiate this from a "real" failure. We could change "test_{must,might}_fail" to appropriately carry forward the status code to work around this specific issue, but doing so would be a large semantic change in the current test suite. * More generally, "test_expect_failure" by design doesn't test what does, but just asserts that the test fails in some way. For some tests that truly fail in unpredictable ways this behavior makes sense, but it's almost never what our tests want. We know we e.g. have a revision at HEAD~ instead of HEAD, and would like to know conflate that with a potential segfault or other behavior change. In summary, this new function behaves exactly like "test_expect_success", except that its "success" is then reported as a "not ok .. TODO". Let's convert a few "test_expect_failure" uses to the new "test_expect_todo". For previous discussion on this topic see [1] and [2], in particular the points by Junio that it's desired that the "test_expect_failure" output differentiate the TODO tests from "test_expect_success". 1. https://lore.kernel.org/git/87tuhmk19c.fsf@xxxxxxxxxxxxxxxxxxx/ 2. https://lore.kernel.org/git/xmqqo824e145.fsf@gitster.g/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> --- t/README | 20 +++++++++++++++++++- t/t0000-basic.sh | 11 ++++++++--- t/t1060-object-corruption.sh | 4 ++-- t/t7815-grep-binary.sh | 4 ++-- t/test-lib-functions.sh | 28 ++++++++++++++++++++++++---- t/test-lib.sh | 32 ++++++++++++++++++++++++++++---- 6 files changed, 83 insertions(+), 16 deletions(-) diff --git a/t/README b/t/README index f48e0542cdc..e0aa8ebb0eb 100644 --- a/t/README +++ b/t/README @@ -805,10 +805,28 @@ see test-lib-functions.sh for the full list and their options. test_expect_success PERL,PYTHON 'yo dawg' \ ' test $(perl -E 'print eval "1 +" . qx[python -c "print 2"]') == "4" ' + - test_expect_todo [<prereq>] <message> <script> + + A "test_expect_success" which on "success" will mark the test as a + TODO test, and on failure will emit a real failure. + + This should be used to mark a test whose exact bad behavior is + known, but whose outcome isn't preferred, to distinguish it from + those tests that use "test_expect_success" to indicate known and + preferred behavior. + + Like test_expect_success this function can optionally use a three + argument invocation with a prerequisite as the first argument. + - test_expect_failure [<prereq>] <message> <script> This is NOT the opposite of test_expect_success, but is used - to mark a test that demonstrates a known breakage. Unlike + to mark a test that demonstrates a known breakage whose exact failure + behavior isn't predictable. + + If the exact breakage is known the "test_expect_todo" function + should be used instead. Usethis function if it's hard to pin down + the exact nature of the failure. Unlike the usual test_expect_success tests, which say "ok" on success and "FAIL" on failure, this will say "FIXED" on success and "still broken" on failure. Failures from these diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh index b007f0efef2..e46638f1f7b 100755 --- a/t/t0000-basic.sh +++ b/t/t0000-basic.sh @@ -176,6 +176,8 @@ test_expect_success 'subtest: mixed results: a mixture of all possible results' test_expect_failure "pretend we have a known breakage" "false" test_expect_failure "pretend we have a known breakage" "false" test_expect_failure "pretend we have fixed a known breakage" "true" + test_expect_todo "pretend we have a known TODO" "true" + test_expect_todo "pretend we have a bad TODO" "false" test_done EOF check_sub_test_lib_test mixed-results2 <<-\EOF @@ -192,10 +194,13 @@ test_expect_success 'subtest: mixed results: a mixture of all possible results' > not ok 8 - pretend we have a known breakage # TODO known breakage > not ok 9 - pretend we have a known breakage # TODO known breakage > ok 10 - pretend we have fixed a known breakage # TODO known breakage vanished + > not ok 11 - pretend we have a known TODO # TODO known breakage + > not ok 12 - pretend we have a bad TODO (broken '\''test_expect_todo'\''!) + > # false > # 1 known breakage(s) vanished; please update test(s) - > # still have 2 known breakage(s) - > # failed 3 among remaining 7 test(s) - > 1..10 + > # still have 3 known breakage(s) + > # failed 4 among remaining 8 test(s) + > 1..12 EOF ' diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index bc89371f534..b7023f5644c 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -112,8 +112,8 @@ test_expect_success 'clone --local detects missing objects' ' test_must_fail git clone --local missing missing-checkout ' -test_expect_failure 'clone --local detects misnamed objects' ' - test_must_fail git clone --local misnamed misnamed-checkout +test_expect_todo 'clone --local detects misnamed objects' ' + git clone --local misnamed misnamed-checkout ' test_expect_success 'fetch into corrupted repo with index-pack' ' diff --git a/t/t7815-grep-binary.sh b/t/t7815-grep-binary.sh index ac871287c03..2d17e05036e 100755 --- a/t/t7815-grep-binary.sh +++ b/t/t7815-grep-binary.sh @@ -64,8 +64,8 @@ test_expect_success 'git grep ile a' ' git grep ile a ' -test_expect_failure 'git grep .fi a' ' - git grep .fi a +test_expect_todo 'git grep .fi a' ' + test_must_fail git grep .fi a ' test_expect_success 'grep respects binary diff attribute' ' diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 0f439c99d61..a219b126d93 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -718,11 +718,13 @@ test_verify_prereq () { BUG "'$test_prereq' does not look like a prereq" } -test_expect_failure () { +_test_expect_todo () { + local func=$1 + shift test_start_ test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq= test "$#" = 2 || - BUG "not 2 or 3 parameters to test-expect-failure" + BUG "not 2 or 3 parameters to $func" test_verify_prereq export test_prereq if ! test_skip "$@" @@ -730,14 +732,32 @@ test_expect_failure () { say >&3 "checking known breakage of $TEST_NUMBER.$test_count '$1': $2" if test_run_ "$2" expecting_failure then - test_known_broken_ok_ "$1" + case "$func" in + test_expect_todo) test_known_broken_ok_ "$func" "$1" ;; + test_expect_failure) test_known_broken_ok_ "$func" "$1" ;; + esac else - test_known_broken_failure_ "$1" + case "$func" in + test_expect_todo) + test_title_=$1 + shift + test_failure_ "$test_title_ (broken 'test_expect_todo'!)" "$@" + ;; + test_expect_failure) test_known_broken_failure_ "$func" "$1" ;; + esac fi fi test_finish_ } +test_expect_failure () { + _test_expect_todo test_expect_failure "$@" +} + +test_expect_todo () { + _test_expect_todo test_expect_todo "$@" +} + test_expect_success () { test_start_ test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq= diff --git a/t/test-lib.sh b/t/test-lib.sh index 9af5fb7674d..fb23a6f6682 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -792,21 +792,45 @@ test_failure_ () { } test_known_broken_ok_ () { + local func=$1 + shift + if test -n "$write_junit_xml" then write_junit_xml_testcase "$* (breakage fixed)" fi - test_fixed=$(($test_fixed+1)) - say_color error "ok $test_count - $@ # TODO known breakage vanished" + + if test "$func" = "test_expect_todo" + then + # test_expect_todo + test_broken=$(($test_broken+1)) + say_color warn "not ok $test_count - $@ # TODO known breakage" + else + # test_expect_failure + test_fixed=$(($test_fixed+1)) + say_color error "ok $test_count - $@ # TODO known breakage vanished" + fi } test_known_broken_failure_ () { + local func=$1 + shift + if test -n "$write_junit_xml" then write_junit_xml_testcase "$* (known breakage)" fi - test_broken=$(($test_broken+1)) - say_color warn "not ok $test_count - $@ # TODO known breakage" + + if test "$func" = "test_expect_todo" + then + # test_expect_todo + test_fixed=$(($test_fixed+1)) + say_color error "not ok $test_count - $@ # TODO a 'known breakage' changed behavior!" + else + # test_expect_failure + test_broken=$(($test_broken+1)) + say_color warn "not ok $test_count - $@ # TODO known breakage" + fi } test_debug () { -- 2.35.1.1436.g756b814e59f