On Tue, Oct 12 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> On Mon, Oct 11 2021, Junio C Hamano wrote:
> [...]
>> Presumably with test_expect_failure.
>>
>> I'll change it, in this case we'd end up with a test_expect_success at
>> the end, so it doesn't matter much & I don't care.
>
> I do agree with you that compared to expect_success, which requires
> _all_ steps to succeed, so a failure in any of its steps is
> immediately noticeable, it is harder to write and keep
> expect_failure useful, because it is not like we are happy to see
> any failure in any step.  We do not expect a failure in many
> preparation and conclusion steps in the &&-chain in expect_failure
> block, and we consider it is an error if these steps fail.  We
> want to mark only a single step to exhibit an expected but undesirable
> behaviour.
>
> But even with the shortcomings of expect_failure, it still is much
> better than claiming that we expect a bogus outcome.
>
> Improving the shortcomings of expect_failure would be a much better
> use of our time than advocating an abuse of expect_success, I would
> think.

I'd like to improve it, but I'll have to get any patch in this area past
you :)

My reading of your opinion from past exchanges is that you find it
objectionable to say "this is a success" when it's not the /desired/
behavior, whereas I think it's valuable to just test for and document
the exact existing behavior, even if it's not desirable.

So you don't really need a function different from test_expect_success,
just a comment saying "this should change", or a "TODO" in the
description of the test (without a hash, so that it isn't TAP syntax).

But if you agree that we shouldn't conflate failures in the different
steps I think we're getting somewhere, so to begin with, what do you
think about the hack in the v2 of my series?
https://lore.kernel.org/git/cover-v2-0.2-00000000000-20211012T142950Z-avarab@xxxxxxxxx/

If we were to promote those semantics to something that
test_expect_failure itself would use, it would look like the below,
which I think is the only sensible way to use it. But that would mean
changing all existing test_expect_failure uses in the test suite, so it
would need either a pretty large patch, or some incremental steps to get
there.

It would also mean we can't use it for any test that's actually flaky,
so we'd need a test_expect_flaky, or some test-specific workarounds in
those areas:

diff --git a/t/t7815-grep-binary.sh b/t/t7815-grep-binary.sh
index 90ebb64f46e..9a95c9e7d69 100755
--- a/t/t7815-grep-binary.sh
+++ b/t/t7815-grep-binary.sh
@@ -64,7 +64,7 @@ test_expect_success 'git grep ile a' '
 '
 
 test_expect_failure 'git grep .fi a' '
-	git grep .fi a
+	test_must_fail git grep .fi a
 '
 
 test_expect_success 'grep respects binary diff attribute' '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 8361b5c1c57..6d9291b7ead 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -728,8 +728,8 @@ test_known_broken_ok_ () {
 	then
 		write_junit_xml_testcase "$* (breakage fixed)"
 	fi
-	test_fixed=$(($test_fixed+1))
-	say_color error "ok $test_count - $@ # TODO known breakage vanished"
+	test_broken=$(($test_broken+1))
+	say_color warn "not ok $test_count - $@ # TODO known breakage"
 }
 
 test_known_broken_failure_ () {
@@ -737,8 +737,8 @@ test_known_broken_failure_ () {
 	then
 		write_junit_xml_testcase "$* (known breakage)"
 	fi
-	test_broken=$(($test_broken+1))
-	say_color warn "not ok $test_count - $@ # TODO known breakage"
+	test_fixed=$(($test_fixed+1))
+	say_color error "not ok $test_count - $@ # TODO a 'known breakage' changed behavior!"
 }
 
 test_debug () {
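
To make the inverted semantics in the patch above easier to see in isolation, here is a rough standalone sketch. It is a toy model only and does not use test-lib.sh: toy_must_fail and toy_expect_failure are hypothetical names made up for this illustration, and the toy "must fail" helper treats any non-zero exit as the expected failure, which is looser than what the real test_must_fail does.

```shell
#!/bin/sh
# Toy model of the proposed semantics; NOT git's test-lib.sh.
# toy_must_fail and toy_expect_failure are hypothetical names.

# Succeed only if the wrapped command exits non-zero.
# (Assumption: any non-zero exit counts as the expected failure.)
toy_must_fail () {
	if "$@"
	then
		return 1
	fi
	return 0
}

# Run a body whose single known-broken step is wrapped in toy_must_fail.
# Body passes -> the breakage is still there -> report "known breakage".
# Body fails  -> a setup step broke, or the broken step changed
#                behavior -> report an error and return non-zero.
toy_expect_failure () {
	desc=$1
	body=$2
	if (eval "$body") >/dev/null 2>&1
	then
		echo "not ok - $desc # TODO known breakage"
	else
		echo "not ok - $desc # TODO a 'known breakage' changed behavior!"
		return 1
	fi
}
```

With these definitions, a body such as 'true && toy_must_fail false' reports a known breakage, while both 'toy_must_fail true' (the broken step started succeeding) and 'false && toy_must_fail false' (a preparation step failed) are reported as changed behavior. That is the point of not conflating failures in the different steps.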