On Wed, Oct 13 2021, Junio C Hamano wrote:

> Derrick Stolee <stolee@xxxxxxxxx> writes:
>
>>> But even with the shortcomings of expect_failure, it still is much
>>> better than claiming that we expect a bogus outcome.
>>>
>>> Improving the shortcomings of expect_failure would be a much better
>>> use of our time than advocating an abuse of expect_sucess, I would
>>> think.
>>
>> I agree that test_expect_failure has these drawbacks. I've recently
>> been using _expect_success to document "bad" behavior so we can verify
>> that behavior changes when that behavior is fixed. But it does have
>> the drawback of looking like we claim the result is by design.
>
> Yeah, I think I saw (and I think I used the same technique myself)
> people expect a bad output with test_expect_success with an in-code
> (not in-log) comment that explicitly says "This documents the
> current behaviour, which is wrong", and that is a very acceptable
> solution, I would think.
>
>> One possible way to correct this is to create a "test_expected_failure"
>> helper that could be placed on the step(s) of the &&-chain that are
>> expected to fail. The helper could set some variable to true if the
>> failure is hit, and false otherwise. It can also convert a failure
>> into a positive result. Then, test_expect_failure could look for that
>> variable's value (after verifying that the &&-chain returns success)
>> to show that all expected failures completed correctly.
>
> Yup, I would very much like the direction, and further imagine that
> the above approach can be extended to ...
>
>> This could have the side-effect of having a "fixed" test_expect_failure
>> show as a failed test, not a "TODO" message.
>
> ... avoid such downside. Perhaps call that magic "we know this step
> fails currently" test_known_breakage and declare that we deprecate
> the use of test_expect_failure in new tests.
> Such a test might look like this:
>
> test_expect_success 'commit error message should not duplicate' '
>	test_when_finished "chmod -R u+rwx ." &&
>	chmod u-rwx .git/objects/ &&
>	orig_head=$(git rev-parse HEAD) &&
>	test_must_fail git commit --allow-empty -m "read-only" 2>rawerr &&
>	grep "insufficient permission" rawerr >err &&
>	test_known_breakage test_line_count = 1 err &&
>	new_head=$(git rev-parse HEAD) &&
>	test "$orig_head" = "$new_head"
> '
>
> which may use your trick to turn both failure and success to OK (to
> let the remainder of the test to continue) but signal the
> surrounding test_expect_success to say either "TODO know breakage"
> or "Fixed".

I don't see how it's a downside. Considering the behavior bad now
shouldn't entail that we should be fuzzy about testing what exactly
*is* happening right now. If one bad but expected state turns into
another unexpected bad state, the test should fail.

In this case this thread spawned off a fix where we print an error
twice, instead of once:

  https://lore.kernel.org/git/cover-v2-0.2-00000000000-20211012T142950Z-avarab@xxxxxxxxx/#t

That sucks a bit, but printing pages full of such errors in a loop
would be way worse, which we'll hide if we insist on not testing the
exact emitted output, or on such "test_known_breakage" helpers.

It's only a downside in that when we fix bugs we'll need to go
through the "expected failure" tests and adjust them, but that seems
like a feature to me. As it is, I can submit a patch that fixes a
known bug with no test suite changes, and I might not even notice
that I fixed one.

We may want to have tests for something that really is
nondeterministic, e.g. for the code I added in 2d3c02f5db6 (die():
stop hiding errors due to overzealous recursion guard, 2017-06-21).
But that'll only be the case for some tiny minority (or none) of the
existing callers of "test_expect_failure".

1. https://lore.kernel.org/git/87tuhmk19c.fsf@xxxxxxxxxxxxxxxxxxx/
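For concreteness, the helper Stolee describes upthread could be
sketched roughly as below. This is illustrative only, not test-lib.sh
code: the name test_known_breakage comes from the quoted proposal, but
the variable name known_breakage_hit and the wiring are assumptions,
and a real implementation would have to integrate with the harness's
TODO/"Fixed" reporting.

```shell
# Sketch of the proposed helper. "known_breakage_hit" is an
# illustrative name, not an existing test-lib.sh variable.
test_known_breakage () {
	if "$@"
	then
		# The wrapped step passed: the known breakage is fixed.
		known_breakage_hit=false
	else
		# The wrapped step failed, as currently expected (TODO).
		known_breakage_hit=true
	fi
	# Return success either way so the rest of the &&-chain keeps
	# running; the surrounding test would inspect
	# $known_breakage_hit to report "TODO known breakage" vs.
	# "Fixed".
	return 0
}

# Toy usage: "false" stands in for a step that is expected to fail.
test_known_breakage false &&
echo "known_breakage_hit=$known_breakage_hit"
# prints: known_breakage_hit=true
```

Note that this is exactly the fuzziness objected to above: the wrapped
step is converted to success whether it fails or not, so a shift from
one bad state to another bad state would go unnoticed by the &&-chain
itself.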