Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages

Phillip Wood <phillip.wood123@xxxxxxxxx> · Thu, 8 Dec 2022 15:06:36 +0000

Hi Victoria

On 06/12/2022 22:37, Victoria Dye wrote:
> Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@xxxxxxxxxxxxx>
>>
>> Failing commands are reported by the test harness in the same way as
>> test_expect_failure() so there is no change in output when migrating
>> from test_expect_failure() to test_todo(). If a command marked with
>> test_todo() succeeds then the test will fail. This is designed to make
>> it easier to see when a command starts succeeding in our CI compared
>> to using test_expect_failure() where it is easy to fix a failing test
>> case and not realize it.
>>
>> test_todo() is built upon test_expect_failure() but accepts commands
>> starting with test_* in addition to git. As our test_* assertions use
>> BUG() to signal usage errors any such error will not be hidden by
>> test_todo().
>
> Should this be so restrictive? I think 'test_todo' would need to handle any
> arbitrary command (mostly because of custom functions like
> 'ensure_not_expanded' in 't1092') to be an easy-to-use drop-in replacement
> for 'test_expect_failure'.
>
> I see there's some related discussion in another subthread [1], but I don't
> necessarily think removing restrictions (i.e. that the tested command must
> be 'git', 'test_*', etc.) on 'test_todo' requires doing the same for
> 'test_must_fail' et al. to be internally consistent. On one hand,
> 'test_todo' could be interpreted as an assertion (like 'test_must_fail'),
> where we only want to assert on our code - hence the restrictions. From that
> perspective, it would make sense to ease restrictions uniformly on all of
> our assertion helpers.
>
> On the other hand, I'm interpreting 'test_todo' as
> 'test_expect_failure_on_line_N' - more of a "post-test result interpreter"
> than an assertion helper. So because 'test_expect_failure' doesn't require
> the failing line to come from a particular command, I don't think
> 'test_todo' needs to either. That leaves assertion helpers like
> 'test_must_fail' out of the scope of this change, avoiding any hairiness of
> allowing them to assert on arbitrary code.
>
> What do you think?

I don't think we need to remove the restrictions on 'test_must_fail';
they seem to be there for a good reason and I'm not aware of anyone
complaining about being inconvenienced by them. I think of 'test_todo'
and 'test_must_fail' as being distinct: 'test_todo' only reuses the
implementation of 'test_must_fail' for convenience rather than for any
deeper reason.

I added the restrictions to 'test_todo' to try and stop it being misused 
but I'm happy to relax them if needed. I'm keen that test_todo is able
to distinguish between an expected failure and a failure due to the
wrapped command being misused, e.g. 'test_todo grep --invalid-option'
should report an error. Restricting the commands makes it easier to
guarantee that but we can always just add checks for other commands as 
we use them. In a way the existing restrictions are kind of pointless 
because test authors can always name their helper functions test_... to 
get round them.
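
To make the "add checks for other commands as we use them" idea
concrete, here is a rough sketch of a per-command check for grep. It is
not the code from this series: 'todo_sketch' is a made-up name, and the
only thing it relies on is POSIX grep's exit status convention (0 for a
match, 1 for no match, greater than 1 for an error).

```shell
# Hypothetical helper for illustration only; this is not the
# test_todo() implementation from the patch.
todo_sketch () {
	"$@"
	ret=$?
	case "$1,$ret" in
	*,0)
		# The known breakage has been fixed: make that visible.
		echo >&2 "todo_sketch: '$*' unexpectedly succeeded"
		return 1
		;;
	grep,1)
		# POSIX grep: 1 means "no lines selected", the expected failure.
		return 0
		;;
	grep,*)
		# POSIX grep: >1 means an error occurred, e.g. a bad option.
		echo >&2 "todo_sketch: grep was misused (exit $ret)"
		return 1
		;;
	*)
		# No command-specific check yet: accept any failure.
		return 0
		;;
	esac
}
```

With this, 'todo_sketch grep pattern file' passes while the pattern is
missing, fails once the pattern shows up, and also fails for
'todo_sketch grep --invalid-option pattern file'. Commands without a
dedicated case fall through to the permissive default, which is the
trade-off of dropping the restrictions.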

I think you've convinced me to remove the restrictions on what can be
wrapped by 'test_todo' when I re-roll.

Thanks for your thoughtful comments

Phillip

> [1] https://lore.kernel.org/git/221006.86mta8r860.gmgdl@xxxxxxxxxxxxxxxxxxx/

>> This commit converts a few tests to show the intended use of
>> test_todo().  A limitation of test_todo() as it is currently
>> implemented is that it cannot be used in a subshell.
>>
>> Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx>
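
On the subshell limitation mentioned in the commit message: the message
does not spell out the cause, so the following is only a guess at the
kind of problem involved. If a helper records what happened in a shell
variable (the names 'todo_seen' and 'record_todo' below are invented for
this example), anything it records inside ( ... ) is lost when the
subshell exits.

```shell
# Hypothetical state variable and helper; neither is taken from the patch.
todo_seen=

record_todo () {
	todo_seen=1
}

# Called in the current shell: the assignment is still visible afterwards.
record_todo
direct_result=$todo_seen

# Called inside ( ... ): the assignment is made in a child process
# and is discarded when the subshell exits.
todo_seen=
(record_todo)
subshell_result=$todo_seen

echo "direct call:   todo_seen='$direct_result'"
echo "subshell call: todo_seen='$subshell_result'"
```

The first line reports a value of 1 and the second an empty value, which
is why state that the harness needs to see after the test body has run
cannot simply be set from inside a subshell.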