Re: [PATCH 1/3] [RFC] tests: add test_todo() to mark known breakages

Victoria Dye <vdye@xxxxxxxxxx> · Tue, 6 Dec 2022 14:37:07 -0800

Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@xxxxxxxxxxxxx>
> 
> test_todo() is intended as a fine grained replacement for
> test_expect_failure(). Rather than marking the whole test as failing
> test_todo() is used to mark individual failing commands within a
> test. This approach to writing failing tests allows us to detect
> unexpected failures that are hidden by test_expect_failure().

I love this idea! I've nearly been burned a couple of times by the wrong
line in a 'test_expect_failure' triggering the error (e.g., due to bad
syntax earlier in the test). The added specificity of 'test_todo' will help
both reviewers and people fixing the underlying issues demonstrated by
expected-failing tests.

> 
> Failing commands are reported by the test harness in the same way as
> test_expect_failure() so there is no change in output when migrating
> from test_expect_failure() to test_todo(). If a command marked with
> test_todo() succeeds then the test will fail. This is designed to make
> it easier to see when a command starts succeeding in our CI compared
> to using test_expect_failure() where it is easy to fix a failing test
> case and not realize it.
> 
> test_todo() is built upon test_expect_failure() but accepts commands
> starting with test_* in addition to git. As our test_* assertions use
> BUG() to signal usage errors any such error will not be hidden by
> test_todo().

Should this be so restrictive? I think 'test_todo' would need to handle any
arbitrary command (mostly because of custom functions like
'ensure_not_expanded' in 't1092') to be an easy-to-use drop-in replacement
for 'test_expect_failure'. 

I see there's some related discussion in another subthread [1], but I don't
necessarily think removing restrictions (i.e. that the tested command must
be 'git', 'test_*', etc.) on 'test_todo' requires doing the same for
'test_must_fail' et al. to be internally consistent. On one hand,
'test_todo' could be interpreted as an assertion (like 'test_must_fail'),
where we only want to assert on our code - hence the restrictions. From that
perspective, it would make sense to ease restrictions uniformly on all of
our assertion helpers. 

On the other hand, I'm interpreting 'test_todo' as
'test_expect_failure_on_line_N' - more of a "post-test result interpreter"
than an assertion helper. So because 'test_expect_failure' doesn't require
the failing line to come from a particular command, I don't think
'test_todo' needs to either. That leaves assertion helpers like
'test_must_fail' out of the scope of this change, avoiding any hairiness of
allowing them to assert on arbitrary code.

What do you think?

[1] https://lore.kernel.org/git/221006.86mta8r860.gmgdl@xxxxxxxxxxxxxxxxxxx/

> 
> This commit coverts a few tests to show the intended use of
> test_todo().  A limitation of test_todo() as it is currently
> implemented is that it cannot be used in a subshell.
> 
> Signed-off-by: Phillip Wood <phillip.wood@xxxxxxxxxxxxx>