On Tue, Dec 06 2022, Victoria Dye wrote: > Phillip Wood via GitGitGadget wrote: >> From: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> >> >> test_todo() is intended as a fine grained replacement for >> test_expect_failure(). Rather than marking the whole test as failing >> test_todo() is used to mark individual failing commands within a >> test. This approach to writing failing tests allows us to detect >> unexpected failures that are hidden by test_expect_failure(). > > I love this idea! I've nearly been burned a couple of times by the wrong > line in a 'test_expect_failure' triggering the error (e.g., due to bad > syntax earlier in the test). The added specificity of 'test_todo' will help > both reviewers and people fixing the underlying issues demonstrated by > expected-failing tests. > >> >> Failing commands are reported by the test harness in the same way as >> test_expect_failure() so there is no change in output when migrating >> from test_expect_failure() to test_todo(). If a command marked with >> test_todo() succeeds then the test will fail. This is designed to make >> it easier to see when a command starts succeeding in our CI compared >> to using test_expect_failure() where it is easy to fix a failing test >> case and not realize it. >> >> test_todo() is built upon test_expect_failure() but accepts commands >> starting with test_* in addition to git. As our test_* assertions use >> BUG() to signal usage errors any such error will not be hidden by >> test_todo(). > > Should this be so restrictive? I think 'test_todo' would need to handle any > arbitrary command (mostly because of custom functions like > 'ensure_not_expanded' in 't1092') to be an easy-to-use drop-in replacement > for 'test_expect_failure'. > > I see there's some related discussion in another subthread [1], but I don't > necessarily think removing restrictions (i.e. that the tested command must > be 'git', 'test_*', etc.) on 'test_todo' requires doing the same for > 'test_must_fail' et al. to be internally consistent. On one hand, > 'test_todo' could be interpreted as an assertion (like 'test_must_fail'), > where we only want to assert on our code - hence the restrictions. From that > perspective, it would make sense to ease restrictions uniformly on all of > our assertion helpers. > > On the other hand, I'm interpreting 'test_todo' as > 'test_expect_failure_on_line_N' - more of a "post-test result interpreter" > than an assertion helper. So because 'test_expect_failure' doesn't require > the failing line to come from a particular command, I don't think > 'test_todo' needs to either. That leaves assertion helpers like > 'test_must_fail' out of the scope of this change, avoiding any hairiness of > allowing them to assert on arbitrary code. > > What do you think? Are you saying that for the "test_todo" we shouldn't care whether it exits with a "normal" non-zero or a segfault, abort() (e.g. BUG()) etc? That's what the "test_must_fail" v.s. "!" is about. Even if we erased tat distinction I think such a thing would be a marginal improvement on "test_expect_failure", as we'd at least mark what line fails, but like "test_expect_failure" we'd accept segfaults as failures. but as noted in the upthread discussions I think we should do better and still check for segfaults etc. I think we have a couple of "test_expect_failure" now where we expect a segfault, but for the rest we'd like to know if they start segfaulting.