Re: [PATCH v2 2/6] unit-tests: add for_test

Kyle Lippincott <spectral@xxxxxxxxxx> · Mon, 22 Jul 2024 12:13:00 -0700

On Sat, Jul 20, 2024 at 11:22 PM René Scharfe <l.s.r@xxxxxx> wrote:
>
> The macro TEST only allows defining a test that consists of a single
> expression.  Add a new macro, for_test, which provides a way to define
> unit tests that are made up of one or more statements.
>
> for_test allows defining self-contained tests en bloc, a bit like
> test_expect_success does for regular tests.  It acts like a for loop
> that runs at most once; the test body is executed if test_skip_all()
> had not been called before.

I can see based on this description where the name came from, but
without this context, it's not clear when reading a test what it
actually does. The name comes from an implementation detail, and is
not describing what it _does_, just _how_ it does it.

Maybe a name like `small_test` or `quick_test` or something like that
would better express the intended usage?

>
> Helped-by: Phillip Wood <phillip.wood123@xxxxxxxxx>
> Signed-off-by: René Scharfe <l.s.r@xxxxxx>
> ---
>  .clang-format               |  2 ++
>  t/helper/test-example-tap.c | 33 +++++++++++++++++++++++++++++++++
>  t/t0080/expect              | 35 ++++++++++++++++++++++++++++++++++-
>  t/unit-tests/test-lib.h     | 19 +++++++++++++++++++
>  4 files changed, 88 insertions(+), 1 deletion(-)
>
> diff --git a/.clang-format b/.clang-format
> index 6408251577..863dc87dfc 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -151,6 +151,7 @@ Cpp11BracedListStyle: false
>  # function calls. Taken from:
>  #   git grep -h '^#define [^[:space:]]*for_\?each[^[:space:]]*(' |
>  #   sed "s/^#define /  - '/; s/(.*$/'/" | sort | uniq
> +# Added for_test from t/unit-tests/test-lib.h manually as a special case.
>  ForEachMacros:
>    - 'for_each_builtin'
>    - 'for_each_string_list_item'
> @@ -168,6 +169,7 @@ ForEachMacros:
>    - 'strintmap_for_each_entry'
>    - 'strmap_for_each_entry'
>    - 'strset_for_each_entry'
> +  - 'for_test'
>
>  # The maximum number of consecutive empty lines to keep.
>  MaxEmptyLinesToKeep: 1
> diff --git a/t/helper/test-example-tap.c b/t/helper/test-example-tap.c
> index d072ad559f..51d5e6e75b 100644
> --- a/t/helper/test-example-tap.c
> +++ b/t/helper/test-example-tap.c
> @@ -92,5 +92,38 @@ int cmd__example_tap(int argc, const char **argv)
>         test_res = TEST(t_empty(), "test with no checks");
>         TEST(check_int(test_res, ==, 0), "test with no checks returns 0");
>
> +       for_test ("for_test passing test")
> +               check_int(1, ==, 1);

I'm concerned that users will write this like:
+       for_test ("for_test passing test");
+               check_int(1, ==, 1);

And the issue won't be caught. Maybe that's fine; the only issue here
is that it's not going to be as descriptive if it fails, right?

> +       for_test ("for_test failing test")
> +               check_int(1, ==, 2);
> +       for_test ("for_test passing TEST_TODO()")
> +               TEST_TODO(check(0));
> +       for_test ("for_test failing TEST_TODO()")
> +               TEST_TODO(check(1));
> +       for_test ("for_test test_skip()") {
> +               check(0);
> +               test_skip("missing prerequisite");
> +               check(1);
> +       }
> +       for_test ("for_test test_skip() inside TEST_TODO()")
> +               TEST_TODO((test_skip("missing prerequisite"), 1));
> +       for_test ("for_test TEST_TODO() after failing check") {
> +               check(0);
> +               TEST_TODO(check(0));
> +       }
> +       for_test ("for_test failing check after TEST_TODO()") {
> +               check(1);
> +               TEST_TODO(check(0));
> +               check(0);
> +       }
> +       for_test ("for_test messages from failing string and char comparison") {
> +               check_str("\thello\\", "there\"\n");
> +               check_str("NULL", NULL);
> +               check_char('a', ==, '\n');
> +               check_char('\\', ==, '\'');
> +       }
> +       for_test ("for_test test with no checks")
> +               ; /* nothing */
> +
>         return test_done();
>  }
> diff --git a/t/t0080/expect b/t/t0080/expect
> index 0cfa0dc6d8..583f41b8c9 100644
> --- a/t/t0080/expect
> +++ b/t/t0080/expect
> @@ -40,4 +40,37 @@ not ok 17 - messages from failing string and char comparison
>  # BUG: test has no checks at t/helper/test-example-tap.c:92
>  not ok 18 - test with no checks
>  ok 19 - test with no checks returns 0
> -1..19
> +ok 20 - for_test passing test
> +# check "1 == 2" failed at t/helper/test-example-tap.c:98
> +#    left: 1
> +#   right: 2
> +not ok 21 - for_test failing test
> +not ok 22 - for_test passing TEST_TODO() # TODO
> +# todo check 'check(1)' succeeded at t/helper/test-example-tap.c:102
> +not ok 23 - for_test failing TEST_TODO()
> +# check "0" failed at t/helper/test-example-tap.c:104
> +# skipping test - missing prerequisite
> +# skipping check '1' at t/helper/test-example-tap.c:106
> +ok 24 - for_test test_skip() # SKIP
> +# skipping test - missing prerequisite
> +ok 25 - for_test test_skip() inside TEST_TODO() # SKIP
> +# check "0" failed at t/helper/test-example-tap.c:111
> +not ok 26 - for_test TEST_TODO() after failing check
> +# check "0" failed at t/helper/test-example-tap.c:117
> +not ok 27 - for_test failing check after TEST_TODO()
> +# check "!strcmp("\thello\\", "there\"\n")" failed at t/helper/test-example-tap.c:120
> +#    left: "\011hello\\"
> +#   right: "there\"\012"
> +# check "!strcmp("NULL", NULL)" failed at t/helper/test-example-tap.c:121
> +#    left: "NULL"
> +#   right: NULL
> +# check "'a' == '\n'" failed at t/helper/test-example-tap.c:122
> +#    left: 'a'
> +#   right: '\012'
> +# check "'\\' == '\''" failed at t/helper/test-example-tap.c:123
> +#    left: '\\'
> +#   right: '\''
> +not ok 28 - for_test messages from failing string and char comparison
> +# BUG: test has no checks at t/helper/test-example-tap.c:125
> +not ok 29 - for_test test with no checks
> +1..29
> diff --git a/t/unit-tests/test-lib.h b/t/unit-tests/test-lib.h
> index 2de6d715d5..12afd47ac9 100644
> --- a/t/unit-tests/test-lib.h
> +++ b/t/unit-tests/test-lib.h
> @@ -14,6 +14,25 @@
>         test__run_end(test__run_begin() ? 0 : (t, 1),   \
>                       TEST_LOCATION(),  __VA_ARGS__)
>
> +/*
> + * Run a test unless test_skip_all() has been called.  Acts like a for
> + * loop that runs at most once, with the test description between the
> + * parentheses and the test body as a statement or block after them.
> + * The description for each test should be unique.  E.g.:
> + *
> + *  for_test ("something else %d %d", arg1, arg2) {
> + *          prepare();
> + *          test_something_else(arg1, arg2);
> + *          cleanup();
> + *  }
> + */
> +#define for_test(...)                                                  \
> +       for (int for_test_running_ = test__run_begin() ?                \
> +               (test__run_end(0, TEST_LOCATION(), __VA_ARGS__), 0) : 1;\
> +            for_test_running_;                                         \
> +            test__run_end(1, TEST_LOCATION(), __VA_ARGS__),            \
> +               for_test_running_ = 0)

IMHO: this is borderline "too much magic" for my tastes. I think
having multiple test functions is generally easier to understand, and
the overhead isn't really relevant. It's not _as_ compact in the
source file, and requires that we have both the TEST statement and the
function (and forgetting the TEST statement means that we won't invoke
the function). If that is the main issue we're facing here, I wonder
if there's some other way of resolving that (such as unused function
detection via some compiler flags; even if it's not detected on all
platforms, getting at least one of the CI platforms should be
sufficient to prevent the issue [but we should target as many as
possible, so it's caught earlier than "I tried to send it to the
list"])

If others agree that this is a good simplification for the people
reading the test code (and hopefully for the test author), I'm fine
with this going in (with a different name). I'm not trying to veto the
concept.

> +
>  /*
>   * Print a test plan, should be called before any tests. If the number
>   * of tests is not known in advance test_done() will automatically
> --
> 2.45.2
>