On Fri, Mar 19 2021, Johannes Schindelin wrote: > Hi Ævar, > > On Tue, 9 Mar 2021, Ævar Arnfjörð Bjarmason wrote: > >> Rewrite the home-brew JUnit output generation added in >> 22231908151 (tests: optionally write results as JUnit-style .xml, >> 2019-01-29) and other subsequent commits to be generated via the TAP >> output. >> >> This is now possible that the TAP output emitted with --tee is >> guaranteed to be valid TAP, see the preceding commit. >> >> The JUnit output is not bug-for-bug compatible with the previous >> output, but it looks better to me. >> >> This also requires installing TAP::Formatter::JUnit[1], perhaps that's >> not palatable to the users of the --write-junit-xml option. > > Indeed. I am trying to keep the dependencies required for the Windows jobs > of our CI/PR builds to a minimum. I'm taking this to mean the dependency of the TAP::Formatter::JUnit CPAN module. Since the tests already depend on Perl, and presumably that Perl isn't so broken as to not have the built in TAP modules (e.g. TAP::Parser). > Note, also, that the JUnit output was mostly relevant for when we used > Azure Pipelines: it has a specific UI to study test results, figure out > flaky tests, performance, etc. > > Now that we use GitHub Actions, we do not have such a nice test aggregator > anymore, but we might get one again in the future, who knows. So there's no known current active user of this JUnit output target, or at least if such a user exists it's not you anymore? I'm guessing we'd be unlikely to miss someone targeting JUnit from git.git's tests who's not on this list to chime in. So is it not used currently & could be removed? >> In any case, it'll be easy to whip up our own TAP emitter with a >> TAP::Parser and TAP::Formatter, both of whom come with perl itself, >> which we already rely on for tests. >> >> It should also be significantly faster on Windows, > > I really hate to have to harp on this when talking to you, but... well, > how can I say it? Perl is _slooooooooooooow_ on Windows. > > Like, _really_ slow. Ridiculously slow. > > I know, you recently got riled up when Jeff suggested that the Perl hook > calling FSMonior might be slow, but the truth is: it is super slow. It's > not even funny how slow it is. And it fills me with no joy having to > repeat it every time the question comes up whether running any part of Git > that is written in Perl might affect performance on Windows. I really > dislike having to be that messenger. Riled up? No, perplexed: yes. You're referring to https://lore.kernel.org/git/87h7lgfchm.fsf@xxxxxxxxxxxxxxxxxxx/ So as an aside about that particular issue / slowness I haven't been able to reproduce: If it's really a reference to some Windows-specific issue with Perl on Windows in particular that would certainly be helpful for the rest of us trying to follow along wondering where these wildly different performance numbers come from. So not that we don't want to more to some other fsmonitor implementation for other reasons, but the Perl part of that hook would be rather easy to replace with some C JSON encoder or whatever. > I doubt that you will ever be able to replace my (admittedly slightly > hacky) C helper with anything written in Perl that does even come close to > being faster. > > In other words, I fear that your work here might not have the outcome you > hoped for. We shell out to your C helper at least once for every test we run. By converting the TAP after the full run we're only invoking Perl once per-test. If that one Perl invocation is still much more expensive than doing that ~20-50 times per-test we can easily move the loop to the Perl script and invoke Perl once per test suite run. At that point you've got 1 run v.s. >20k runs for the current helper being called in a loop by test-lib.sh But I don't have access to a Windows test box. So maybe it's slower than I'm imagining. How long do the equivalent of these take on Windows (not sure if the time built-in is there)?: $ time perl -MTAP::Harness -wE 'say "hello world"' hello world real 0m0.018s user 0m0.013s sys 0m0.005s $ time (echo hi | helper/test-tool xml-encode) hi
 real 0m0.002s user 0m0.001s sys 0m0.002s In any case, I regret focusing on the TAP::Formatter::JUnit part in this RFC. The real meaty part is being able to have stable TAP output, and thus not having to add N number of output formats into test-lib.sh itself. A tool that consumes that can then be e.g. Perl's TAP tooling, or just some C program or shellscript that calls the code that the below quoted diff is removing (aside from the test-lib.sh part, which would go away). But it would be easier to write such tools using Perl's tooling, as it saves one from writing a parser for TAP. I'd be surprised if Perl was adding much overhead in that context, and I see that the existing Windows CI jobs are running through "prove", so presumably even on Windows e.g. this: time make T=t000*.sh DEFAULT_TEST_TARGET=prove Is not much slower than: time make T=t000*.sh DEFAULT_TEST_TARGET=test I think this is the Nth time we've had some variant of this "Perl in git's tests" discussion. I'm all for finding solutions to whatever issues you have, but I find it confusing that in these discussions you seem to conflate and jump between some or all of (just the ones I'm remembering): 1. Some imagined future state where there's no Perl whatsoever in the test suite or git.git's build toolchain, which isn't the case now, or anywhere in the forseeable future. 2. A future state where GFW etc. don't need to ship a Perl to users (I believe this is either close or already there, I haven't kept up..) 3. The slowness of Perl for: tight loops, a few per test, once per test suite run. 4. How #3 e.g. in a tight loop compares to calling say sed or awk in a tight shell loop. Is perl uniquely bad on Windows, or is it really a gripe about our use of shellscripting in general? 6. Not wanting to (understandably) package any more non-core Perl stuff, understandable, but we always have the option of extending perl/FromCPAN/ if we find something truly useful (in this case just having a custom JUnit XML emitter, if it's actually needed by anyone anymore, should be rather easy). So e.g. in this case I can guess at some of the gaps and *think* that some code in t/ that uses TAP::Parser (a perl core module) and was invoked either once per test-lib.sh invocation, or better (but perhaps not needed, as it would make it a bit more complex) once per test-suite run should be fine. That code could even be implemented as a prove(1) plugin, at which point we'd be invoking Perl exactly as many times as we are now, but I'd rather not paint myself into that particular corner without good reason. >> as we can e.g. write all the *.out files, and only do that conversion at >> the end for all files in a batch, as opposed to the current >> implementation of shelling out to test-tool in a loop for each test. >> >> 1. https://metacpan.org/pod/distribution/TAP-Formatter-JUnit/script/tap2junit >> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> >> --- >> Makefile | 1 - >> t/helper/test-date.c | 12 ---- >> t/helper/test-path-utils.c | 21 ------- >> t/helper/test-tool.c | 1 - >> t/helper/test-tool.h | 1 - >> t/helper/test-xml-encode.c | 80 ------------------------ >> t/test-lib.sh | 123 ++++--------------------------------- >> 7 files changed, 12 insertions(+), 227 deletions(-) >> delete mode 100644 t/helper/test-xml-encode.c >> >> diff --git a/Makefile b/Makefile >> index d26b9d62ee9..fa7c52f7a42 100644 >> --- a/Makefile >> +++ b/Makefile >> @@ -748,7 +748,6 @@ TEST_BUILTINS_OBJS += test-urlmatch-normalization.o >> TEST_BUILTINS_OBJS += test-wildmatch.o >> TEST_BUILTINS_OBJS += test-windows-named-pipe.o >> TEST_BUILTINS_OBJS += test-write-cache.o >> -TEST_BUILTINS_OBJS += test-xml-encode.o >> >> # Do not add more tests here unless they have extra dependencies. Add >> # them in TEST_BUILTINS_OBJS above. >> diff --git a/t/helper/test-date.c b/t/helper/test-date.c >> index 099eff4f0fc..ebf133943b3 100644 >> --- a/t/helper/test-date.c >> +++ b/t/helper/test-date.c >> @@ -8,7 +8,6 @@ static const char *usage_msg = "\n" >> " test-tool date parse [date]...\n" >> " test-tool date approxidate [date]...\n" >> " test-tool date timestamp [date]...\n" >> -" test-tool date getnanos [start-nanos]\n" >> " test-tool date is64bit\n" >> " test-tool date time_t-is64bit\n"; >> >> @@ -92,15 +91,6 @@ static void parse_approx_timestamp(const char **argv) >> } >> } >> >> -static void getnanos(const char **argv) >> -{ >> - double seconds = getnanotime() / 1.0e9; >> - >> - if (*argv) >> - seconds -= strtod(*argv, NULL); >> - printf("%lf\n", seconds); >> -} >> - >> int cmd__date(int argc, const char **argv) >> { >> const char *x; >> @@ -120,8 +110,6 @@ int cmd__date(int argc, const char **argv) >> parse_approxidate(argv+1); >> else if (!strcmp(*argv, "timestamp")) >> parse_approx_timestamp(argv+1); >> - else if (!strcmp(*argv, "getnanos")) >> - getnanos(argv+1); >> else if (!strcmp(*argv, "is64bit")) >> return sizeof(timestamp_t) == 8 ? 0 : 1; >> else if (!strcmp(*argv, "time_t-is64bit")) >> diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c >> index 313a153209c..090596ac492 100644 >> --- a/t/helper/test-path-utils.c >> +++ b/t/helper/test-path-utils.c >> @@ -407,27 +407,6 @@ int cmd__path_utils(int argc, const char **argv) >> return !!res; >> } >> >> - if (argc == 4 && !strcmp(argv[1], "skip-n-bytes")) { >> - int fd = open(argv[2], O_RDONLY), offset = atoi(argv[3]); >> - char buffer[65536]; >> - >> - if (fd < 0) >> - die_errno("could not open '%s'", argv[2]); >> - if (lseek(fd, offset, SEEK_SET) < 0) >> - die_errno("could not skip %d bytes", offset); >> - for (;;) { >> - ssize_t count = read(fd, buffer, sizeof(buffer)); >> - if (count < 0) >> - die_errno("could not read '%s'", argv[2]); >> - if (!count) >> - break; >> - if (write(1, buffer, count) < 0) >> - die_errno("could not write to stdout"); >> - } >> - close(fd); >> - return 0; >> - } >> - >> if (argc > 5 && !strcmp(argv[1], "slice-tests")) { >> int res = 0; >> long offset, stride, i; >> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c >> index 1876bad8f42..3fa02011b61 100644 >> --- a/t/helper/test-tool.c >> +++ b/t/helper/test-tool.c >> @@ -73,7 +73,6 @@ static struct test_cmd cmds[] = { >> { "tee", cmd__tee }, >> { "trace2", cmd__trace2 }, >> { "urlmatch-normalization", cmd__urlmatch_normalization }, >> - { "xml-encode", cmd__xml_encode }, >> { "wildmatch", cmd__wildmatch }, >> #ifdef GIT_WINDOWS_NATIVE >> { "windows-named-pipe", cmd__windows_named_pipe }, >> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h >> index 9b3c1f75267..5028730807e 100644 >> --- a/t/helper/test-tool.h >> +++ b/t/helper/test-tool.h >> @@ -63,7 +63,6 @@ int cmd__subprocess(int argc, const char **argv); >> int cmd__tee(int argc, const char **argv); >> int cmd__trace2(int argc, const char **argv); >> int cmd__urlmatch_normalization(int argc, const char **argv); >> -int cmd__xml_encode(int argc, const char **argv); >> int cmd__wildmatch(int argc, const char **argv); >> #ifdef GIT_WINDOWS_NATIVE >> int cmd__windows_named_pipe(int argc, const char **argv); >> diff --git a/t/helper/test-xml-encode.c b/t/helper/test-xml-encode.c >> deleted file mode 100644 >> index a648bbd961c..00000000000 >> --- a/t/helper/test-xml-encode.c >> +++ /dev/null >> @@ -1,80 +0,0 @@ >> -#include "test-tool.h" >> - >> -static const char *utf8_replace_character = "�"; >> - >> -/* >> - * Encodes (possibly incorrect) UTF-8 on <stdin> to <stdout>, to be embedded >> - * in an XML file. >> - */ >> -int cmd__xml_encode(int argc, const char **argv) >> -{ >> - unsigned char buf[1024], tmp[4], *tmp2 = NULL; >> - ssize_t cur = 0, len = 1, remaining = 0; >> - unsigned char ch; >> - >> - for (;;) { >> - if (++cur == len) { >> - len = xread(0, buf, sizeof(buf)); >> - if (!len) >> - return 0; >> - if (len < 0) >> - die_errno("Could not read <stdin>"); >> - cur = 0; >> - } >> - ch = buf[cur]; >> - >> - if (tmp2) { >> - if ((ch & 0xc0) != 0x80) { >> - fputs(utf8_replace_character, stdout); >> - tmp2 = NULL; >> - cur--; >> - continue; >> - } >> - *tmp2 = ch; >> - tmp2++; >> - if (--remaining == 0) { >> - fwrite(tmp, tmp2 - tmp, 1, stdout); >> - tmp2 = NULL; >> - } >> - continue; >> - } >> - >> - if (!(ch & 0x80)) { >> - /* 0xxxxxxx */ >> - if (ch == '&') >> - fputs("&", stdout); >> - else if (ch == '\'') >> - fputs("'", stdout); >> - else if (ch == '"') >> - fputs(""", stdout); >> - else if (ch == '<') >> - fputs("<", stdout); >> - else if (ch == '>') >> - fputs(">", stdout); >> - else if (ch >= 0x20) >> - fputc(ch, stdout); >> - else if (ch == 0x09 || ch == 0x0a || ch == 0x0d) >> - fprintf(stdout, "&#x%02x;", ch); >> - else >> - fputs(utf8_replace_character, stdout); >> - } else if ((ch & 0xe0) == 0xc0) { >> - /* 110XXXXx 10xxxxxx */ >> - tmp[0] = ch; >> - remaining = 1; >> - tmp2 = tmp + 1; >> - } else if ((ch & 0xf0) == 0xe0) { >> - /* 1110XXXX 10Xxxxxx 10xxxxxx */ >> - tmp[0] = ch; >> - remaining = 2; >> - tmp2 = tmp + 1; >> - } else if ((ch & 0xf8) == 0xf0) { >> - /* 11110XXX 10XXxxxx 10xxxxxx 10xxxxxx */ >> - tmp[0] = ch; >> - remaining = 3; >> - tmp2 = tmp + 1; >> - } else >> - fputs(utf8_replace_character, stdout); >> - } >> - >> - return 0; >> -} >> diff --git a/t/test-lib.sh b/t/test-lib.sh >> index 0070d05234b..4dc41eeccc2 100644 >> --- a/t/test-lib.sh >> +++ b/t/test-lib.sh >> @@ -371,6 +371,17 @@ then >> --escape-stdout ${HARNESS_ACTIVE+--escape-file} \ >> "$GIT_TEST_TEE_OUTPUT_FILE" >> test "$(cat "$TEST_RESULTS_BASE.exit")" = 0 >> + if test -n "$write_junit_xml" >> + then >> + junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out" >> + mkdir -p "$junit_xml_dir" >> + junit_xml_base=${0##*/} >> + junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml" >> + junit_attrs="timestamp=\"$(TZ=UTC date +%Y-%m-%dT%H:%M:%S)\"" >> + tap2junit --name="$TEST_NAME" - \ >> + <"$GIT_TEST_TEE_OUTPUT_FILE" \ >> + >"$junit_xml_path" >> + fi >> exit >> elif test -n "$verbose" -a -n "$HARNESS_ACTIVE" >> then >> @@ -588,7 +599,6 @@ export TERM >> >> error () { >> say_color error "error: $*" >> - finalize_junit_xml >> GIT_EXIT_OK=t >> exit 1 >> } >> @@ -672,56 +682,24 @@ trap '{ code=$?; set +x; } 2>/dev/null; exit $code' INT TERM HUP >> # the test_expect_* functions instead. >> >> test_ok_ () { >> - if test -n "$write_junit_xml" >> - then >> - write_junit_xml_testcase "$*" >> - fi >> test_success=$(($test_success + 1)) >> say_color_tap "${verbose:+pass}" "ok $test_count - $@" >> } >> >> test_failure_ () { >> - if test -n "$write_junit_xml" >> - then >> - junit_insert="<failure message=\"not ok $test_count -" >> - junit_insert="$junit_insert $(xml_attr_encode "$1")\">" >> - junit_insert="$junit_insert $(xml_attr_encode \ >> - "$(if test -n "$GIT_TEST_TEE_OUTPUT_FILE" >> - then >> - test-tool path-utils skip-n-bytes \ >> - "$GIT_TEST_TEE_OUTPUT_FILE" $GIT_TEST_TEE_OFFSET >> - else >> - printf '%s\n' "$@" | sed 1d >> - fi)")" >> - junit_insert="$junit_insert</failure>" >> - if test -n "$GIT_TEST_TEE_OUTPUT_FILE" >> - then >> - junit_insert="$junit_insert<system-err>$(xml_attr_encode \ >> - "$(cat "$GIT_TEST_TEE_OUTPUT_FILE")")</system-err>" >> - fi >> - write_junit_xml_testcase "$1" " $junit_insert" >> - fi >> test_failure=$(($test_failure + 1)) >> say_color_tap error "not ok $test_count - $1" >> shift >> printf '%s\n' "$*" | sed -e 's/^/# /' >> - test "$immediate" = "" || { finalize_junit_xml; GIT_EXIT_OK=t; exit 1; } >> + test "$immediate" = "" || { GIT_EXIT_OK=t; exit 1; } >> } >> >> test_known_broken_ok_ () { >> - if test -n "$write_junit_xml" >> - then >> - write_junit_xml_testcase "$* (breakage fixed)" >> - fi >> test_fixed=$(($test_fixed+1)) >> say_color error "ok $test_count - $@ # TODO known breakage vanished" >> } >> >> test_known_broken_failure_ () { >> - if test -n "$write_junit_xml" >> - then >> - write_junit_xml_testcase "$* (known breakage)" >> - fi >> test_broken=$(($test_broken+1)) >> say_color_tap warn "not ok $test_count - $@ # TODO known breakage" >> } >> @@ -983,10 +961,6 @@ test_start_ () { >> test_count=$(($test_count+1)) >> maybe_setup_verbose >> maybe_setup_valgrind >> - if test -n "$write_junit_xml" >> - then >> - junit_start=$(test-tool date getnanos) >> - fi >> } >> >> test_finish_ () { >> @@ -1029,13 +1003,6 @@ test_skip () { >> >> case "$to_skip" in >> t) >> - if test -n "$write_junit_xml" >> - then >> - message="$(xml_attr_encode "$skipped_reason")" >> - write_junit_xml_testcase "$1" \ >> - " <skipped message=\"$message\" />" >> - fi >> - >> say_color_tap skip "ok $test_count # skip $1 ($skipped_reason)" >> : true >> ;; >> @@ -1050,53 +1017,6 @@ test_at_end_hook_ () { >> : >> } >> >> -write_junit_xml () { >> - case "$1" in >> - --truncate) >> - >"$junit_xml_path" >> - junit_have_testcase= >> - shift >> - ;; >> - esac >> - printf '%s\n' "$@" >>"$junit_xml_path" >> -} >> - >> -xml_attr_encode () { >> - printf '%s\n' "$@" | test-tool xml-encode >> -} >> - >> -write_junit_xml_testcase () { >> - junit_attrs="name=\"$(xml_attr_encode "$this_test.$test_count $1")\"" >> - shift >> - junit_attrs="$junit_attrs classname=\"$this_test\"" >> - junit_attrs="$junit_attrs time=\"$(test-tool \ >> - date getnanos $junit_start)\"" >> - write_junit_xml "$(printf '%s\n' \ >> - " <testcase $junit_attrs>" "$@" " </testcase>")" >> - junit_have_testcase=t >> -} >> - >> -finalize_junit_xml () { >> - if test -n "$write_junit_xml" && test -n "$junit_xml_path" >> - then >> - test -n "$junit_have_testcase" || { >> - junit_start=$(test-tool date getnanos) >> - write_junit_xml_testcase "all tests skipped" >> - } >> - >> - # adjust the overall time >> - junit_time=$(test-tool date getnanos $junit_suite_start) >> - sed -e "s/\(<testsuite.*\) time=\"[^\"]*\"/\1/" \ >> - -e "s/<testsuite [^>]*/& time=\"$junit_time\"/" \ >> - -e '/^ *<\/testsuite/d' \ >> - <"$junit_xml_path" >"$junit_xml_path.new" >> - mv "$junit_xml_path.new" "$junit_xml_path" >> - >> - write_junit_xml " </testsuite>" "</testsuites>" >> - write_junit_xml= >> - fi >> -} >> - >> test_atexit_cleanup=: >> test_atexit_handler () { >> # In a succeeding test script 'test_atexit_handler' is invoked >> @@ -1119,8 +1039,6 @@ test_done () { >> # removed, so the commands can access pidfiles and socket files. >> test_atexit_handler >> >> - finalize_junit_xml >> - >> if test -z "$HARNESS_ACTIVE" >> then >> mkdir -p "$TEST_RESULTS_DIR" >> @@ -1364,23 +1282,6 @@ then >> test_done >> fi >> >> -if test -n "$write_junit_xml" >> -then >> - junit_xml_dir="$TEST_OUTPUT_DIRECTORY/out" >> - mkdir -p "$junit_xml_dir" >> - junit_xml_base=${0##*/} >> - junit_xml_path="$junit_xml_dir/TEST-${junit_xml_base%.sh}.xml" >> - junit_attrs="name=\"${junit_xml_base%.sh}\"" >> - junit_attrs="$junit_attrs timestamp=\"$(TZ=UTC \ >> - date +%Y-%m-%dT%H:%M:%S)\"" >> - write_junit_xml --truncate "<testsuites>" " <testsuite $junit_attrs>" >> - junit_suite_start=$(test-tool date getnanos) >> - if test -n "$GIT_TEST_TEE_OUTPUT_FILE" >> - then >> - GIT_TEST_TEE_OFFSET=0 >> - fi >> -fi >> - >> # Convenience >> # A regexp to match 5, 35 and 40 hexdigits >> _x05='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]' >> -- >> 2.31.0.rc1.210.g0f8085a843c >> >>