Hi Junio, I notice that you did not take this into `seen` yet. I find that a little sad because it would potentially have helped others to figure out the failure in the latest `seen`: https://github.com/git/git/runs/5255378056?check_suite_focus=true#step:5:162 Essentially, a recent patch introduces hard-coded SHA-1 hashes in t3007.3. Ciao, Dscho On Mon, 24 Jan 2022, Johannes Schindelin via GitGitGadget wrote: > > Background > ========== > > Recent patches intended to help readers figure out CI failures much quicker > than before. Unfortunately, they haven't been entirely positive for me. For > example, they broke the branch protections in Microsoft's fork of Git, where > we require Pull Requests to pass a certain set of Checks (which are > identified by their names) and therefore caused follow-up work. > > Using CI and in general making it easier for new contributors is an area I'm > passionate about, and one I'd like to see improved. > > > The current situation > ===================== > > Let me walk you through the current experience when a PR build fails: I get > a notification mail that only says that a certain job failed. There's no > indication of which test failed (or was it the build?). I can click on a > link at it takes me to the workflow run. Once there, all it says is "Process > completed with exit code 1", or even "code 2". Sure, I can click on one of > the failed jobs. It even expands the failed step's log (collapsing the other > steps). And what do I see there? > > Let's look at an example of a failed linux-clang (ubuntu-latest) job > [https://github.com/git-for-windows/git/runs/4822802185?check_suite_focus=true]: > > [...] > Test Summary Report > ------------------- > t1092-sparse-checkout-compatibility.sh (Wstat: 256 Tests: 53 Failed: 1) > Failed test: 49 > Non-zero exit status: 1 > t3701-add-interactive.sh (Wstat: 0 Tests: 71 Failed: 0) > TODO passed: 45, 47 > Files=957, Tests=25489, 645 wallclock secs ( 5.74 usr 1.56 sys + 866.28 cusr 364.34 csys = 1237.92 CPU) > Result: FAIL > make[1]: *** [Makefile:53: prove] Error 1 > make[1]: Leaving directory '/home/runner/work/git/git/t' > make: *** [Makefile:3018: test] Error 2 > > > That's it. I count myself lucky not to be a new contributor being faced with > something like this. > > Now, since I am active in the Git project for a couple of days or so, I can > make sense of the "TODO passed" label and know that for the purpose of > fixing the build failures, I need to ignore this, and that I need to focus > on the "Failed test" part instead. > > I also know that I do not have to get myself an ubuntu-latest box just to > reproduce the error, I do not even have to check out the code and run it > just to learn what that "49" means. > > I know, and I do not expect any new contributor, not even most seasoned > contributors to know, that I have to patiently collapse the "Run > ci/run-build-and-tests.sh" job's log, and instead expand the "Run > ci/print-test-failures.sh" job log (which did not fail and hence does not > draw any attention to it). > > I know, and again: I do not expect many others to know this, that I then > have to click into the "Search logs" box (not the regular web browser's > search via Ctrl+F!) and type in "not ok" to find the log of the failed test > case (and this might still be a "known broken" one that is marked via > test_expect_failure and once again needs to be ignored). > > To be excessively clear: This is not a great experience! > > > Improved output > =============== > > Our previous Azure Pipelines-based CI builds had a much nicer UI, one that > even showed flaky tests, and trends e.g. how long the test cases ran. When I > ported Git's CI over to GitHub workflows (to make CI more accessible to new > contributors), I knew fully well that we would leave this very nice UI > behind, and I had hoped that we would get something similar back via new, > community-contributed GitHub Actions that can be used in GitHub workflows. > However, most likely because we use a home-grown test framework implemented > in opinionated POSIX shells scripts, that did not happen. > > So I had a look at what standards exist e.g. when testing PowerShell > modules, in the way of marking up their test output in GitHub workflows, and > I was not disappointed: GitHub workflows support "grouping" of output lines, > i.e. marking sections of the output as a group that is then collapsed by > default and can be expanded. And it is this feature I decided to use in this > patch series, along with GitHub workflows' commands to display errors or > notices that are also shown on the summary page of the workflow run. Now, in > addition to "Process completed with exit code" on the summary page, we also > read something like: > > ⊗ linux-gcc (ubuntu-latest) > failed: t9800.20 submit from detached head > > > Even better, this message is a link, and following that, the reader is > presented with something like this > [https://github.com/dscho/git/runs/4840190622?check_suite_focus=true]: > > ⏵ Run ci/run-build-and-tests.sh > ⏵ CI setup > + ln -s /home/runner/none/.prove t/.prove > + run_tests=t > + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > + group Build make > + set +x > ⏵ Build > ⏵ Run tests > === Failed test: t9800-git-p4-basic === > ⏵ ok: t9800.1 start p4d > ⏵ ok: t9800.2 add p4 files > ⏵ ok: t9800.3 basic git p4 clone > ⏵ ok: t9800.4 depot typo error > ⏵ ok: t9800.5 git p4 clone @all > ⏵ ok: t9800.6 git p4 sync uninitialized repo > ⏵ ok: t9800.7 git p4 sync new branch > ⏵ ok: t9800.8 clone two dirs > ⏵ ok: t9800.9 clone two dirs, @all > ⏵ ok: t9800.10 clone two dirs, @all, conflicting files > ⏵ ok: t9800.11 clone two dirs, each edited by submit, single git commit > ⏵ ok: t9800.12 clone using non-numeric revision ranges > ⏵ ok: t9800.13 clone with date range, excluding some changes > ⏵ ok: t9800.14 exit when p4 fails to produce marshaled output > ⏵ ok: t9800.15 exit gracefully for p4 server errors > ⏵ ok: t9800.16 clone --bare should make a bare repository > ⏵ ok: t9800.17 initial import time from top change time > ⏵ ok: t9800.18 unresolvable host in P4PORT should display error > ⏵ ok: t9800.19 run hook p4-pre-submit before submit > Error: failed: t9800.20 submit from detached head > ⏵ failure: t9800.20 submit from detached head > Error: failed: t9800.21 submit from worktree > ⏵ failure: t9800.21 submit from worktree > === Failed test: t9801-git-p4-branch === > [...] > > > The "Failed test:" lines are colored in yellow to give a better visual clue > about the logs' structure, the "Error:" label is colored in red to draw the > attention to the important part of the log, and the "⏵" characters indicate > that part of the log is collapsed and can be expanded by clicking on it. > > To drill down, the reader merely needs to expand the (failed) test case's > log by clicking on it, and then study the log. If needed (e.g. when the test > case relies on side effects from previous test cases), the logs of preceding > test cases can be expanded as well. In this example, when expanding > t9800.20, it looks like this (for ease of reading, I cut a few chunks of > lines, indicated by "[...]"): > > [...] > ⏵ ok: t9800.19 run hook p4-pre-submit before submit > Error: failed: t9800.20 submit from detached head > ⏷ failure: t9800.20 submit from detached head > test_when_finished cleanup_git && > git p4 clone --dest="$git" //depot && > ( > cd "$git" && > git checkout p4/master && > >detached_head_test && > git add detached_head_test && > git commit -m "add detached_head" && > git config git-p4.skipSubmitEdit true && > git p4 submit && > git p4 rebase && > git log p4/master | grep detached_head > ) > [...] > Depot paths: //depot/ > Import destination: refs/remotes/p4/master > > Importing revision 9 (100%)Perforce db files in '.' will be created if missing... > Perforce db files in '.' will be created if missing... > > Traceback (most recent call last): > File "/home/runner/work/git/git/git-p4", line 4455, in <module> > main() > File "/home/runner/work/git/git/git-p4", line 4449, in main > if not cmd.run(args): > File "/home/runner/work/git/git/git-p4", line 2590, in run > rebase.rebase() > File "/home/runner/work/git/git/git-p4", line 4121, in rebase > if len(read_pipe("git diff-index HEAD --")) > 0: > File "/home/runner/work/git/git/git-p4", line 297, in read_pipe > retcode, out, err = read_pipe_full(c, *k, **kw) > File "/home/runner/work/git/git/git-p4", line 284, in read_pipe_full > p = subprocess.Popen( > File "/usr/lib/python3.8/subprocess.py", line 858, in __init__ > self._execute_child(args, executable, preexec_fn, close_fds, > File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child > raise child_exception_type(errno_num, err_msg, err_filename) > FileNotFoundError: [Errno 2] No such file or directory: 'git diff-index HEAD --' > error: last command exited with $?=1 > + cleanup_git > + retry_until_success rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git > + nr_tries_left=60 > + rm -r /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git > + test_path_is_missing /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git > + test 1 -ne 1 > + test -e /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git > + retry_until_success mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git > + nr_tries_left=60 > + mkdir /home/runner/work/git/git/t/trash directory.t9800-git-p4-basic/git > + exit 1 > + eval_ret=1 > + : > not ok 20 - submit from detached head > # > # test_when_finished cleanup_git && > # git p4 clone --dest="$git" //depot && > # ( > # cd "$git" && > # git checkout p4/master && > # >detached_head_test && > # git add detached_head_test && > # git commit -m "add detached_head" && > # git config git-p4.skipSubmitEdit true && > # git p4 submit && > # git p4 rebase && > # git log p4/master | grep detached_head > # ) > # > Error: failed: t9800.21 submit from worktree > [...] > > > Is this the best UI we can have for test failures in CI runs? I hope we can > do better. Having said that, this patch series presents a pretty good start, > and offers a basis for future improvements. > > Johannes Schindelin (9): > ci: fix code style > ci/run-build-and-tests: take a more high-level view > ci: make it easier to find failed tests' logs in the GitHub workflow > ci/run-build-and-tests: add some structure to the GitHub workflow > output > tests: refactor --write-junit-xml code > test(junit): avoid line feeds in XML attributes > ci: optionally mark up output in the GitHub workflow > ci: use `--github-workflow-markup` in the GitHub workflow > ci: call `finalize_test_case_output` a little later > > .github/workflows/main.yml | 12 --- > ci/lib.sh | 81 ++++++++++++++-- > ci/run-build-and-tests.sh | 11 ++- > ci/run-test-slice.sh | 5 +- > t/test-lib-functions.sh | 4 +- > t/test-lib-github-workflow-markup.sh | 50 ++++++++++ > t/test-lib-junit.sh | 132 +++++++++++++++++++++++++++ > t/test-lib.sh | 128 ++++---------------------- > 8 files changed, 287 insertions(+), 136 deletions(-) > create mode 100644 t/test-lib-github-workflow-markup.sh > create mode 100644 t/test-lib-junit.sh > > > base-commit: af4e5f569bc89f356eb34a9373d7f82aca6faa8a > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1117%2Fdscho%2Fuse-grouping-in-ci-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1117/dscho/use-grouping-in-ci-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/1117 > -- > gitgitgadget > >