Hi, this patch series addresses a couple of issues I've found while investigating flaky CI jobs. Besides two more fixes for flaky jobs it also removes some stale code and simplifies the setup on GitHub Actions to always use containerized jobs on Linux. Test runs can be found for GitLab [1] and GitHub [2]. Changes in v2: - Expand a bit on the reasoning behind the conversion to use containerized jobs. - Fix commit message typo. - Properly fix the race in t7422 via pipe stuffing, as proposed by Peff. - Link to v1: https://lore.kernel.org/r/20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@xxxxxx Changes in v3: - Another iteration on the SIGPIPE test, which should now finally plug the race. - Link to v2: https://lore.kernel.org/r/20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@xxxxxx Changes in v4: - Improve the commit message of the SIGPIPE test commit to more accurately describe the race. - Link to v3: https://lore.kernel.org/r/20250107-b4-pks-ci-fixes-v3-0-546a0ebc8481@xxxxxx Thanks! Patrick [1]: https://gitlab.com/gitlab-org/git/-/merge_requests/277 [2]: https://github.com/git/git/pull/1865 --- Patrick Steinhardt (10): t0060: fix EBUSY in MinGW when setting up runtime prefix t7422: fix flaky test caused by buffered stdout github: adapt containerized jobs to be rootless github: convert all Linux jobs to be containerized github: simplify computation of the job's distro gitlab-ci: remove the "linux-old" job gitlab-ci: add linux32 job testing against i386 ci: stop special-casing for Ubuntu 16.04 ci: use latest Ubuntu release ci: remove stale code for Azure Pipelines .github/workflows/main.yml | 78 ++++++++++++++++++++++----------------------- .gitlab-ci.yml | 19 ++++++----- ci/install-dependencies.sh | 6 ++-- ci/lib.sh | 34 +++----------------- ci/print-test-failures.sh | 5 --- t/t0060-path-utils.sh | 10 +++--- t/t7422-submodule-output.sh | 43 ++++++++++++++++++++++--- 7 files changed, 100 insertions(+), 95 deletions(-) Range-diff versus v3: 1: 324b174988 = 1: ca4dc636aa t0060: fix EBUSY in MinGW when setting up runtime prefix 2: cc095aa2b1 ! 2: d41c00feb1 t7422: fix flaky test caused by buffered stdout @@ Commit message error: last command exited with $?=1 not ok 18 - git submodule status --recursive propagates SIGPIPE - The issue is caused by us using grep(1) to terminate the pipe on the - first matching line in the recursing git-submodule(1) process. Standard - streams are typically buffered though, so this condition is racy and may - cause us to terminate the pipe after git-submodule(1) has already - exited, and in that case we wouldn't see the expected signal. + The issue is caused by a race between git-submodule(1) and grep(1): + + 1. git-submodule(1) (or its child process) writes the first X/S line + we're trying to match. + + 2. grep(1) matches the line. + + 3a. grep(1) exits, closing the pipe. + + 3b. git-submodule(1) (or its child process) writes the rest of its + lines. + + Steps 3a and 3b happen at the same time without any guarantees. If 3a + happens first, we get SIGPIPE. Otherwise, we don't and the test fails. Fix the issue by generating a couple thousand nested submodules and matching on the first nested submodule. This ensures that the recursive git-submodule(1) process completely fills its stdout buffer, which makes subsequent writes block until the downstream consumer of the pipe either - fully drains it or closes it. + reads more or closes it. To verify that this works as expected one can apply the following patch to the preimage of this commit, which used to reliably trigger the race: 3: 73c98e7628 = 3: 6593e3307c github: adapt containerized jobs to be rootless 4: 4d8f2bdce7 = 4: 9997698c6e github: convert all Linux jobs to be containerized 5: 4a987aa42e = 5: 65797c51dc github: simplify computation of the job's distro 6: bd44700668 = 6: 687ae0c3f2 gitlab-ci: remove the "linux-old" job 7: e095c6757c = 7: 974229776b gitlab-ci: add linux32 job testing against i386 8: f885740877 = 8: 112ea61a6b ci: stop special-casing for Ubuntu 16.04 9: 17b19dc51e = 9: 465cd85898 ci: use latest Ubuntu release 10: 95ce4406c7 = 10: d671ee1f7f ci: remove stale code for Azure Pipelines --- base-commit: 1b4e9a5f8b5f048972c21fe8acafe0404096f694 change-id: 20250103-b4-pks-ci-fixes-2d0a23fb5c78