On Fri, Dec 07, 2018 at 12:10:05AM +0100, SZEDER Gábor wrote: > On Wed, Dec 05, 2018 at 04:56:21PM -0500, Jeff King wrote: > > Could we just kill them all? > > > > I guess it's a little tricky, because $! is going to give us the pid of > > each subshell. We actually want to kill the test sub-process. That takes > > a few contortions, but the output is nice (every sub-job immediately > > says "ABORTED ...", and the final process does not exit until the whole > > tree is done): > > It's trickier than that: > > - Nowadays our test lib translates SIGINT to exit, but not TERM (or > HUP, in case a user decides to close the terminal window), which > means that if the test runs any daemons in the background, then > those won't be killed when the stress test is aborted. > > This is easy enough to address, and I've been meaning to do this > in an other patch series anyway. Yeah, trapping on more signals seems reasonable (or just propagating INT instead of TERM). I'm more worried about this one: > - With the (implied) '--verbose-log' the process killed in the > background subshell's SIGTERM handler is not the actual test > process, because '--verbose-log' runs the test again in a piped > subshell to save its output: Hmm, yeah. The patch I sent earlier already propagates through the subshell, so this is really just another layer there. I.e., would it be reasonable when we are using --verbose-log (or --tee) for the parent process to propagate signals down to child it spawned? That would fix this case, and it would make things more predictable in general (e.g., a user who manually runs kill). It does feel like a lot of boilerplate to be propagating these signals manually. I think the "right" Unix-y solution here is process groups: each sub-job should get its own group, and then the top-level runner script can kill the whole group. We may run into portability issues, though (setsid is in util-linux, but I don't know if there's an equivalent elsewhere; a small perl or C helper could do it, but I have no idea what that would mean on Windows). > (GIT_TEST_TEE_STARTED=done ${TEST_SHELL_PATH} "$0" "$@" 2>&1; > echo $? >"$TEST_RESULTS_BASE.exit") | tee -a "$GIT_TEST_TEE_OUTPUT_FILE" > > That 'kill' kills the "outer" shell process. > And though I get "ABORTED..." immediately and I get back my > prompt right away, the commands involved in the above construct > are still running in the background: > > $ ./t3903-stash.sh --stress=1 > ^CABORTED 0.0 > $ ps a |grep t3903 > 1280 pts/17 S 0:00 /bin/sh ./t3903-stash.sh --stress=1 > 1281 pts/17 S 0:00 tee -a <...>/test-results/t3903-stash.stress-0.out > 1282 pts/17 S 0:00 /bin/sh ./t3903-stash.sh --stress=1 > 4173 pts/17 S+ 0:00 grep t3903 > > I see this behavior with all shells I tried. > I haven't yet started to think it through why this happens :) I think you get ABORTED because we are just watching for the parent-process to exit (and reporting its status). The fact that that the subshell (and the tee) are still running is not important. That all makes sense to me. > Not implying '--verbose-log' but redirecting the test script's > output, like you did in your 'stress' script, seems to work in > dash, ksh, and ks93, but not in Bash v4.3 or later, where, for > whatever reason, the subshell get SIGINT before the SIGTERM would > arrive. > While we could easily handle SIGINT in the subshell with the same > signal handler as SIGTERM, it bothers me that something > fundamental is wrong here. Yeah, I don't understand the SIGINT behavior you're seeing with bash. I can't replicate it with bash 4.4.23 (from Debian unstable). -Peff