On Mon, Jun 03, 2024 at 06:22:32PM +0100, Mark Brown wrote:
> On Mon, Jun 03, 2024 at 05:27:52PM +0100, Mark Brown wrote:
> > On Mon, May 27, 2024 at 08:07:40PM +0100, Mark Brown wrote:
>
> > > This is now in mainline and appears to be causing several tests (at
> > > least the ptrace vmaccess global_attach test on arm64, possibly also
> > > some of the epoll tests) that previously were timed out by the harness
> > > to hang instead.  A bisect seems to point at this patch in
> > > particular, there was a bunch of discussion of the fallout of these
> > > patches but I'm afraid I lost track of it, is there something in flight
> > > for this?  -next is affected as well from the looks of it.

Thanks for the heads up.  I warned about not being able to test
everything when fixing kselftest last time, but nobody showed up.  Is
there an easy way to run most kselftests?  We really need a (more
accessible) CI...

>
> > FWIW I'm still seeing this on -rc2...
>
> AFAICT this is due to the switch to using clone3() with CLONE_VFORK

I guess it started with the previous vfork() that was later replaced
with CLONE_VFORK.

> to start the test which means we never even call alarm() to set up the
> timeout for the test, let alone have the signal for it delivered.  I'm
> confused about how this could ever work, with clone_vfork() the parent
> shouldn't run until the child execs (which won't happen here) or exits.
> Since we don't call alarm() until after we started the child we never
> actually get that far, but even if we reorder things we'll not get the
> signal for the alarm if the child messes up since the parent is
> suspended.
>
> I'm not clear what the original race being fixed here was but it seems
> like we should revert this since the timeout functionality is pretty
> important?

It took me a while to fix all the previous issues and it would be much
easier to just fix this issue too.  I'm working on it.
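
For reference, the suspension described in the quoted text can be
reproduced outside the harness.  The following is only a minimal sketch
under stated assumptions, not the kselftest harness code: plain vfork()
stands in for clone3() with CLONE_VFORK, parent_blocked_ms() is a
hypothetical helper name, and the child calls nanosleep(), which goes
beyond what POSIX allows in a vfork() child but works on Linux for
illustration.  It measures how long the parent is kept from running
while the child neither execs nor exits.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the kselftest harness: returns how
 * many milliseconds the parent stayed suspended in vfork() while the
 * child slept for child_ms, or -1 if vfork() failed.
 */
long parent_blocked_ms(long child_ms)
{
	struct timespec start, end;
	struct timespec nap = {
		.tv_sec = child_ms / 1000,
		.tv_nsec = (child_ms % 1000) * 1000000L,
	};
	pid_t pid;

	clock_gettime(CLOCK_MONOTONIC, &start);
	pid = vfork();
	if (pid == 0) {
		/* Child: stands in for a test body that does not exec. */
		nanosleep(&nap, NULL);
		_exit(0);
	}
	/* The parent only gets here once the child has exited. */
	clock_gettime(CLOCK_MONOTONIC, &end);
	if (pid < 0)
		return -1;
	waitpid(pid, NULL, 0);

	return (end.tv_sec - start.tv_sec) * 1000 +
	       (end.tv_nsec - start.tv_nsec) / 1000000;
}
```

Calling parent_blocked_ms(200) should report a value of about 200 or
more, since the parent cannot run for the whole time the child sleeps.
Any alarm() call placed after the vfork() in the parent would likewise
only execute once the child is already gone.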