On Mon, Jan 11, 2016 at 01:29:12PM +0000, Dave Gordon wrote:
> On 11/01/16 12:34, Chris Wilson wrote:
> >On Mon, Jan 11, 2016 at 12:25:07PM +0000, Dave Gordon wrote:
> >>On 11/01/16 09:06, Daniel Vetter wrote:
> >>>On Mon, Jan 11, 2016 at 08:54:59AM +0000, Chris Wilson wrote:
> >>>>On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
> >>>>>On Fri, Jan 08, 2016 at 08:44:29AM +0000, Chris Wilson wrote:
> >>>>>>Some stress tests create both the signal helper and a lot of competing
> >>>>>>processes. In these tests, the parent is just waiting upon the children,
> >>>>>>and the intention is not to keep waking up the waiting parent, but to
> >>>>>>keep interrupting the children (as we hope to trigger races in our
> >>>>>>kernel code). kill(-pid) sends the signal to all members of the process
> >>>>>>group, not just the target pid.
> >>>>>
> >>>>>I don't really have any clue about unix pgroups, but the -pid disappeared
> >>>>>compared to the previous version.
> >>>>
> >>>>-getppid().
> >>>>
> >>>>I felt it was clearer to pass along the "negative pid = process group"
> >>>>after setting up the process group.
> >>>
> >>>Oh, I was blind ... Yeah looks better, but please add a bigger comment
> >>>around that code explaining why we need a group and why we use SIGCONT.
> >>>With that acked-by: me.
> >>>
> >>>Cheers, Daniel
> >>>
> >>>>>>We also switch from using SIGUSR1 to SIGCONT to paper over a race
> >>>>>>condition when forking children that saw the default signal action being
> >>>>>>run (and thus killing the child).
> >>>>>
> >>>>>I thought I fixed that race by first installing the new signal handler,
> >>>>>then forking. Ok, rechecked and it's the SYS_getpid stuff, so another
> >>>>>race. Still I thought signal handlers would survive a fork?
> >>>>
> >>>>So did irc. They didn't appear to as the children would sporadically
> >>>>die with SIGUSR1.
> >>>
> >>>Could be that libc is doing something funny, iirc they have piles of fork
> >>>helpers to make fork more reliable (breaking locks and stuff like that),
> >>>but then in turn break the abstraction.
> >>>-Daniel
> >>
> >>You could use killpg(pgrp, sig) rather than kill(), just to make it
> >>clearer that the target is a process group, rather than people
> >>having to know about the "negative pid" semantics.
> >>
> >>I don't think SIGCHLD is a good idea; it has kernel-defined
> >>semantics beyond just sending a signal. And it may not be delivered
> >>at all, if the disposition is not "caught". SIGUSR1 was the right
> >>thing, really; so it would be better to work out how to make that
> >>work properly, rather than change to a different one.
> >
> >SIGCONT not SIGCHLD. And the disposition is supposed to be fully under
> >our control anyway.
>
> Oops, yes, I meant SIGCONT has kernel-defined semantics, etc ...
>
> Catching SIGCONT is ... unusual. Because you can't catch SIGSTOP,
> you don't normally have any reason to catch SIGCONT.
>
> Actually, SIGCONT is even more bizarre than SIGCHLD as sending a
> SIGCONT to a process can result in a SIGCHLD being sent to its
> parent.

Really? That's a nuisance but nothing more. I'm only trying to paper
over a bug here :)

> >>Signal handlers are (supposed to be) inherited across fork(); signal
> >>disposition is also inherited, and the set of pending signals of a
> >>new process is (supposed to be) empty. OTOH a signal can be
> >>delivered to the child before it returns from the fork(), which may
> >>be a bit surprising.
> >>
> >>I think the safest way to avoid unexpected signals around a fork() is:
> >>
> >>parent calls sigprocmask() to block all interesting signals
> >>parent calls fork() --> child inherits mask
> >>parent calls sigprocmask() to restore the previous mask
> >
> >I tried that.
> >-Chris
>
> Are we using signal(2) to install the handlers? 'Cos that's archaic
> and has known unfixable race conditions.
> The Linux kernel supplies SysV signal semantics by default, which
> means the disposition gets reset before the handler is called, so a
> double signal kills the program. The glibc signal(3) wrapper provides
> BSD semantics which are slightly less problematic; but libc5
> signal(3) implements SysV.

We are using both, but for the sighelper interrupt we are using
signal() - though a long time before we fork the test children. Worth a
shot as much as anything else...
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
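
For reference, the three suggestions made in the thread (install the
handler with sigaction() rather than signal(), block the signal around
fork(), and address the group with killpg() instead of a negative pid)
could be combined roughly as below. This is only a sketch under
assumptions, not the actual test-suite code: signal_child_group(),
on_signal() and got_signal are hypothetical names, and for simplicity the
parent signals a child placed in its own process group, whereas the patch
under discussion has the helper signal its parent's group.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose sigaction(), killpg(), setpgid() */
#endif
#include <assert.h>         /* for the caller's assert(), see note below */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal;

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Hypothetical helper: returns 0 if the child saw SIGUSR1, -1 otherwise. */
int signal_child_group(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    pid_t pid;
    int status = 0, ok;

    /* sigaction() keeps the handler installed across deliveries. */
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGUSR1, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGUSR1 across fork(); the child inherits the blocked mask. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    pid = fork();
    if (pid == 0) {
        /* Child: join its own group, then wait with SIGUSR1 unblocked.
         * A signal sent earlier stays pending until sigsuspend(). */
        setpgid(0, 0);
        wait_mask = old_mask;
        sigdelset(&wait_mask, SIGUSR1);
        while (!got_signal)
            sigsuspend(&wait_mask);
        _exit(0);
    }

    /* Parent: restore the previous mask, as suggested in the thread. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    ok = pid > 0;
    if (ok) {
        /* Same call as in the child, so the group exists whichever
         * of the two processes runs first. */
        setpgid(pid, pid);
        if (killpg(pid, SIGUSR1) < 0)
            kill(pid, SIGKILL);     /* do not leave the child waiting */
        ok = waitpid(pid, &status, 0) == pid &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0;
    }
    sigaction(SIGUSR1, &old_sa, NULL);
    return ok ? 0 : -1;
}
```

A caller could check the result with assert(signal_child_group() == 0).
Because the child inherits the blocked mask, a signal that arrives before
it is ready stays pending instead of running whatever disposition happens
to be in place at that moment. Whether this avoids the sporadic SIGUSR1
deaths described above is not established by this thread.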