Re: [PATCH igt] core/sighelper: Interrupt everyone in the process group

Dave Gordon <david.s.gordon@xxxxxxxxx> · Mon, 11 Jan 2016 13:29:12 +0000

On 11/01/16 12:34, Chris Wilson wrote:
On Mon, Jan 11, 2016 at 12:25:07PM +0000, Dave Gordon wrote:
On 11/01/16 09:06, Daniel Vetter wrote:
On Mon, Jan 11, 2016 at 08:54:59AM +0000, Chris Wilson wrote:
On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
On Fri, Jan 08, 2016 at 08:44:29AM +0000, Chris Wilson wrote:
Some stress tests create both the signal helper and a lot of competing
processes. In these tests, the parent is just waiting upon the children,
and the intention is not to keep waking up the waiting parent, but to
keep interrupting the children (as we hope to trigger races in our
kernel code). kill(-pid) sends the signal to all members of the process
group, not just the target pid.

I don't really have any clue about unix pgroups, but the -pid disappeared
compared to the previous version.

-getppid().

I felt it was clearer to pass along the "negative pid = process group"
after setting up the process group.

Oh, I was blind ... Yeah looks better, but please add a bigger comment
around that code explaining why we need a group and why we use SIG_CONT.
With that acked-by: me.

Cheers, Daniel

We also switch from using SIGUSR1 to SIGCONT to paper over a race
condition when forking children that saw the default signal action being
run (and thus killing the child).

I thought I fixed that race by first installing the new signal handler,
then forking. Ok, rechecked and it's the SYS_getpid stuff, so another
race. Still I thought signal handlers would survive a fork?

So did irc. They didn't appear to as the children would sporadically
die with SIGUSR1.

Could be that libc is doing something funny, iirc they have piles of fork
helpers to make fork more reliable (breaking locks and stuff like that),
but then in turn break the abstraction.
-Daniel

You could use killpg(pgrp, sig) rather than kill(), just to make it
clearer that the target is a process group, rather than people
having to know about the "negative pid" semantics.

I don't think SIGCHLD is a good idea; it has kernel-defined
semantics beyond just sending a signal. And it may not be delivered
at all, if the disposition is not "caught". SIGUSR1 was the right
thing, really; so it would be better to work out how to make that
work properly, rather than change to a different one.

SIGCONT not SIGCHLD. And the deposition is supposed to be fully under
our control any way.

Oops, yes, I meant SIGCONT has kernel-defined semantics, etc ...

Catching SIGCONT is ... unusual. Because you can't catch SIGSTOP, you 
don't normally have any reason to catch SIGCONT.

Actually, SIGCONT is even more bizarre than SIGCHLD as sending a SIGCONT 
to a process can result in a SIGCHLD being sent to its parent.

Signal handlers are (supposed to be) inherited across fork(); signal
disposition is also inherited, and the set of pending signals of a
new process is (supposed to be) empty. OTOH a signal can be
delivered to the child before it returns from the fork(), which may
be a bit surprising.

I think the safest way to avoid unexpected signals around a fork() is:

parent calls sigprocmask() to block all interesting signals
parent calls fork() --> child inherits mask
parent calls sigprocmask() to restore the previous mask

I tried that.
-Chris

Are we using signal(2) to install the handlers? 'Cos that's archaic and 
has known unfixable race conditions. The Linux kernel supplies SysV 
signal semantics by default, which means the disposition gets reset 
before the handler is called, so a double signal kills the program. The 
glibc signal(3) wrapper provides BSD semantics which are slightly less 
problematic; but libc5 signal(3) implements SysV.

The proper answer is usually to use sigaction(2) instead. Then the race 
conditions don't (shouldn't) occur, at least if the user code gets all 
the options right.

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx