Ivan Delalande <colona@xxxxxxxxxx> writes:

> Hi Eric,
>
> On Thu, Feb 07, 2019 at 11:13:59PM -0600, Eric W. Biederman wrote:
>> I just noticed this. From my patch queue that I intend to send to
>> Linus tomorrow. I think this change fixes your issue of getting
>> the SIGSEGV instead of the already pending fatal signal.
>>
>> So I think this fixes your issue without any other code changes.
>> Ivan can you verify that the patch below is enough?
>
> I was having issues with just this patch applied on top of v5.0-rc5 or
> the latest master: defunct processes accumulating, exiting processes
> that would hang forever, and some kernel functions eating all the CPU
> (setup_sigcontext, common_interrupt, __clear_user, do_signal…).
>
> But using your user-namespace.git/for-linus worked great and I've been
> running my reproducer for a few hours now without issue. I'll probably
> keep it running over the week-end as it has been unreliable at times,
> but it looks promising so far.

Sounds good. Thank you for finding my tree, and thank you for testing.

> A difference I've noticed with your tree (unrelated to my issue here but
> that you may want to look at) is when I run my reproducer under
> strace -f, I'm now getting quite a lot of "Exit of unknown pid 12345
> ignored" warnings from strace, which I've never seen with mainline.
> My reproducer simply fork-exec tail processes in a loop, and tries to
> sigkill them in the parent with a variable delay.

What was your base tree?

My best guess is that your SIGKILL is getting there before strace
realizes the process has been forked. If we can understand the race,
it is probably worth fixing.

Any chance you can post your reproducer?

It is possible that my most recent fixes caused this, or that something
changed between the tree you were testing and the tree you are working on.

Eric