Re: [PATCH 0/6] Allow signals for IO threads

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 26.03.21 um 16:10 schrieb Jens Axboe:
> On 3/26/21 9:08 AM, Stefan Metzmacher wrote:
>> Am 26.03.21 um 15:55 schrieb Jens Axboe:
>>> On 3/26/21 8:53 AM, Jens Axboe wrote:
>>>> On 3/26/21 8:45 AM, Stefan Metzmacher wrote:
>>>>> Am 26.03.21 um 15:43 schrieb Stefan Metzmacher:
>>>>>> Am 26.03.21 um 15:38 schrieb Jens Axboe:
>>>>>>> On 3/26/21 7:59 AM, Jens Axboe wrote:
>>>>>>>> On 3/26/21 7:54 AM, Jens Axboe wrote:
>>>>>>>>>> The KILL after STOP deadlock still exists.
>>>>>>>>>
>>>>>>>>> In which tree? Sounds like you're still on the old one with that
>>>>>>>>> incremental you sent, which wasn't complete.
>>>>>>>>>
>>>>>>>>>> Does io_wq_manager() exits without cleaning up on SIGKILL?
>>>>>>>>>
>>>>>>>>> No, it should kill up in all cases. I'll try your stop + kill, I just
>>>>>>>>> tested both of them separately and didn't observe anything. I also ran
>>>>>>>>> your io_uring-cp example (and found a bug in the example, fixed and
>>>>>>>>> pushed), fwiw.
>>>>>>>>
>>>>>>>> I can reproduce this one! I'll take a closer look.
>>>>>>>
>>>>>>> OK, that one is actually pretty straight forward - we rely on cleaning
>>>>>>> up on exit, but for fatal cases, get_signal() will call do_exit() for us
>>>>>>> and never return. So we might need a special case in there to deal with
>>>>>>> that, or some other way of ensuring that fatal signal gets processed
>>>>>>> correctly for IO threads.
>>>>>>
>>>>>> And if (fatal_signal_pending(current)) doesn't prevent get_signal() from being called?
>>>>>
>>>>> Ah, we're still in the first get_signal() from SIGSTOP, correct?
>>>>
>>>> Yes exactly, we're waiting in there being stopped. So we either need to
>>>> check to something ala:
>>>>
>>>> relock:
>>>> +	if (current->flags & PF_IO_WORKER && fatal_signal_pending(current))
>>>> +		return false;
>>>>
>>>> to catch it upfront and from the relock case, or add:
>>>>
>>>> 	fatal:
>>>> +		if (current->flags & PF_IO_WORKER)
>>>> +			return false;
>>>>
>>>> to catch it in the fatal section.
>>>
>>> Can you try this? Not crazy about adding a special case, but I don't
>>> think there's any way around this one. And should be pretty cheap, as
>>> we're already pulling in ->flags right above anyway.
>>>
>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>> index 5ad8566534e7..5b75fbe3d2d6 100644
>>> --- a/kernel/signal.c
>>> +++ b/kernel/signal.c
>>> @@ -2752,6 +2752,15 @@ bool get_signal(struct ksignal *ksig)
>>>  		 */
>>>  		current->flags |= PF_SIGNALED;
>>>  
>>> +		/*
>>> +		 * PF_IO_WORKER threads will catch and exit on fatal signals
>>> +		 * themselves. They have cleanup that must be performed, so
>>> +		 * we cannot call do_exit() on their behalf. coredumps also
>>> +		 * do not apply to them.
>>> +		 */
>>> +		if (current->flags & PF_IO_WORKER)
>>> +			return false;
>>> +
>>>  		if (sig_kernel_coredump(signr)) {
>>>  			if (print_fatal_signals)
>>>  				print_fatal_signal(ksig->info.si_signo);
>>>
>>
>> I guess not before next week, but if it resolves the problem for you,
>> I guess it would be good to get this into rc5.
> 
> It does, I pushed out a new branch. I'll send out a v2 series in a bit.

Great, thanks!

Any chance to get the "cmdline" hiding included?

metze




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux