Re: [PATCH 0/6] Allow signals for IO threads

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/26/21 9:04 AM, Stefan Metzmacher wrote:
> 
> Am 26.03.21 um 15:53 schrieb Jens Axboe:
>> On 3/26/21 8:45 AM, Stefan Metzmacher wrote:
>>> Am 26.03.21 um 15:43 schrieb Stefan Metzmacher:
>>>> Am 26.03.21 um 15:38 schrieb Jens Axboe:
>>>>> On 3/26/21 7:59 AM, Jens Axboe wrote:
>>>>>> On 3/26/21 7:54 AM, Jens Axboe wrote:
>>>>>>>> The KILL after STOP deadlock still exists.
>>>>>>>
>>>>>>> In which tree? Sounds like you're still on the old one with that
>>>>>>> incremental you sent, which wasn't complete.
>>>>>>>
>>>>>>>> Does io_wq_manager() exits without cleaning up on SIGKILL?
>>>>>>>
>>>>>>> No, it should kill up in all cases. I'll try your stop + kill, I just
>>>>>>> tested both of them separately and didn't observe anything. I also ran
>>>>>>> your io_uring-cp example (and found a bug in the example, fixed and
>>>>>>> pushed), fwiw.
>>>>>>
>>>>>> I can reproduce this one! I'll take a closer look.
>>>>>
>>>>> OK, that one is actually pretty straight forward - we rely on cleaning
>>>>> up on exit, but for fatal cases, get_signal() will call do_exit() for us
>>>>> and never return. So we might need a special case in there to deal with
>>>>> that, or some other way of ensuring that fatal signal gets processed
>>>>> correctly for IO threads.
>>>>
>>>> And if (fatal_signal_pending(current)) doesn't prevent get_signal() from being called?
>>>
>>> Ah, we're still in the first get_signal() from SIGSTOP, correct?
>>
>> Yes exactly, we're waiting in there being stopped. So we either need to
>> check to something ala:
>>
>> relock:
>> +	if (current->flags & PF_IO_WORKER && fatal_signal_pending(current))
>> +		return false;
>>
>> to catch it upfront and from the relock case, or add:
>>
>> 	fatal:
>> +		if (current->flags & PF_IO_WORKER)
>> +			return false;
>>
>> to catch it in the fatal section.
>>
> 
> Or something like io_uring_files_cancel()
> 
> Maybe change current->pf_io_worker with a generic current->io_thread
> structure which, has exit hooks, as well as
> io_wq_worker_sleeping() and io_wq_worker_running().
> 
> Maybe create_io_thread would take such an structure
> as argument instead of a single function pointer.
> 
> struct io_thread_description {
> 	const char *name;
> 	int (*thread_fn)(struct io_thread_description *);
> 	void (*sleeping_fn)((struct io_thread_description *);
> 	void (*running_fn)((struct io_thread_description *);
> 	void (*exit_fn)((struct io_thread_description *);
> };
> 
> And then
> struct io_wq_manager {
> 	struct io_thread_description description;
> 	... manager specific stuff...
> };

I did consider something like that, but seems a bit over-engineered
just for catching this case. And any kind of logic for PF_EXITING
ends up being a bit tricky for cancelations.

We can look into doing that for 5.13 potentially.

-- 
Jens Axboe




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux