Re: [PATCH 09/18] io-wq: fork worker threads from original task

Jens Axboe <axboe@xxxxxxxxx> · Thu, 4 Mar 2021 10:32:20 -0700

On 3/4/21 10:09 AM, Stefan Metzmacher wrote:
> 
> Am 04.03.21 um 17:42 schrieb Jens Axboe:
>> On 3/4/21 9:13 AM, Stefan Metzmacher wrote:
>>>
>>> Am 04.03.21 um 14:19 schrieb Stefan Metzmacher:
>>>> Hi Jens,
>>>>
>>>>>> Can you please explain why CLONE_SIGHAND is used here?
>>>>>
>>>>> We can't have CLONE_THREAD without CLONE_SIGHAND... The io-wq workers
>>>>> don't really care about signals, we don't use them internally.
>>>>
>>>> I'm 100% sure, but I heard rumors that in some situations signals get
>>>> randomly delivered to any thread of a userspace process.
>>>
>>> Ok, from task_struct:
>>>
>>>         /* Signal handlers: */
>>>         struct signal_struct            *signal;
>>>         struct sighand_struct __rcu             *sighand;
>>>         sigset_t                        blocked;
>>>         sigset_t                        real_blocked;
>>>         /* Restored if set_restore_sigmask() was used: */
>>>         sigset_t                        saved_sigmask;
>>>         struct sigpending               pending;
>>>
>>> The signal handlers are shared, but 'blocked' is per thread/task.
>>>
>>>> My fear was that the related logic may select a kernel thread if they
>>>> share the same signal handlers.
>>>
>>> I found the related logic in the interaction between
>>> complete_signal() and wants_signal().
>>>
>>> static inline bool wants_signal(int sig, struct task_struct *p)
>>> {
>>>         if (sigismember(&p->blocked, sig))
>>>                 return false;
>>>
>>> ...
>>>
>>> Would it make sense to set up task->blocked to block all signals?
>>>
>>> Something like this:
>>>
>>> --- a/fs/io-wq.c
>>> +++ b/fs/io-wq.c
>>> @@ -611,15 +611,15 @@ pid_t io_wq_fork_thread(int (*fn)(void *), void *arg)
>>>  {
>>>         unsigned long flags = CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|
>>>                                 CLONE_IO|SIGCHLD;
>>> -       struct kernel_clone_args args = {
>>> -               .flags          = ((lower_32_bits(flags) | CLONE_VM |
>>> -                                   CLONE_UNTRACED) & ~CSIGNAL),
>>> -               .exit_signal    = (lower_32_bits(flags) & CSIGNAL),
>>> -               .stack          = (unsigned long)fn,
>>> -               .stack_size     = (unsigned long)arg,
>>> -       };
>>> +       sigset_t mask, oldmask;
>>> +       pid_t pid;
>>>
>>> -       return kernel_clone(&args);
>>> +       sigfillset(&mask);
>>> +       sigprocmask(SIG_BLOCK, &mask, &oldmask);
>>> +       pid = kernel_thread(fn, arg, flags);
>>> +       sigprocmask(SIG_SETMASK, &oldmask, NULL);
>>> +
>>> +       return ret;
>>>  }
>>>
>>> I think using kernel_thread() would be a good simplification anyway.
>>
>> I like this approach, we're really not interested in signals for those
>> threads, and this makes it explicit. Ditto on just using the kernel_thread()
>> helper, looks fine too. I'll run this through the testing. Do you want to
>> send this as a "real" patch, or should I just attribute you in the commit
>> message?
> 
> You can do the patch, it was mostly an example.
> I'm not sure if sigprocmask() is the correct function here.
> 
> Or if we better use something like this:
> 
>         set_restore_sigmask();
>         current->saved_sigmask = current->blocked;
>         set_current_blocked(&kmask);
>         pid = kernel_thread(fn, arg, flags);
>         restore_saved_sigmask();

Might be cleaner, and allows fatal signals.

> I think current->flags |= PF_IO_WORKER;
> should also move into io_wq_fork_thread()
> and maybe passed differently to kernel_clone() that
> abusing current->flags (where current is not an IO_WORKER),
> so in general I think it would be better to handle all this within kernel_clone()
> natively, rather than temporary modifying current->flags or current->blocked.
> 
> What there be problems with handling everything in copy_process() and related helpers
> and avoid the CLONE_SIGHAND behavior for PF_IO_WORKER tasks.
> 
> kernel_clone_args could get an unsigned int task_flags to fill p->flags in copy_process().
> Then kernel_thread() could also get a task_flags argument and in all other places will use
> fill that with current->flags.

I agree there are cleanups possible there, but I'd rather defer those until all
the dust has settled.

-- 
Jens Axboe