Re: [PATCH 09/18] io-wq: fork worker threads from original task

Stefan Metzmacher <metze@xxxxxxxxx> · Thu, 4 Mar 2021 18:09:28 +0100

Am 04.03.21 um 17:42 schrieb Jens Axboe:
> On 3/4/21 9:13 AM, Stefan Metzmacher wrote:
>>
>> Am 04.03.21 um 14:19 schrieb Stefan Metzmacher:
>>> Hi Jens,
>>>
>>>>> Can you please explain why CLONE_SIGHAND is used here?
>>>>
>>>> We can't have CLONE_THREAD without CLONE_SIGHAND... The io-wq workers
>>>> don't really care about signals, we don't use them internally.
>>>
>>> I'm 100% sure, but I heard rumors that in some situations signals get
>>> randomly delivered to any thread of a userspace process.
>>
>> Ok, from task_struct:
>>
>>         /* Signal handlers: */
>>         struct signal_struct            *signal;
>>         struct sighand_struct __rcu             *sighand;
>>         sigset_t                        blocked;
>>         sigset_t                        real_blocked;
>>         /* Restored if set_restore_sigmask() was used: */
>>         sigset_t                        saved_sigmask;
>>         struct sigpending               pending;
>>
>> The signal handlers are shared, but 'blocked' is per thread/task.
>>
>>> My fear was that the related logic may select a kernel thread if they
>>> share the same signal handlers.
>>
>> I found the related logic in the interaction between
>> complete_signal() and wants_signal().
>>
>> static inline bool wants_signal(int sig, struct task_struct *p)
>> {
>>         if (sigismember(&p->blocked, sig))
>>                 return false;
>>
>> ...
>>
>> Would it make sense to set up task->blocked to block all signals?
>>
>> Something like this:
>>
>> --- a/fs/io-wq.c
>> +++ b/fs/io-wq.c
>> @@ -611,15 +611,15 @@ pid_t io_wq_fork_thread(int (*fn)(void *), void *arg)
>>  {
>>         unsigned long flags = CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|
>>                                 CLONE_IO|SIGCHLD;
>> -       struct kernel_clone_args args = {
>> -               .flags          = ((lower_32_bits(flags) | CLONE_VM |
>> -                                   CLONE_UNTRACED) & ~CSIGNAL),
>> -               .exit_signal    = (lower_32_bits(flags) & CSIGNAL),
>> -               .stack          = (unsigned long)fn,
>> -               .stack_size     = (unsigned long)arg,
>> -       };
>> +       sigset_t mask, oldmask;
>> +       pid_t pid;
>>
>> -       return kernel_clone(&args);
>> +       sigfillset(&mask);
>> +       sigprocmask(SIG_BLOCK, &mask, &oldmask);
>> +       pid = kernel_thread(fn, arg, flags);
>> +       sigprocmask(SIG_SETMASK, &oldmask, NULL);
>> +
>> +       return ret;
>>  }
>>
>> I think using kernel_thread() would be a good simplification anyway.
> 
> I like this approach, we're really not interested in signals for those
> threads, and this makes it explicit. Ditto on just using the kernel_thread()
> helper, looks fine too. I'll run this through the testing. Do you want to
> send this as a "real" patch, or should I just attribute you in the commit
> message?

You can do the patch, it was mostly an example.
I'm not sure if sigprocmask() is the correct function here.

Or if we better use something like this:

        set_restore_sigmask();
        current->saved_sigmask = current->blocked;
        set_current_blocked(&kmask);
        pid = kernel_thread(fn, arg, flags);
        restore_saved_sigmask();

I think current->flags |= PF_IO_WORKER;
should also move into io_wq_fork_thread()
and maybe passed differently to kernel_clone() that
abusing current->flags (where current is not an IO_WORKER),
so in general I think it would be better to handle all this within kernel_clone()
natively, rather than temporary modifying current->flags or current->blocked.

What there be problems with handling everything in copy_process() and related helpers
and avoid the CLONE_SIGHAND behavior for PF_IO_WORKER tasks.

kernel_clone_args could get an unsigned int task_flags to fill p->flags in copy_process().
Then kernel_thread() could also get a task_flags argument and in all other places will use
fill that with current->flags.

metze