Am 04.03.21 um 14:19 schrieb Stefan Metzmacher: > Hi Jens, > >>> Can you please explain why CLONE_SIGHAND is used here? >> >> We can't have CLONE_THREAD without CLONE_SIGHAND... The io-wq workers >> don't really care about signals, we don't use them internally. > > I'm 100% sure, but I heard rumors that in some situations signals get > randomly delivered to any thread of a userspace process. Ok, from task_struct: /* Signal handlers: */ struct signal_struct *signal; struct sighand_struct __rcu *sighand; sigset_t blocked; sigset_t real_blocked; /* Restored if set_restore_sigmask() was used: */ sigset_t saved_sigmask; struct sigpending pending; The signal handlers are shared, but 'blocked' is per thread/task. > My fear was that the related logic may select a kernel thread if they > share the same signal handlers. I found the related logic in the interaction between complete_signal() and wants_signal(). static inline bool wants_signal(int sig, struct task_struct *p) { if (sigismember(&p->blocked, sig)) return false; ... Would it make sense to set up task->blocked to block all signals? Something like this: --- a/fs/io-wq.c +++ b/fs/io-wq.c @@ -611,15 +611,15 @@ pid_t io_wq_fork_thread(int (*fn)(void *), void *arg) { unsigned long flags = CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD| CLONE_IO|SIGCHLD; - struct kernel_clone_args args = { - .flags = ((lower_32_bits(flags) | CLONE_VM | - CLONE_UNTRACED) & ~CSIGNAL), - .exit_signal = (lower_32_bits(flags) & CSIGNAL), - .stack = (unsigned long)fn, - .stack_size = (unsigned long)arg, - }; + sigset_t mask, oldmask; + pid_t pid; - return kernel_clone(&args); + sigfillset(&mask); + sigprocmask(SIG_BLOCK, &mask, &oldmask); + pid = kernel_thread(fn, arg, flags); + sigprocmask(SIG_SETMASK, &oldmask, NULL); + + return ret; } I think using kernel_thread() would be a good simplification anyway. sig_task_ignored() has some PF_IO_WORKER logic. Or is there any PF_IO_WORKER related logic that prevents an io_wq thread to be excluded in complete_signal(). Or PF_IO_WORKER would teach kernel_clone to ignore CLONE_SIGHAND and create a fresh handler and alter the copy_signal() and copy_sighand() checks... metze