On 3/25/21 2:40 PM, Jens Axboe wrote: > On 3/25/21 2:12 PM, Linus Torvalds wrote: >> On Thu, Mar 25, 2021 at 12:42 PM Linus Torvalds >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: >>> >>> On Thu, Mar 25, 2021 at 12:38 PM Linus Torvalds >>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: >>>> >>>> I don't know what the gdb logic is, but maybe there's some other >>>> option that makes gdb not react to them? >>> >>> .. maybe we could have a different name for them under the task/ >>> subdirectory, for example (not just the pid)? Although that probably >>> messes up 'ps' too.. >> >> Actually, maybe the right model is to simply make all the io threads >> take signals, and get rid of all the special cases. >> >> Sure, the signals will never be delivered to user space, but if we >> >> - just made the thread loop do "get_signal()" when there are pending signals >> >> - allowed ptrace_attach on them >> >> they'd look pretty much like regular threads that just never do the >> user-space part of signal handling. >> >> The whole "signals are very special for IO threads" thing has caused >> so many problems, that maybe the solution is simply to _not_ make them >> special? > > Just to wrap up the previous one, yes it broke all sorts of things to > make the 'tid' directory different. They just end up being hidden anyway > through that, for both ps and top. > > Yes, I do think that maybe it's better to just embrace maybe just > embrace the signals, and have everything just work by default. It's > better than continually trying to make the threads special. I'll see > if there are some demons lurking down that path. In the spirit of "let's just try it", I ran with the below patch. With that, I can gdb attach just fine to a test case that creates an io_uring and a regular thread with pthread_create(). The regular thread uses the ring, so you end up with two iou-mgr threads. Attach: [root@archlinux ~]# gdb -p 360 [snip gdb noise] Attaching to process 360 [New LWP 361] [New LWP 362] [New LWP 363] warning: Selected architecture i386:x86-64 is not compatible with reported target architecture i386 warning: Architecture rejected target-supplied description Error while reading shared library symbols for /usr/lib/libpthread.so.0: Cannot find user-level thread for LWP 363: generic error 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6 (gdb) info threads Id Target Id Frame * 1 LWP 360 "io_uring" 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6 2 LWP 361 "iou-mgr-360" 0x0000000000000000 in ?? () 3 LWP 362 "io_uring" 0x00007f7aa52a0a9d in syscall () from /usr/lib/libc.so.6 4 LWP 363 "iou-mgr-362" 0x0000000000000000 in ?? () (gdb) thread 2 [Switching to thread 2 (LWP 361)] #0 0x0000000000000000 in ?? () (gdb) bt #0 0x0000000000000000 in ?? () Backtrace stopped: Cannot access memory at address 0x0 (gdb) cont Continuing. ^C Thread 1 "io_uring" received signal SIGINT, Interrupt. [Switching to LWP 360] 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6 (gdb) q A debugging session is active. Inferior 1 [process 360] will be detached. Quit anyway? (y or n) y Detaching from program: /root/git/fio/t/io_uring, process 360 [Inferior 1 (process 360) detached] The iou-mgr-x threads are stopped just fine, gdb obviously can't get any real info out of them. But it works... Regular test cases work fine too, just a sanity check. Didn't expect them not to. Only thing that I dislike a bit, but I guess that's just a Linuxism, is that if can now kill an io_uring owning task by sending a signal to one of its IO thread workers. diff --git a/fs/io-wq.c b/fs/io-wq.c index b7c1fa932cb3..2dbdc552f3ba 100644 --- a/fs/io-wq.c +++ b/fs/io-wq.c @@ -505,8 +505,14 @@ static int io_wqe_worker(void *data) ret = schedule_timeout(WORKER_IDLE_TIMEOUT); if (try_to_freeze() || ret) continue; - if (fatal_signal_pending(current)) - break; + if (signal_pending(current)) { + struct ksignal ksig; + + if (fatal_signal_pending(current)) + break; + get_signal(&ksig); + continue; + } /* timed out, exit unless we're the fixed worker */ if (test_bit(IO_WQ_BIT_EXIT, &wq->state) || !(worker->flags & IO_WORKER_F_FIXED)) @@ -715,8 +721,15 @@ static int io_wq_manager(void *data) io_wq_check_workers(wq); schedule_timeout(HZ); try_to_freeze(); - if (fatal_signal_pending(current)) - set_bit(IO_WQ_BIT_EXIT, &wq->state); + if (signal_pending(current)) { + struct ksignal ksig; + + if (fatal_signal_pending(current)) + set_bit(IO_WQ_BIT_EXIT, &wq->state); + else + get_signal(&ksig); + continue; + } } while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)); io_wq_check_workers(wq); diff --git a/fs/io_uring.c b/fs/io_uring.c index 54ea561db4a5..3a9d021db328 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6765,8 +6765,14 @@ static int io_sq_thread(void *data) timeout = jiffies + sqd->sq_thread_idle; continue; } - if (fatal_signal_pending(current)) - break; + if (signal_pending(current)) { + struct ksignal ksig; + + if (fatal_signal_pending(current)) + break; + get_signal(&ksig); + continue; + } sqt_spin = false; cap_entries = !list_is_singular(&sqd->ctx_list); list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) { diff --git a/kernel/fork.c b/kernel/fork.c index d3171e8e88e5..3b45d0f04044 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2436,6 +2436,7 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node) if (!IS_ERR(tsk)) { sigfillset(&tsk->blocked); sigdelsetmask(&tsk->blocked, sigmask(SIGKILL)); + sigdelsetmask(&tsk->blocked, sigmask(SIGSTOP)); } return tsk; } diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 821cf1723814..61db50f7ca86 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -375,7 +375,7 @@ static int ptrace_attach(struct task_struct *task, long request, audit_ptrace(task); retval = -EPERM; - if (unlikely(task->flags & (PF_KTHREAD | PF_IO_WORKER))) + if (unlikely(task->flags & PF_KTHREAD)) goto out; if (same_thread_group(task, current)) goto out; diff --git a/kernel/signal.c b/kernel/signal.c index f2a1b898da29..a5700557eb50 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -91,7 +91,7 @@ static bool sig_task_ignored(struct task_struct *t, int sig, bool force) return true; /* Only allow kernel generated signals to this kthread */ - if (unlikely((t->flags & (PF_KTHREAD | PF_IO_WORKER)) && + if (unlikely((t->flags & PF_KTHREAD) && (handler == SIG_KTHREAD_KERNEL) && !force)) return true; @@ -288,8 +288,7 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask) JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING)); BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK)); - if (unlikely(fatal_signal_pending(task) || - (task->flags & (PF_EXITING | PF_IO_WORKER)))) + if (unlikely(fatal_signal_pending(task) || task->flags & PF_EXITING)) return false; if (mask & JOBCTL_STOP_SIGMASK) @@ -834,9 +833,6 @@ static int check_kill_permission(int sig, struct kernel_siginfo *info, if (!valid_signal(sig)) return -EINVAL; - /* PF_IO_WORKER threads don't take any signals */ - if (t->flags & PF_IO_WORKER) - return -ESRCH; if (!si_fromuser(info)) return 0; -- Jens Axboe