On 3/25/21 3:57 PM, Stefan Metzmacher wrote:
>
> Am 25.03.21 um 22:44 schrieb Jens Axboe:
>> On 3/25/21 2:40 PM, Jens Axboe wrote:
>>> On 3/25/21 2:12 PM, Linus Torvalds wrote:
>>>> On Thu, Mar 25, 2021 at 12:42 PM Linus Torvalds
>>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> On Thu, Mar 25, 2021 at 12:38 PM Linus Torvalds
>>>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> I don't know what the gdb logic is, but maybe there's some other
>>>>>> option that makes gdb not react to them?
>>>>>
>>>>> .. maybe we could have a different name for them under the task/
>>>>> subdirectory, for example (not just the pid)? Although that probably
>>>>> messes up 'ps' too..
>>>>
>>>> Actually, maybe the right model is to simply make all the io threads
>>>> take signals, and get rid of all the special cases.
>>>>
>>>> Sure, the signals will never be delivered to user space, but if we
>>>>
>>>>  - just made the thread loop do "get_signal()" when there are pending signals
>>>>
>>>>  - allowed ptrace_attach on them
>>>>
>>>> they'd look pretty much like regular threads that just never do the
>>>> user-space part of signal handling.
>>>>
>>>> The whole "signals are very special for IO threads" thing has caused
>>>> so many problems, that maybe the solution is simply to _not_ make them
>>>> special?
>>>
>>> Just to wrap up the previous one, yes it broke all sorts of things to
>>> make the 'tid' directory different. They just end up being hidden anyway
>>> through that, for both ps and top.
>>>
>>> Yes, I do think that maybe it's better to just embrace the signals,
>>> and have everything just work by default. It's better than
>>> continually trying to make the threads special. I'll see if there
>>> are some demons lurking down that path.
>>
>> In the spirit of "let's just try it", I ran with the below patch. With
>> that, I can gdb attach just fine to a test case that creates an io_uring
>> and a regular thread with pthread_create().
>> The regular thread uses the ring, so you end up with two iou-mgr
>> threads. Attach:
>>
>> [root@archlinux ~]# gdb -p 360
>> [snip gdb noise]
>> Attaching to process 360
>> [New LWP 361]
>> [New LWP 362]
>> [New LWP 363]
>>
>> warning: Selected architecture i386:x86-64 is not compatible with reported target architecture i386
>>
>> warning: Architecture rejected target-supplied description
>> Error while reading shared library symbols for /usr/lib/libpthread.so.0:
>> Cannot find user-level thread for LWP 363: generic error
>> 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6
>> (gdb) info threads
>>   Id   Target Id               Frame
>> * 1    LWP 360 "io_uring"      0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 ()
>>    from /usr/lib/libc.so.6
>>   2    LWP 361 "iou-mgr-360"   0x0000000000000000 in ?? ()
>>   3    LWP 362 "io_uring"      0x00007f7aa52a0a9d in syscall () from /usr/lib/libc.so.6
>>   4    LWP 363 "iou-mgr-362"   0x0000000000000000 in ?? ()
>> (gdb) thread 2
>> [Switching to thread 2 (LWP 361)]
>> #0  0x0000000000000000 in ?? ()
>> (gdb) bt
>> #0  0x0000000000000000 in ?? ()
>> Backtrace stopped: Cannot access memory at address 0x0
>> (gdb) cont
>> Continuing.
>> ^C
>> Thread 1 "io_uring" received signal SIGINT, Interrupt.
>> [Switching to LWP 360]
>> 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6
>> (gdb) q
>> A debugging session is active.
>>
>>     Inferior 1 [process 360] will be detached.
>>
>> Quit anyway? (y or n) y
>> Detaching from program: /root/git/fio/t/io_uring, process 360
>> [Inferior 1 (process 360) detached]
>>
>> The iou-mgr-x threads are stopped just fine, gdb obviously can't get any
>> real info out of them. But it works... Regular test cases work fine too,
>> just a sanity check. Didn't expect them not to.
>
> I guess that's basically what I tried to describe when I said they
> should look like a userspace process that is blocked in a syscall
> forever.

Right, that's almost what they look like; in practice, that is what they
look like.

>> Only thing that I dislike a bit, but I guess that's just a Linuxism, is
>> that you can now kill an io_uring owning task by sending a signal to one
>> of its IO thread workers.
>
> Can't we just only allow SIGSTOP, which will be only delivered to
> the iothread itself? And also SIGKILL should not be allowed from userspace.

I don't think we can sanely block them, and we need to clean up and tear
down normally regardless of who gets the signal (owner or one of the
threads). So I'm not _too_ hung up on the "io thread gets signal, goes to
owner" behavior, as that is what happens with normal threads too, though
I would prefer if that wasn't the case. But overall I feel better just
embracing the thread model, rather than having something that kinda
sorta looks like a thread, but differs in odd ways.

> And /proc/$iothread/ should be read only and owned by root with
> "cmdline" and "exe" being empty.

I know you brought this one up as part of your series, but I'm not sure I
get why you want it owned by root and read-only. cmdline and exe, yeah,
those could be hidden, but is there really any point?

Maybe I'm missing something here. If so, do clue me in!

-- 
Jens Axboe