Thanks for the patch and sorry for the slow reply.

v1 vs v2: my take is that v1 is easier to understand, and if you pass an
fd to be used as the kernel end for 9p you shouldn't also be messing with
it, so it's fair game to make it O_NONBLOCK -- we're reading from and
writing to these things, and the fds shouldn't be used by the caller after
the mount syscall.

Is there any reason you spent time working on v2, or is that just
theoretical, for not messing with the userland fd?

Unless there's a reason, I'll try to find time to test v1 and queue it
for 6.1.

Tetsuo Handa wrote on Fri, Sep 02, 2022 at 07:25:30AM +0900:
> On 2022/09/02 0:23, Christian Schoenebeck wrote:
> > So the intention in this alternative approach is to allow user space apps
> > still being able to perform blocking I/O, while at the same time making the
> > kernel thread interruptible to fix this hung task issue, correct?
>
> Making the kernel thread "non-blocking" (rather than "interruptible") in order
> not to be blocked on I/O on pipes.
>
> Since kernel threads by default do not receive signals, being "interruptible"
> or "killable" does not help (except for silencing khungtaskd warning). Being
> "non-blocking" like I/O on sockets helps.

I'm still not 100% sure I understand the root of the deadlock, but I can
agree the worker thread shouldn't block.

We seem to check for EAGAIN where kernel_read/write end up being called,
and there's a poll for scheduling, so it -should- work, but I assume this
hasn't been tested much and might take a bit of time to test.

> The thread which currently clearing the TIF_SIGPENDING flag is a user process
> (which are calling "killable" functions from syscall context but effectively
> "uninterruptible" due to clearing the TIF_SIGPENDING flag and retrying).
> By the way, clearing the TIF_SIGPENDING flag before retrying "killable" functions
> (like p9_client_rpc() does) is very bad and needs to be avoided...

Yes, I really wish we could make this go away.
I started work to make the cancel (flush) asynchronous, but it introduced
a regression I never had (and still don't have) time to figure out...
If you have motivation to take over, the patches I sent are here:
https://lore.kernel.org/all/20181217110111.GB17466@nautica/T/
(unfortunately some refactoring happened and they no longer apply, but
the logic should be mostly sane)

Thanks,
-- 
Dominique
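
For anyone following the thread, here is a rough userspace sketch of the
pattern discussed above: the fd is made O_NONBLOCK, read() is never allowed
to sleep, and EAGAIN is handled by waiting in poll() before retrying. This
is only an illustration with pipes in userland and is not the 9p transport
code; set_nonblocking() and read_when_ready() are made-up names, and the
timeout handling is an assumption added for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch an fd to non-blocking mode. Returns 0 or -1. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: read from a non-blocking fd.
 * read() itself never sleeps; on EAGAIN we wait in poll() for at most
 * timeout_ms and then retry.
 * Returns bytes read, 0 on EOF, or -1 with errno set (ETIMEDOUT if the
 * fd did not become readable in time).
 */
static ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	assert(buf != NULL);
	for (;;) {
		ssize_t n = read(fd, buf, len);

		if (n >= 0)
			return n;
		if (errno == EINTR)
			continue;
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			return -1;

		/* Nothing to read yet: wait for readiness instead of blocking in read(). */
		int ret = poll(&pfd, 1, timeout_ms);

		if (ret == 0) {
			errno = ETIMEDOUT;
			return -1;
		}
		if (ret < 0 && errno != EINTR)
			return -1;
	}
}
```

With a pipe(2) pair, calling set_nonblocking() on the read end and then
read_when_ready() on an empty pipe returns -1 with ETIMEDOUT instead of
hanging, which is the property the worker thread needs.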