On 11/8/21 10:44 PM, Ammar Faizi wrote:
> On Sat, Nov 6, 2021 at 6:49 PM Ammar Faizi wrote:
>>
>> This is the reproducer for the kworker hang bug.
>>
>> Reproduction Steps:
>>   1) A user task calls io_uring_queue_exit().
>>
>>   2) Suspend the task with SIGSTOP / SIGTRAP before the ring exit is
>>      finished (do it as soon as step (1) is done).
>>
>>   3) Wait for `/proc/sys/kernel/hung_task_timeout_secs` seconds
>>      elapsed.
>>
>>   4) We get a complaint from the khungtaskd because the kworker is
>>      stuck in an uninterruptible state (D).
>>
>> The kworkers waiting on ring exit are not progressing as the task
>> cannot proceed. When the user task is continued (e.g. get SIGCONT
>> after SIGSTOP, or continue after SIGTRAP breakpoint), the kworkers
>> then can finish the ring exit.
>>
>> We need a special handling for this case to avoid khungtaskd
>> complaint. Currently we don't have the fix for this.
> [...]
>> Cc: Pavel Begunkov <asml.silence@xxxxxxxxx>
>> Link: https://github.com/axboe/liburing/issues/448
>> Signed-off-by: Ammar Faizi <ammar.faizi@xxxxxxxxxxxxxxxxxxxxx>
>> ---
>>
>> v6:
>>   - Fix missing call to restore_hung_entries() when fork() fails.
>>
>>  .gitignore          |   1 +
>>  test/Makefile       |   1 +
>>  test/kworker-hang.c | 323 ++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 325 insertions(+)
>>  create mode 100644 test/kworker-hang.c
>
> It's ready for review.

This one is still triggering in the current tree, I'd prefer waiting
with queueing it up until it's fixed. I can park it in another branch
until that happens.

-- 
Jens Axboe