On Tue, Nov 9, 2021 at 9:47 PM Jens Axboe wrote:
> On 11/8/21 10:44 PM, Ammar Faizi wrote:
>> On Sat, Nov 6, 2021 at 6:49 PM Ammar Faizi wrote:
>>>
>>> This is the reproducer for the kworker hang bug.
>>>
>>> Reproduction Steps:
>>> 1) A user task calls io_uring_queue_exit().
>>>
>>> 2) Suspend the task with SIGSTOP / SIGTRAP before the ring exit is
>>> finished (do it as soon as step (1) is done).
>>>
>>> 3) Wait for `/proc/sys/kernel/hung_task_timeout_secs` seconds
>>> elapsed.
>>>
>>> 4) We get a complaint from the khungtaskd because the kworker is
>>> stuck in an uninterruptible state (D).
>>>
>>> The kworkers waiting on ring exit are not progressing as the task
>>> cannot proceed. When the user task is continued (e.g. get SIGCONT
>>> after SIGSTOP, or continue after SIGTRAP breakpoint), the kworkers
>>> then can finish the ring exit.
>>>
>>> We need a special handling for this case to avoid khungtaskd
>>> complaint. Currently we don't have the fix for this.
>> [...]
>>> Cc: Pavel Begunkov <asml.silence@xxxxxxxxx>
>>> Link: https://github.com/axboe/liburing/issues/448
>>> Signed-off-by: Ammar Faizi <ammar.faizi@xxxxxxxxxxxxxxxxxxxxx>
>>> ---
>>>
>>> v6:
>>> - Fix missing call to restore_hung_entries() when fork() fails.
>>>
>>>  .gitignore          |   1 +
>>>  test/Makefile       |   1 +
>>>  test/kworker-hang.c | 323 ++++++++++++++++++++++++++++++++++++++++++++
>>>  3 files changed, 325 insertions(+)
>>>  create mode 100644 test/kworker-hang.c
>>
>> It's ready for review.
>
> This one is still triggering in the current tree, I'd prefer waiting with
> queueing it up until it's fixed. I can park it in another branch until
> that happens.
>

Understand, thanks!

-- 
Ammar Faizi