On 2/15/23 3:10 PM, John David Anglin wrote:
> On 2023-02-15 4:39 p.m., John David Anglin wrote:
>> On 2023-02-15 4:06 p.m., John David Anglin wrote:
>>> On 2023-02-15 3:37 p.m., Jens Axboe wrote:
>>>>> System crashes running test buf-ring.t.
>>>> Huh, what's the crash?
>>> Not much info. System log indicates an HPMC occurred. Unfortunately, recovery code doesn't work.
>> The following occurred running buf-ring.t under gdb:
>>
>> INFO: task kworker/u64:9:18319 blocked for more than 123 seconds.
>> Not tainted 6.1.12+ #4
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/u64:9 state:D stack:0 pid:18319 ppid:2 flags:0x00000000
>> Workqueue: events_unbound io_ring_exit_work
>> Backtrace:
>> [<0000000040b5c210>] __schedule+0x2e8/0x7f0
>> [<0000000040b5c7d0>] schedule+0xb8/0x1d0
>> [<0000000040b66534>] schedule_timeout+0x11c/0x1b0
>> [<0000000040b5d71c>] __wait_for_common+0x194/0x2e8
>> [<0000000040b5d8ac>] wait_for_completion+0x3c/0x50
>> [<0000000040b46508>] io_ring_exit_work+0x3d8/0x4d0
>> [<0000000040268da8>] process_one_work+0x238/0x520
>> [<00000000402692a4>] worker_thread+0x214/0x778
>> [<0000000040276f94>] kthread+0x24c/0x258
>> [<0000000040202020>] ret_from_kernel_thread+0x20/0x28
>>
>> INFO: task kworker/u64:10:18320 blocked for more than 123 seconds.
>> Not tainted 6.1.12+ #4
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:kworker/u64:10 state:D stack:0 pid:18320 ppid:2 flags:0x00000000
>> Workqueue: events_unbound io_ring_exit_work
>> Backtrace:
>> [<0000000040b5c210>] __schedule+0x2e8/0x7f0
>> [<0000000040b5c7d0>] schedule+0xb8/0x1d0
>> [<0000000040b66534>] schedule_timeout+0x11c/0x1b0
>> [<0000000040b5d71c>] __wait_for_common+0x194/0x2e8
>> [<0000000040b5d8ac>] wait_for_completion+0x3c/0x50
>> [<0000000040b46508>] io_ring_exit_work+0x3d8/0x4d0
>> [<0000000040268da8>] process_one_work+0x238/0x520
>> [<00000000402692a4>] worker_thread+0x214/0x778
>> [<0000000040276f94>] kthread+0x24c/0x258
>> [<0000000040202020>] ret_from_kernel_thread+0x20/0x28
>>
>> gdb was sitting at a break at line 328.
> With Helge's latest patch, we get a software lockup:
>
> TCP: request_sock_TCP: Possible SYN flooding on port 31309. Sending cookies. Check SNMP counters.
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u64:13:14621]
> Modules linked in: binfmt_misc ext4 crc16 jbd2 ext2 mbcache sg ipmi_watchdog ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler fuse nfsd ip_tables x_tables ipv6 autofs4 xfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi ses enclosure scsi_transport_sas crc64_rocksoft crc64 sr_mod uas usb_storage cdrom ohci_pci ehci_pci ohci_hcd pata_cmd64x ehci_hcd sym53c8xx libata scsi_transport_spi usbcore tg3 scsi_mod scsi_common usb_common
> CPU: 0 PID: 14621 Comm: kworker/u64:13 Not tainted 6.1.12+ #5
> Hardware name: 9000/800/rp3440
> Workqueue: events_unbound io_ring_exit_work

This is not related to Helge's patch, 6.1-stable is just still missing:

commit fcc926bb857949dbfa51a7d95f3f5ebc657f198c
Author: Jens Axboe <axboe@xxxxxxxxx>
Date:   Fri Jan 27 09:28:13 2023 -0700

    io_uring: add a conditional reschedule to the IOPOLL cancelation loop

and I'm guessing you're running without preempt.

--
Jens Axboe