Re: [bug report] watchdog: BUG: soft lockup - CPU#19 stuck for 26s! [poll-cancel-all:33473]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/7/24 7:32 PM, Guangwu Zhang wrote:
> Hi,
> 
> Found the kernel error with linux-block/for-next  branch.
> kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
> last commit 69c95a16fe484e6de5f3cc616953cc31d67e7000
> Merge: d78544b06104 052618c71c66
> 
> reproducer : git://git.kernel.dk/liburing  poll-cancel-all.t
> 
> 
>  [ 1722.001827] Running test openat2.t:
> [ 1722.029366] Running test open-close.t:
> [ 1722.059368] Running test open-direct-link.t:
> [ 1722.090178] Running test open-direct-pick.t:
> [ 1722.122762] Running test personality.t:
> [ 1722.152628] Running test pipe-bug.t:
> [ 1722.510823] Running test pipe-eof.t:
> [ 1722.540346] Running test pipe-reuse.t:
> [ 1722.569774] Running test poll.t:
> [ 1722.643305] Running test poll-cancel.t:
> [ 1722.673582] Running test poll-cancel-all.t:
> [ 1732.010369] restraintd[1656]: *** Current Time: Wed Feb 07 10:20:49
> 2024  Localwatchdog at: Wed Feb 07 14:18:48 2024
> [ 1747.506670] watchdog: BUG: soft lockup - CPU#19 stuck for 26s!
> [poll-cancel-all:33473]
> [ 1747.514587] Modules linked in: tls rpcsec_gss_krb5 auth_rpcgss
> nfsv4 dns_resolver nfs lockd grace netfs rfkill sunrpc vfat fat
> dm_multipath intel_rapl_msr intel_rapl_common intel_uncore_frequency
> intel_uncore_frequency_common isst_if_common skx_edac nfit libnvdimm
> x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel kvm
> irqbypass acpi_ipmi rapl mgag200 ipmi_si iTCO_wdt i2c_algo_bit
> intel_cstate drm_shmem_helper iTCO_vendor_support dcdbas dell_smbios
> mei_me i2c_i801 ipmi_devintf drm_kms_helper intel_uncore mei
> dell_wmi_descriptor intel_pch_thermal i2c_smbus wmi_bmof lpc_ich
> pcspkr ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod
> sg ahci nvme crct10dif_pclmul libahci crc32_pclmul crc32c_intel
> nvme_core libata megaraid_sas tg3 ghash_clmulni_intel t10_pi wmi
> dm_mirror dm_region_hash dm_log dm_mod
> [ 1747.587107] CPU: 19 PID: 33473 Comm: poll-cancel-all Kdump: loaded
> Not tainted 6.8.0-rc3+ #1
> [ 1747.595540] Hardware name: Dell Inc. PowerEdge R640/06DKY5, BIOS
> 2.15.1 06/15/2022
> [ 1747.603104] RIP: 0010:__io_poll_cancel.isra.0+0xa3/0x170
> [ 1747.608419] Code: ef 60 49 89 ff 74 73 48 89 ee e8 a8 0e 00 00 84
> c0 74 e2 4c 89 24 24 f0 41 81 8f a0 00 00 00 00 00 00 80 41 8b 87 a0
> 00 00 00 <83> f8 7f 0f 8f 92 00 00 00 b8 01 00 00 00 f0 41 0f c1 87 a0
> 00 00
> [ 1747.627161] RSP: 0018:ffffa16d64997cc8 EFLAGS: 00000282
> [ 1747.632387] RAX: 0000000093f4d23b RBX: ffff940d4839bbc0 RCX: 0000000000000000
> [ 1747.639521] RDX: 0000000000000001 RSI: ffffa16d64997da8 RDI: ffff940ee22c2e00
> [ 1747.646653] RBP: ffffa16d64997da8 R08: 0000000000000001 R09: ffff940d4c12b000
> [ 1747.653785] R10: 0000000000000008 R11: 0000000000002cc0 R12: ffff940ec15ed900
> [ 1747.660919] R13: 0000000000000000 R14: 0000000000000002 R15: ffff940ee22c2e00
> [ 1747.668052] FS:  00007f0758c28740(0000) GS:ffff9410bfa40000(0000)
> knlGS:0000000000000000
> [ 1747.676135] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1747.681882] CR2: 00000000004052b0 CR3: 00000002a2542004 CR4: 00000000007706f0
> [ 1747.689013] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1747.696147] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1747.703278] PKRU: 55555554
> [ 1747.705983] Call Trace:
> [ 1747.708436]  <IRQ>
> [ 1747.710456]  ? watchdog_timer_fn+0x1ec/0x270
> [ 1747.714726]  ? __pfx_watchdog_timer_fn+0x10/0x10
> [ 1747.719347]  ? __hrtimer_run_queues+0x10f/0x2b0
> [ 1747.723880]  ? hrtimer_interrupt+0xfc/0x230
> [ 1747.728066]  ? __sysvec_apic_timer_interrupt+0x4b/0x140
> [ 1747.733289]  ? sysvec_apic_timer_interrupt+0x6d/0x90
> [ 1747.738257]  </IRQ>
> [ 1747.740363]  <TASK>
> [ 1747.742468]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [ 1747.747783]  ? __io_poll_cancel.isra.0+0xa3/0x170
> [ 1747.752487]  ? __io_poll_cancel.isra.0+0x88/0x170
> [ 1747.757194]  io_poll_cancel+0x24/0x80
> [ 1747.760859]  io_try_cancel+0x86/0x100
> [ 1747.764525]  __io_async_cancel+0x41/0xf0
> [ 1747.768451]  ? fget+0x7a/0xc0
> [ 1747.771423]  io_async_cancel+0xa5/0x110
> [ 1747.775261]  io_issue_sqe+0x5b/0x3f0
> [ 1747.778842]  io_submit_sqes+0x126/0x3d0
> [ 1747.782680]  __do_sys_io_uring_enter+0x2c8/0x480
> [ 1747.787301]  do_syscall_64+0x7f/0x160
> [ 1747.790965]  ? do_user_addr_fault+0x31f/0x690
> [ 1747.795325]  ? exc_page_fault+0x65/0x150
> [ 1747.799249]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [ 1747.804302] RIP: 0033:0x40402e
> [ 1747.807361] Code: 41 89 ca 8b ba cc 00 00 00 41 b9 08 00 00 00 b8
> aa 01 00 00 41 83 ca 10 f6 82 d0 00 00 00 01 44 0f 44 d1 45 31 c0 31
> d2 0f 05 <c3> 90 89 30 eb 99 0f 1f 40 00 8b 3f 45 31 c0 83 e7 06 41 0f
> 95 c0
> [ 1747.826106] RSP: 002b:00007fff5139edd8 EFLAGS: 00000246 ORIG_RAX:
> 00000000000001aa
> [ 1747.833671] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000040402e
> [ 1747.840805] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000005
> [ 1747.847937] RBP: 00007fff5139ee40 R08: 0000000000000000 R09: 0000000000000008

Known issue in that sha, it's fixed in the current tree.

-- 
Jens Axboe






[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux