Re: oops in 5.4 on rdma

> On Dec 10, 2019, at 12:34 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> 
> Hi Chuck,
> 
> Is this known? Running your cel/testing from commit
> 37e235c0128566e9d97741ad1e546b44f324f108

The WARNING does not look familiar, but cel-testing has moved on.
Can you fetch it again?


> I started generic/013 and the test hung for a long time. I got this, but
> then the test ran successfully.
> 
> [  153.452029] ------------[ cut here ]------------
> [  153.507281] WARNING: CPU: 14 PID: 975 at
> drivers/infiniband/core/cq.c:310 ib_free_cq_user+0xea/0x100 [ib_core]
> [  153.626988] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver
> nfs fscache rdma_rxe ip6_udp_tunnel udp_tunnel nfsd auth_rpcgss
> nfs_acl lockd grace xt_CHECKSUM xt_MASQUERADE tun bridge stp llc
> ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6
> xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute ip6table_nat
> ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle
> iptable_security iptable_raw ebtable_filter ebtables ip6table_filter
> ip6_tables iptable_filter ib_isert iscsi_target_mod ib_srpt
> target_core_mod ib_srp scsi_transport_srp rpcrdma sunrpc
> intel_rapl_msr intel_rapl_common rdma_ucm x86_pkg_temp_thermal ib_iser
> intel_powerclamp coretemp rdma_cm kvm_intel iw_cm ib_umad ib_ipoib
> libiscsi kvm scsi_transport_iscsi ib_cm irqbypass crct10dif_pclmul
> mlx5_ib crc32_pclmul iTCO_wdt ipmi_ssif ghash_clmulni_intel
> iTCO_vendor_support aesni_intel ib_uverbs crypto_simd ipmi_si cryptd
> ipmi_devintf pcspkr ib_core
> [  153.627026]  glue_helper i2c_i801 sg lpc_ich ipmi_msghandler wmi
> acpi_power_meter ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper
> syscopyarea sysfillrect sysimgblt fb_sys_fops drm_vram_helper ttm isci
> mlx5_core libsas igb drm ahci qla2xxx libahci scsi_transport_sas
> libata dca crc32c_intel i2c_algo_bit i2c_core scsi_transport_fc
> pci_hyperv_intf dm_mirror dm_region_hash dm_log dm_mod
> [  155.086407] CPU: 14 PID: 975 Comm: kworker/u52:0 Not tainted 5.4.0+ #1
> [  155.164520] Hardware name: FUJITSU PRIMERGY RX200 S7/D3032-A1, BIOS
> V4.6.5.3 R2.29.0 for D3032-A1x 06/18/2018
> [  155.283237] Workqueue: xprtiod xprt_autoclose [sunrpc]
> [  155.344725] RIP: 0010:ib_free_cq_user+0xea/0x100 [ib_core]
> [  155.410365] Code: d7 48 8b 03 48 85 c0 75 e8 e9 6a ff ff ff 48 8d
> 7f 40 e8 89 9a 52 d6 e9 57 ff ff ff 48 8d 7f 40 e8 0b de 86 d6 e9 49
> ff ff ff <0f> 0b 5b 5d 41 5c c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00
> 00 00
> [  155.635114] RSP: 0018:ffff98e4c6aebda0 EFLAGS: 00010202
> [  155.697624] RAX: 0000000000000001 RBX: ffff8b85efdb8000 RCX: 0000000000000000
> [  155.783015] RDX: ffff8b861516ae80 RSI: 0000000000000000 RDI: ffff8b8df0087000
> [  155.868404] RBP: ffff8b8df0087000 R08: 0000000000000001 R09: 0000000000000000
> [  155.953795] R10: ffff8b8e1724b000 R11: ffffffffffffffa6 R12: ffff8b85efdb8000
> [  156.039186] R13: 0000000000000000 R14: ffff8b86071cb000 R15: ffff8b85efdb8448
> [  156.124577] FS:  0000000000000000(0000) GS:ffff8b861fa00000(0000)
> knlGS:0000000000000000
> [  156.221405] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  156.290157] CR2: 00007fde99d85000 CR3: 000000025f20a003 CR4: 00000000000606e0
> [  156.375548] Call Trace:
> [  156.404805]  rpcrdma_ep_destroy+0x43/0x70 [rpcrdma]
> [  156.463171]  rpcrdma_ep_disconnect+0xf2/0x1c0 [rpcrdma]
> [  156.525683]  ? __switch_to_asm+0x34/0x70
> [  156.572589]  ? __switch_to_asm+0x40/0x70
> [  156.619500]  ? __switch_to_asm+0x34/0x70
> [  156.666409]  ? __switch_to_asm+0x40/0x70
> [  156.713321]  ? __switch_to_asm+0x34/0x70
> [  156.760238]  xprt_rdma_close+0x49/0xc0 [rpcrdma]
> [  156.815481]  xprt_autoclose+0x50/0xb0 [sunrpc]
> [  156.868635]  process_one_work+0x171/0x380
> [  156.916584]  worker_thread+0x49/0x3f0
> [  156.960375]  kthread+0xf8/0x130
> [  156.997926]  ? max_active_store+0x80/0x80
> [  157.045875]  ? kthread_bind+0x10/0x10
> [  157.089665]  ret_from_fork+0x35/0x40
> [  157.132416] ---[ end trace dcd41693526c20ae ]---
> 
> Var log also had this:
> Dec 10 12:37:03 localhost kolga: run xfstest generic/013
> Dec 10 12:37:03 localhost journal: run fstests generic/013 at
> 2019-12-10 12:37:03
> Dec 10 12:39:54 localhost kernel: INFO: task kworker/6:2:295 blocked
> for more than 122 seconds.
> Dec 10 12:39:54 localhost kernel:      Tainted: G        W         5.4.0+ #1
> Dec 10 12:39:54 localhost kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Dec 10 12:39:55 localhost kernel: kworker/6:2     D    0   295      2 0x80004000
> Dec 10 12:39:55 localhost kernel: Workqueue: events xprt_destroy_cb [sunrpc]
> Dec 10 12:39:55 localhost kernel: Call Trace:
> Dec 10 12:39:55 localhost kernel: ? __schedule+0x2d1/0x6c0
> Dec 10 12:39:55 localhost kernel: schedule+0x39/0xa0
> Dec 10 12:39:55 localhost kernel: schedule_timeout+0x1c8/0x290
> Dec 10 12:39:55 localhost kernel: ? tracing_is_on+0x11/0x30
> Dec 10 12:39:55 localhost kernel: ? trace_save_cmdline+0x68/0xd0
> Dec 10 12:39:55 localhost kernel: wait_for_completion+0x123/0x190
> Dec 10 12:39:55 localhost kernel: ? wake_up_q+0x70/0x70
> Dec 10 12:39:55 localhost kernel: __flush_work.isra.35+0x11e/0x1a0
> Dec 10 12:39:55 localhost kernel: ? get_work_pool+0x40/0x40
> Dec 10 12:39:55 localhost kernel: __cancel_work_timer+0x103/0x190
> Dec 10 12:39:55 localhost kernel: xprt_rdma_destroy+0x22/0xb0 [rpcrdma]
> Dec 10 12:39:55 localhost kernel: process_one_work+0x171/0x380
> Dec 10 12:39:55 localhost kernel: worker_thread+0x49/0x3f0
> Dec 10 12:39:55 localhost kernel: kthread+0xf8/0x130
> Dec 10 12:39:55 localhost kernel: ? max_active_store+0x80/0x80
> Dec 10 12:39:55 localhost kernel: ? kthread_bind+0x10/0x10
> Dec 10 12:39:55 localhost kernel: ret_from_fork+0x35/0x40

--
Chuck Lever






