RE: rdma resource warning on 4.16-rc1 when unloading qedr after NFS mount

> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Tuesday, March 13, 2018 10:49 PM
> > On Mar 13, 2018, at 4:35 PM, Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx> wrote:
> >

> > Hi Chuck, thanks for the suggestion. However, this led to another
> > issue: a warning in dealloc_pd since pd->use_cnt isn't zero (I guess
> > decremented by rdma-destroy-id)?
> > [  126.768366] ib_srpt srpt_remove_one(qedr0): nothing to do.
> > [  126.768938] rpcrdma: removing device qedr0 for 192.168.10.57:20049
> > [  126.769240] WARNING: CPU: 6 PID: 550 at
> > drivers/infiniband/core/verbs.c:317 ib_dealloc_pd+0x6b/0x80 [ib_core]
> >
> > What is supposed to lead to destroying the rdma-cm ID at any stage? I
> > understand the race you're describing, but I'm not clear on what is supposed
> > to indicate to rdma-cm that this ID should be freed, and when.
> 
> The core is supposed to free that ID when the ULP connect upcall returns
> non-zero.
> 
> How about moving the ib_dealloc_pd() to after rpcrdma_mrs_destroy() ?
> 
>  		rpcrdma_dma_unmap_regbuf(req->rl_recvbuf);
>  	}
>  	rpcrdma_mrs_destroy(buf);
> +	ib_dealloc_pd(ia->ri_pd);
> 
> 
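For context on the ordering suggested above: memory regions hold a reference on the protection domain they were allocated from, so releasing the PD while any MR is still alive leaves a non-zero use count. The following is a user-space mock only, not kernel code; every name in it (mock_pd, mock_mr, teardown_leaks, and so on) is hypothetical and merely mimics the counting behaviour under that assumption.

```c
/* User-space mock of PD use counting. All names here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct mock_pd { int usecnt; };          /* stands in for a protection domain */
struct mock_mr { struct mock_pd *pd; };  /* stands in for a memory region */

/* Allocating an MR takes a reference on its PD. */
static void mock_mr_alloc(struct mock_mr *mr, struct mock_pd *pd)
{
	mr->pd = pd;
	pd->usecnt++;
}

/* Destroying an MR drops that reference. */
static void mock_mr_destroy(struct mock_mr *mr)
{
	mr->pd->usecnt--;
	mr->pd = NULL;
}

/* Returns how many users still reference the PD; 0 means a clean release. */
static int mock_pd_dealloc(struct mock_pd *pd)
{
	if (pd->usecnt != 0)
		fprintf(stderr, "warning: pd still has %d user(s)\n", pd->usecnt);
	return pd->usecnt;
}

/* Tear down one MR and its PD in either order; report leftover users. */
int teardown_leaks(int dealloc_pd_first)
{
	struct mock_pd pd = { 0 };
	struct mock_mr mr;
	int leaked;

	mock_mr_alloc(&mr, &pd);
	if (dealloc_pd_first) {
		leaked = mock_pd_dealloc(&pd);  /* MR still alive: count is 1 */
		mock_mr_destroy(&mr);
	} else {
		mock_mr_destroy(&mr);           /* MR gone first: count is 0 */
		leaked = mock_pd_dealloc(&pd);
	}
	return leaked;
}
```

In this mock, teardown_leaks(1) reports one leftover user and teardown_leaks(0) reports none, which mirrors why the dealloc was moved to after the MRs are destroyed.
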
This solved the resource leak. However, I hit a NULL pointer dereference if I reboot
afterwards (with or without the fix you suggested), using these steps:

mount -o rdma,port=20049 192.168.10.57:/tmp/nfs-server /tmp/nfs-client
rmmod qedr
reboot

The oops is listed below.

This does not happen if I umount before rmmod qedr, and it does not happen if I reboot without first running rmmod qedr.

Thanks,
Michal

[  100.090800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[  100.090872] IP: rpcrdma_ep_destroy+0x1d/0x60 [rpcrdma]
[  100.090902] PGD 0 P4D 0 
[  100.090922] Oops: 0000 [#1] SMP PTI
[  100.090944] Modules linked in: nfsv3 netconsole qede qed crc8 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bonding rpcrdma ib_isert iscsi_target_mod ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt gpio_ich iTCO_vendor_support ipmi_si pcspkr i2c_i801 lpc_ich ipmi_devintf sg ipmi_msghandler ioatdma shpchp
[  100.091335]  i7core_edac acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod sd_mod cdrom ata_generic pata_acpi mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix igb libata crc32c_intel e1000 dca i2c_algo_bit i2c_core [last unloaded: qedr]
[  100.091497] CPU: 2 PID: 62 Comm: kworker/2:1 Not tainted 4.16.0-rc5-kdump+ #21
[  100.091537] Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.00.0059.082320111421 08/23/2011
[  100.091610] Workqueue: events xprt_destroy_cb [sunrpc]
[  100.091647] RIP: 0010:rpcrdma_ep_destroy+0x1d/0x60 [rpcrdma]
[  100.091680] RSP: 0018:ffffb929c1b17e70 EFLAGS: 00010282
[  100.091711] RAX: 0000000000000000 RBX: ffff9ec41816a6c0 RCX: 0000000000000000
[  100.091750] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ec299d21d00
[  100.091789] RBP: ffff9ec41816a528 R08: ffff9ec207800000 R09: ffff9ec299d21d00
[  100.091828] R10: 0000000000000000 R11: 0000000000000040 R12: ffff9ec299ca5e00
[  100.091867] R13: 0000000000000000 R14: ffff9ec206f2f840 R15: ffff9ec41816a3b0
[  100.091906] FS:  0000000000000000(0000) GS:ffff9ec299c80000(0000) knlGS:0000000000000000
[  100.091950] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  100.091983] CR2: 0000000000000010 CR3: 00000001d300a003 CR4: 00000000000206e0
[  100.092022] Call Trace:
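
A note on reading the fault address: CR2 is 0x10, a small offset from NULL, which usually means a member was read through a NULL structure pointer. As a sketch only, on a 64-bit build a pointer that is the third member of a structure sits at offset 0x10. The structure below is a hypothetical stand-in, and this log alone does not establish which pointer in rpcrdma_ep_destroy() was NULL.

```c
/* Hypothetical stand-in structure, not a kernel definition. */
#include <assert.h>
#include <stddef.h>

struct mock_obj {
	void *first;   /* offset 0 */
	void *second;  /* offset sizeof(void *) */
	void *third;   /* offset 2 * sizeof(void *), i.e. 0x10 with 8-byte pointers */
};

/* Offset that a read of obj->third would fault at if obj were NULL. */
size_t mock_third_offset(void)
{
	return offsetof(struct mock_obj, third);
}
```

With 8-byte pointers, mock_third_offset() is two pointer widths, which matches the 0000000000000010 address reported in the oops.
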

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



