> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Tuesday, March 13, 2018 10:49 PM
>
> > On Mar 13, 2018, at 4:35 PM, Kalderon, Michal
> > <Michal.Kalderon@xxxxxxxxxx> wrote:
> >
> > Hi Chuck, thanks for the suggestion. However, this led to another
> > issue: a warning in dealloc_pd, since pd->use_cnt isn't zero (I guess
> > it is decremented by rdma-destroy-id?)
> > [ 126.768366] ib_srpt srpt_remove_one(qedr0): nothing to do.
> > [ 126.768938] rpcrdma: removing device qedr0 for 192.168.10.57:20049
> > [ 126.769240] WARNING: CPU: 6 PID: 550 at
> > drivers/infiniband/core/verbs.c:317 ib_dealloc_pd+0x6b/0x80 [ib_core]
> >
> > What is supposed to lead to destroying the rdma-cm ID at any stage? I
> > understand the race you're describing, but I'm not clear on what is
> > supposed to indicate to rdma-cm that this ID should be freed, and when.
>
> The core is supposed to free that ID when the ULP connect upcall returns
> non-zero.
>
> How about moving the ib_dealloc_pd() to after rpcrdma_mrs_destroy()?
>
>         rpcrdma_dma_unmap_regbuf(req->rl_recvbuf);
>     }
>     rpcrdma_mrs_destroy(buf);
> +++ ib_dealloc_pd(ia->ri_pd);
>

This solved the resource leak. However, I hit a NULL pointer dereference
if I reboot afterwards (regardless of the fix you suggested):

mount -o rdma,port=20049 192.168.10.57:/tmp/nfs-server /tmp/nfs-client
rmmod qedr
reboot

The oops is listed below. This does not happen if I perform umount before rmmod qedr.
It also does not happen if I reboot without running rmmod qedr first.

Thanks,
Michal

[ 100.090800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 100.090872] IP: rpcrdma_ep_destroy+0x1d/0x60 [rpcrdma]
[ 100.090902] PGD 0 P4D 0
[ 100.090922] Oops: 0000 [#1] SMP PTI
[ 100.090944] Modules linked in: nfsv3 netconsole qede qed crc8 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bonding rpcrdma ib_isert iscsi_target_mod ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt gpio_ich iTCO_vendor_support ipmi_si pcspkr i2c_i801 lpc_ich ipmi_devintf sg ipmi_msghandler ioatdma shpchp
[ 100.091335] i7core_edac acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sr_mod sd_mod cdrom ata_generic pata_acpi mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix igb libata crc32c_intel e1000 dca i2c_algo_bit i2c_core [last unloaded: qedr]
[ 100.091497] CPU: 2 PID: 62 Comm: kworker/2:1 Not tainted 4.16.0-rc5-kdump+ #21
[ 100.091537] Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.00.0059.082320111421 08/23/2011
[ 100.091610] Workqueue: events xprt_destroy_cb [sunrpc]
[ 100.091647] RIP: 0010:rpcrdma_ep_destroy+0x1d/0x60 [rpcrdma]
[ 100.091680] RSP: 0018:ffffb929c1b17e70 EFLAGS: 00010282
[ 100.091711] RAX: 0000000000000000 RBX: ffff9ec41816a6c0 RCX: 0000000000000000
[ 100.091750] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ec299d21d00
[ 100.091789] RBP: ffff9ec41816a528 R08: ffff9ec207800000 R09: ffff9ec299d21d00
[ 100.091828] R10: 0000000000000000 R11: 0000000000000040 R12: ffff9ec299ca5e00
[ 100.091867] R13: 0000000000000000 R14: ffff9ec206f2f840 R15: ffff9ec41816a3b0
[ 100.091906] FS: 0000000000000000(0000) GS:ffff9ec299c80000(0000) knlGS:0000000000000000
[ 100.091950] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 100.091983] CR2: 0000000000000010 CR3: 00000001d300a003 CR4: 00000000000206e0
[ 100.092022] Call Trace:
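
For anyone following the thread, here is a rough userspace sketch of why the
teardown order discussed above can matter. It is not the rpcrdma or ib_core
code: every name in it (model_pd, model_mr, model_teardown and so on) is
hypothetical, and it assumes that outstanding MRs were what kept the PD use
count above zero when the WARNING in ib_dealloc_pd fired. Under that
assumption, deallocating the PD only after the MRs are destroyed avoids the
warning.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a protection domain with a use count. */
struct model_pd {
    int use_cnt;            /* objects still referencing this PD */
};

/* Hypothetical stand-in for a memory region that pins its PD. */
struct model_mr {
    struct model_pd *pd;
};

/* Allocating an MR takes a reference on the PD it belongs to. */
static struct model_mr *model_mr_alloc(struct model_pd *pd)
{
    struct model_mr *mr = malloc(sizeof(*mr));

    if (!mr)
        return NULL;
    mr->pd = pd;
    pd->use_cnt++;
    return mr;
}

/* Destroying an MR drops its reference on the PD. */
static void model_mr_destroy(struct model_mr *mr)
{
    mr->pd->use_cnt--;
    free(mr);
}

/* Returns 1 if the dealloc would warn (PD still in use), 0 otherwise. */
static int model_pd_dealloc(const struct model_pd *pd)
{
    if (pd->use_cnt != 0) {
        fprintf(stderr, "WARNING: PD still has %d user(s)\n", pd->use_cnt);
        return 1;
    }
    return 0;
}

/*
 * Create nmrs MRs on one PD, then tear everything down.
 * mrs_first != 0: destroy the MRs, then dealloc the PD.
 * mrs_first == 0: dealloc the PD while the MRs still exist.
 * Returns the warn flag reported by model_pd_dealloc().
 */
int model_teardown(int nmrs, int mrs_first)
{
    struct model_pd pd = { 0 };
    struct model_mr *mrs[8];
    int i, warned;

    assert(nmrs >= 0 && nmrs <= 8);
    for (i = 0; i < nmrs; i++) {
        mrs[i] = model_mr_alloc(&pd);
        assert(mrs[i] != NULL);
    }

    if (mrs_first) {
        for (i = 0; i < nmrs; i++)
            model_mr_destroy(mrs[i]);
        warned = model_pd_dealloc(&pd);
    } else {
        warned = model_pd_dealloc(&pd);
        for (i = 0; i < nmrs; i++)
            model_mr_destroy(mrs[i]);
    }
    return warned;
}
```

Calling model_teardown(4, 0) models the original order and reports the
warning; model_teardown(4, 1) models the reordered teardown and does not.
The model says nothing about the NULL pointer dereference in
rpcrdma_ep_destroy, which is a separate problem.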