> On Mar 13, 2018, at 4:35 PM, Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx> wrote:
> 
>> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
>> Sent: Tuesday, March 13, 2018 5:27 PM
>> 
>>>> Hi Chuck, the PD that is not freed here by rpcrdma is freed if we
>>>> issue a umount.
>>>> 
>>>> Mount: this is the creation of the pd:
>>>> [ 1162.401116] ? rpcrdma_create_id+0x20b/0x270 [rpcrdma]
>>>> [ 1162.401124] rpcrdma_ia_open+0x40/0xe0 [rpcrdma]
>>>> [ 1162.401132] xprt_setup_rdma+0x110/0x3a0 [rpcrdma]
>>>> [ 1162.401147] xprt_create_transport+0x7d/0x210 [sunrpc]
>>>> [ 1162.401161] rpc_create+0xc5/0x1c0 [sunrpc]
>>>> 
>>>> Umount:
>>>> [ 1011.602701] qedr_dealloc_pd+0x18/0x90 [qedr]
>>>> [ 1011.602709] ib_dealloc_pd+0x45/0x80 [ib_core]
>>>> [ 1011.602716] rpcrdma_ia_close+0x57/0x70 [rpcrdma]
>>>> [ 1011.602719] xprt_rdma_destroy+0x4d/0xb0 [rpcrdma]
>>> 
>>> That is by design. Whether that design is correct or not remains to
>>> be seen.
>>> 
>>> It wasn't clear to me that deallocating the PD on device removal was
>>> necessary. At least the ID has to stay around until the core removes
>>> it.
>>> 
>>> No-one complained about the missing ib_dealloc_pd during review.
>>> 
>>> And, since I was able to unload the device driver with the current
>>> design, I thought my assumption about leaving the PD was correct.
>>> Under normal circumstances, with the current kernel, this is still
>>> the case, and I don't see restracker warnings unless the transport
>>> is in some pathological state.
>>> 
>>>> Why not call rpcrdma_ia_close from rpcrdma_ia_remove?
>>> 
>>> rpcrdma_ia_close also destroys the ID.
>>> 
>>> I suppose that since the actual work of tearing things down is done
>>> in another thread, it would be safe for xprtrdma to destroy the ID
>>> itself, rather than having the core do it once the upcall returns.
>>> In at least one of the prototypes, the tear-down was done in the
>>> upcall thread, so the ID had to be left alone. That aspect of the
>>> design has stayed in the code -- perhaps unnecessarily?
>> 
>> I take that back: the core is holding a mutex during the upcall, so
>> calling rdma_destroy_id will likely deadlock no matter what thread is
>> calling.
>> 
>> The most back-portable approach might be to dealloc the PD in
>> rpcrdma_ia_remove. rpcrdma_ia_close and rpcrdma_ia_remove can then be
>> de-duplicated in a subsequent patch.
>> 
>> 447         ib_free_cq(ep->rep_attr.recv_cq);
>> 448         ib_free_cq(ep->rep_attr.send_cq);
>> +++         ib_dealloc_pd(ia->ri_pd);
>> 449
> 
> Hi Chuck, thanks for the suggestion. However, this led to another
> issue: a warning in dealloc_pd, since pd->use_cnt isn't zero (I guess
> it would have been decremented by rdma_destroy_id?).
> 
> [ 126.768366] ib_srpt srpt_remove_one(qedr0): nothing to do.
> [ 126.768938] rpcrdma: removing device qedr0 for 192.168.10.57:20049
> [ 126.769240] WARNING: CPU: 6 PID: 550 at drivers/infiniband/core/verbs.c:317 ib_dealloc_pd+0x6b/0x80 [ib_core]
> 
> What is supposed to lead to destroying the rdma_cm ID at any stage? I
> understand the race you're describing, but it's not clear to me what
> is supposed to indicate to rdma_cm, and when, that this ID should be
> freed.

The core is supposed to free that ID when the ULP connect upcall
returns non-zero.

How about moving the ib_dealloc_pd() to after rpcrdma_mrs_destroy()?

462                 rpcrdma_dma_unmap_regbuf(req->rl_recvbuf);
463         }
464         rpcrdma_mrs_destroy(buf);
+++         ib_dealloc_pd(ia->ri_pd);
465

> I know that all user context rdma_cm IDs are freed when an application
> exits, but I thought ULPs are responsible for freeing their resources.
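To illustrate the contract described above, here is a minimal,
hypothetical sketch of a ULP's rdma_cm event handler (the handler name
is made up; only the return-value convention matters):

/* Hypothetical ULP event handler, sketching the rdma_cm contract:
 * returning non-zero from the DEVICE_REMOVAL upcall tells the core
 * to destroy this rdma_cm_id once the handler returns.  The ULP
 * must not call rdma_destroy_id() from this context, because the
 * core holds a mutex across the upcall.
 */
static int example_cm_event_handler(struct rdma_cm_id *id,
				    struct rdma_cm_event *event)
{
	switch (event->event) {
	case RDMA_CM_EVENT_DEVICE_REMOVAL:
		/* schedule resource tear-down in another context ... */
		return 1;	/* non-zero: core frees the ID for us */
	default:
		return 0;
	}
}

This is also why xprtrdma does its actual tear-down work in another
thread: the mutex held across the upcall makes it unsafe to destroy
the ID from the handler itself.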
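And on the use_cnt warning itself: each MR allocated on the PD holds a
reference, so the dealloc has to follow rpcrdma_mrs_destroy(). A
simplified sketch of the ib_core refcounting involved (not the
verbatim kernel source; names follow ib_core):

/* Simplified sketch of ib_core's PD reference counting: every MR
 * allocated on a PD pins it, and the WARN at
 * drivers/infiniband/core/verbs.c:317 fires when the PD is
 * deallocated while references remain.
 */
struct ib_mr *ib_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
			  u32 max_num_sg)
{
	struct ib_mr *mr = pd->device->alloc_mr(pd, mr_type, max_num_sg);

	if (!IS_ERR(mr))
		atomic_inc(&pd->usecnt);	/* the MR pins its PD */
	return mr;
}

int ib_dereg_mr(struct ib_mr *mr)
{
	struct ib_pd *pd = mr->pd;
	int ret = mr->device->dereg_mr(mr);

	if (!ret)
		atomic_dec(&pd->usecnt);	/* drop the MR's reference */
	return ret;
}

void ib_dealloc_pd(struct ib_pd *pd)
{
	WARN_ON(atomic_read(&pd->usecnt));	/* verbs.c:317 above */
	pd->device->dealloc_pd(pd);
}

That is why moving ib_dealloc_pd() after rpcrdma_mrs_destroy(), as in
the snippet above, should avoid the WARN you hit.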
> 
> Thanks,
> Michal
> 
> 
>> 
>> Fixes: bebd03186 ("xprtrdma: Support unplugging an HCA from under an NFS mount")
>> 
>> Can you give that a try?
>> 
>> 
>>> Advice on this is welcome!
>>> 
>>> 
>>>> Thanks,
>>>> Michal
>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Michal
>>>>>>>>>> 
>>>>>>>>>> GAD17990 login: [ 300.480137] ib_srpt srpt_remove_one(qedr0): nothing to do.
>>>>>>>>>> [ 300.515527] ib_srpt srpt_remove_one(qedr1): nothing to do.
>>>>>>>>>> [ 300.542182] rpcrdma: removing device qedr1 for 192.168.110.146:20049
>>>>>>>>>> [ 300.573789] WARNING: CPU: 12 PID: 3545 at drivers/infiniband/core/restrack.c:20 rdma_restrack_clean+0x25/0x30 [ib_core]
>>>>>>>>>> [ 300.625985] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm 8021q garp mrp qedr(-) ib_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables fuse ip6table_filter ip6_tables iptable_filter dm_mirror dm_region_hash dm_log dm_mod vfat fat dax intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd ipmi_si
>>>>>>>>>> [ 300.972993] iTCO_wdt ipmi_devintf sg pcspkr iTCO_vendor_support hpwdt hpilo lpc_ich ipmi_msghandler pcc_cpufreq ioatdma i2c_i801 mfd_core wmi shpchp dca acpi_power_meter i2c_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod qede qed crc32c_intel tg3 hpsa scsi_transport_sas crc8
>>>>>>>>>> [ 301.109036] CPU: 12 PID: 3545 Comm: rmmod Not tainted 4.16.0-rc1 #1
>>>>>>>>>> [ 301.139518] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 02/17/2017
>>>>>>>>>> [ 301.180411] RIP: 0010:rdma_restrack_clean+0x25/0x30 [ib_core]
>>>>>>>>>> [ 301.208350] RSP: 0018:ffffb1820478fe88 EFLAGS: 00010286
>>>>>>>>>> [ 301.233241] RAX: 0000000000000000 RBX: ffffa099ed1b4070 RCX: ffffdf02a193c800
>>>>>>>>>> [ 301.268001] RDX: ffffa095ed12d7a0 RSI: 0000000000025900 RDI: ffffa099ed1b47d0
>>>>>>>>>> [ 301.302530] RBP: ffffa099ed1b4070 R08: ffffa095de9dd000 R09: 0000000180080007
>>>>>>>>>> [ 301.337245] R10: 0000000000000001 R11: ffffa095de9dd000 R12: ffffa099ed1b4000
>>>>>>>>>> [ 301.372151] R13: ffffa099ed1b405c R14: 0000000000e231c0 R15: 0000000000e23010
>>>>>>>>>> [ 301.407384] FS: 00007f2b0c854740(0000) GS:ffffa099ff700000(0000) knlGS:0000000000000000
>>>>>>>>>> [ 301.447026] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>>>> [ 301.475409] CR2: 0000000000e2caf8 CR3: 0000000865c0d006 CR4: 00000000001606e0
>>>>>>>>>> [ 301.510892] Call Trace:
>>>>>>>>>> [ 301.522715]  ib_unregister_device+0xf5/0x190 [ib_core]
>>>>>>>>>> [ 301.547966]  qedr_remove+0x37/0x60 [qedr]
>>>>>>>>>> [ 301.568393]  qede_rdma_unregister_driver+0x4b/0x90 [qede]
>>>>>>>>>> [ 301.594980]  SyS_delete_module+0x168/0x240
>>>>>>>>>> [ 301.615057]  do_syscall_64+0x6f/0x1a0
>>>>>>>>>> [ 301.633588]  entry_SYSCALL_64_after_hwframe+0x21/0x86
>>>>>>>>>> [ 301.658657] RIP: 0033:0x7f2b0bd33707
>>>>>>>>>> [ 301.676005] RSP: 002b:00007ffdefa29d98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
>>>>>>>>>> [ 301.713324] RAX: ffffffffffffffda RBX: 0000000000e231c0 RCX: 00007f2b0bd33707
>>>>>>>>>> [ 301.748186] RDX: 00007f2b0bda3a80 RSI: 0000000000000800 RDI: 0000000000e23228
>>>>>>>>>> [ 301.782960] RBP: 0000000000000000 R08: 00007f2b0bff8060 R09: 00007f2b0bda3a80
>>>>>>>>>> [ 301.818142] R10: 00007ffdefa29b20 R11: 0000000000000202 R12: 00007ffdefa2b70d
>>>>>>>>>> [ 301.853290] R13: 0000000000000000 R14: 0000000000e231c0 R15: 0000000000e23010
>>>>>>>>>> [ 301.888138] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 83 c7 28 31 c0 eb 0c 48 83 c0 08 48 3d 00 08 00 00 74 0f 48 8d 14 07 48 8b 12 48 85 d2 74 e8 <0f> ff c3 f3 c3 66 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 47 28
>>>>>>>>>> [ 301.981140] ---[ end trace 28dec8f15205789a ]---

--
Chuck Lever