Re: rdma resource warning on 4.16-rc1 when unloading qedr after NFS mount

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Mar 13, 2018, at 4:35 PM, Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx> wrote:
> 
>> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
>> Sent: Tuesday, March 13, 2018 5:27 PM
>>>> Hi Chuck, the PD that is not freed here by rpcrdma is freed if we issue a
>> umount.
>>>> 
>>>> Mount: this is the creation of the pd:
>>>> [ 1162.401116]  ? rpcrdma_create_id+0x20b/0x270 [rpcrdma] [
>>>> 1162.401124]  rpcrdma_ia_open+0x40/0xe0 [rpcrdma] [ 1162.401132]
>>>> xprt_setup_rdma+0x110/0x3a0 [rpcrdma] [ 1162.401147]
>>>> xprt_create_transport+0x7d/0x210 [sunrpc] [ 1162.401161]
>>>> rpc_create+0xc5/0x1c0 [sunrpc]
>>>> 
>>>> Umount:
>>>> [ 1011.602701]  qedr_dealloc_pd+0x18/0x90 [qedr] [ 1011.602709]
>>>> ib_dealloc_pd+0x45/0x80 [ib_core] [ 1011.602716]
>>>> rpcrdma_ia_close+0x57/0x70 [rpcrdma] [ 1011.602719]
>>>> xprt_rdma_destroy+0x4d/0xb0 [rpcrdma]
>>> 
>>> That is by design. Whether that design is correct or not remains to be seen.
>>> 
>>> It wasn't clear to me that deallocating the PD on device removal was
>>> necessary. At least the ID has to stay around until the core removes it.
>>> 
>>> No-one complained about the missing ib_dealloc_pd during review.
>>> 
>>> And, since I was able to unload the device driver with the current
>>> design, I thought my assumption about leaving the PD was correct.
>>> Under normal circumstances, with the current kernel, this is still the
>>> case, and I don't see restracker warnings unless the transport is in some
>> pathological state.
>>> 
>>> 
>>>> Why not call rpcrdma_ia_close from rpcrdma_ia_remove
>>> 
>>> rpcrdma_ia_close also destroys the ID.
>>> 
>>> I suppose that since the actual work of tearing things down is done in
>>> another thread, it would be safe for xprtrdma to destroy the ID
>>> itself, rather than having the core do it once the upcall returns. In
>>> at least one of the prototypes, the tear-down was done in the upcall
>>> thread, so the ID had to be left alone. That aspect of the design has
>>> stayed in the code--perhaps unnecessarily?
>> 
>> I take that back: the core is holding a mutex during the upcall, so calling
>> rdma_destroy_id will likely deadlock no matter what thread is calling.
>> 
>> The most back-portable approach might be to dealloc the PD in
>> rpcrdma_ia_remove. rpcrdma_ia_close and rpcrdma_ia_remove can then be
>> de-duplicated in a subsequent patch.
>> 
>> 447 	ib_free_cq(ep->rep_attr.recv_cq);
>> 448 	ib_free_cq(ep->rep_attr.send_cq);
>> +++ 	ib_dealloc_pd(ia->ri_pd);
>> 449
> 
> Hi Chuck, thanks for the suggestion. However, this lead to another issue
> A warning in dealloc_pd since pd->use_cnt isn't zero ( I guess decremented by rdma-destroy-id)? 
> [  126.768366] ib_srpt srpt_remove_one(qedr0): nothing to do.
> [  126.768938] rpcrdma: removing device qedr0 for 192.168.10.57:20049
> [  126.769240] WARNING: CPU: 6 PID: 550 at drivers/infiniband/core/verbs.c:317 ib_dealloc_pd+0x6b/0x80 [ib_core]
> 
> What is supposed to lead to destroying the rdma-cm- ID at any stage? I understand the race you're describing, 
> But not clear on what's supposed to indicate to rdma-cm and when that this id should be freed?

The core is supposed to free that ID when the ULP connect upcall returns
non-zero.

How about moving the ib_dealloc_pd() to after rpcrdma_mrs_destroy() ?

 462 		rpcrdma_dma_unmap_regbuf(req->rl_recvbuf);
 463 	}
 464 	rpcrdma_mrs_destroy(buf);
 +++ 	ib_dealloc_pd(ia->ri_pd);
 465 


> I know that all user context rdma-cm-id are freed when an application exists, but thought ULPs are responsible
> For freeing their resources. 
> 
> Thanks,
> Michal
> 
> 
>> 
>> Fixes: bebd03186 ("xprtrdma: Support unplugging an HCA from under an NFS
>> mount")
>> 
>> Can you give that a try?
>> 
>> 
>>> Advice on this is welcome!
>>> 
>>> 
>>>> Thanks,
>>>> Michal
>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Michal
>>>>>>>>>> 
>>>>>>>>>> GAD17990 login: [  300.480137] ib_srpt srpt_remove_one(qedr0):
>>>>>>>>>> nothing
>>>>>>>> to do.
>>>>>>>>>> [  300.515527] ib_srpt srpt_remove_one(qedr1): nothing to do.
>>>>>>>>>> [  300.542182] rpcrdma: removing device qedr1 for
>>>>>>>>>> 192.168.110.146:20049 [  300.573789] WARNING: CPU: 12 PID: 3545
>>>>>>>>>> at
>>>>>>>>>> drivers/infiniband/core/restrack.c:20
>>>>>>>>>> rdma_restrack_clean+0x25/0x30 [ib_core] [  300.625985] Modules
>>>>>>>>>> linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache
>>>>>>>>>> rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi
>>>>>>>>>> scsi_transport_iscsi ib_srpt target_core_mod ib_srp
>>>>>>>>>> scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
>>>>>>>>>> rdma_cm ib_cm iw_cm 8021q
>>>>> garp
>>>>>>>>>> mrp qedr(-) ib_core xt_CHECKSUM iptable_mangle
>>>>> ipt_MASQUERADE
>>>>>>>>>> nf_nat_masquerade_ipv4
>>>>>> iptable_nat
>>>>>>>>>> nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>>>>>>>>> xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge
>>>>>>>>>> stp llc ebtable_filter ebtables fuse ip6table_filter ip6_tables
>>>>>>>>>> iptable_filter dm_mirror dm_region_hash dm_log dm_mod vfat
>> fat
>>>>>>>>>> dax intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp
>>>>>> coretemp
>>>>>>>>>> kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
>>>>>>>>>> ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper
>>>>>>>>>> cryptd ipmi_si [  300.972993]  iTCO_wdt ipmi_devintf sg pcspkr
>>>>>>>> iTCO_vendor_support hpwdt hpilo lpc_ich ipmi_msghandler
>>>>> pcc_cpufreq
>>>>>>>> ioatdma i2c_i801 mfd_core wmi shpchp dca acpi_power_meter
>>>>>>>> i2c_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables
>>>>>>>> xfs libcrc32c sd_mod qede qed crc32c_intel tg3 hpsa
>>>>>>>> scsi_transport_sas crc8 [
>>>>>> 301.109036] CPU: 12 PID:
>>>>>>>> 3545 Comm: rmmod Not tainted 4.16.0-rc1 #1 [  301.139518]
>>>>>>>> Hardware
>>>>>> name:
>>>>>>>> HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 02/17/2017 [
>>>>>>>> 301.180411] RIP: 0010:rdma_restrack_clean+0x25/0x30 [ib_core] [
>>>>>>>> 301.208350] RSP: 0018:ffffb1820478fe88 EFLAGS: 00010286 [
>>>>>>>> 301.233241]
>>>>>>>> RAX: 0000000000000000 RBX: ffffa099ed1b4070 RCX:
>> ffffdf02a193c800
>>>>>>>> [ 301.268001] RDX: ffffa095ed12d7a0 RSI: 0000000000025900 RDI:
>>>>>>>> ffffa099ed1b47d0 [  301.302530] RBP: ffffa099ed1b4070 R08:
>>>>>>>> ffffa095de9dd000 R09: 0000000180080007 [  301.337245] R10:
>>>>>>>> 0000000000000001 R11: ffffa095de9dd000 R12: ffffa099ed1b4000 [
>>>>>>>> 301.372151] R13: ffffa099ed1b405c R14: 0000000000e231c0 R15:
>>>>>>>> 0000000000e23010 [  301.407384] FS:  00007f2b0c854740(0000)
>>>>>>>> GS:ffffa099ff700000(0000) knlGS:0000000000000000 [  301.447026]
>> CS:
>>>>>>>> 0010
>>>>>>>> DS: 0000 ES: 0000 CR0: 0000000080050033 [  301.475409] CR2:
>>>>>>>> 0000000000e2caf8 CR3: 0000000865c0d006 CR4: 00000000001606e0 [
>>>>>>>> 301.510892] Call Trace:
>>>>>>>>>> [  301.522715]  ib_unregister_device+0xf5/0x190 [ib_core] [
>>>>>>>>>> 301.547966]  qedr_remove+0x37/0x60 [qedr] [  301.568393]
>>>>>>>>>> qede_rdma_unregister_driver+0x4b/0x90 [qede] [  301.594980]
>>>>>>>>>> SyS_delete_module+0x168/0x240 [  301.615057]
>>>>>>>>>> do_syscall_64+0x6f/0x1a0 [  301.633588]
>>>>>>>>>> entry_SYSCALL_64_after_hwframe+0x21/0x86
>>>>>>>>>> [  301.658657] RIP: 0033:0x7f2b0bd33707 [  301.676005] RSP:
>>>>>>>>>> 002b:00007ffdefa29d98 EFLAGS: 00000202 ORIG_RAX:
>>>>>> 00000000000000b0
>>>>>>>> [
>>>>>>>>>> 301.713324] RAX: ffffffffffffffda RBX: 0000000000e231c0 RCX:
>>>>>>>>>> 00007f2b0bd33707 [  301.748186] RDX: 00007f2b0bda3a80 RSI:
>>>>>>>>>> 0000000000000800 RDI: 0000000000e23228 [  301.782960] RBP:
>>>>>>>>>> 0000000000000000 R08: 00007f2b0bff8060 R09: 00007f2b0bda3a80
>> [
>>>>>>>>>> 301.818142] R10: 00007ffdefa29b20 R11: 0000000000000202 R12:
>>>>>>>>>> 00007ffdefa2b70d [  301.853290] R13: 0000000000000000 R14:
>>>>>>>>>> 0000000000e231c0 R15: 0000000000e23010 [  301.888138] Code: 84
>>>>>>>>>> 00
>>>>>>>>>> 00
>>>>>>>>>> 00 00 00 0f 1f 44 00 00 48 83 c7 28 31 c0 eb 0c 48 83 c0 08 48
>>>>>>>>>> 3d
>>>>>>>>>> 00
>>>>>>>>>> 08 00 00 74 0f 48 8d 14 07 48 8b 12 48 85 d2 74 e8 <0f> ff c3
>>>>>>>>>> f3
>>>>>>>>>> c3
>>>>>>>>>> 66 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 47 28 [  301.981140]
>>>>>>>>>> ---[ end trace 28dec8f15205789a ]---
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Chuck Lever
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>>>>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
>>>>>> majordomo
>>>>>>> info at  http://vger.kernel.org/majordomo-info.html
>>>>>> 
>>>>>> --
>>>>>> Chuck Lever
>>>>>> 
>>>>>> 
>>>>> 
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe
>>>>> linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
>> majordomo
>>>> info at  http://vger.kernel.org/majordomo-info.html
>>> 
>>> --
>>> Chuck Lever
>>> 
>>> 
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
>> majordomo
>>> info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever
>> 
>> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux