Re: rdma resource warning on 4.16-rc1 when unloading qedr after NFS mount

> On Mar 13, 2018, at 9:16 AM, Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx> wrote:
> 
>> From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-owner@xxxxxxxxxxxxxxx] On Behalf Of Kalderon, Michal
>> 
>>> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
>>> Sent: Wednesday, February 14, 2018 6:58 PM
>>> 
>>> 
>>>> On Feb 14, 2018, at 11:49 AM, Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx> wrote:
>>>> 
>>>>> From: Leon Romanovsky [mailto:leon@xxxxxxxxxx]
>>>>> Sent: Wednesday, February 14, 2018 6:34 PM
>>>>> To: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>> Cc: Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx>; Le, Thong
>>>>> <Thong.Le@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx
>>>>> Subject: Re: rdma resource warning on 4.16-rc1 when unloading qedr
>>>>> after NFS mount
>>>>> 
>>>>> On Wed, Feb 14, 2018 at 11:20:39AM -0500, Chuck Lever wrote:
>>>>>> 
>>>>>> 
>>>>>>> On Feb 14, 2018, at 11:00 AM, Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> Hi Leon, Chuck,
>>>>>>> 
>>>>>>> We ran an NFS mount over qedr using 4.16-rc1. When unloading qedr we
>>>>>>> get a WARNING from the resource tracker (pasted below).
>>>>>>> 
>>>>>>> Can you please advise on the best way to debug this? How can we get
>>>>>>> more info on the resource not being freed?
>>>>>> 
>>>>>> I haven't seen this kind of report before, so I can't directly
>>>>>> answer your questions. But can you tell us more about reproducing it:
>>>>> 
>>>>> This is the resource tracking that was added in the last merge window.
>>>>> 
>>>>>> 
>>>>>> - Is there a workload running on the NFS mount point when the
>>>>>> module is unloaded?
>>>> no
>>>>>> 
>>>>>> - Is the issue 100% reproducible, or intermittent?
>>>> Seems to be
>>>>>> 
>>>>>> - Have you tried bisecting?
>>>> No, bisecting is a tough one here since we ran this scenario to
>>>> verify the last two related nfs fixes:
>>>> e89e8d8 xprtrdma: Fix BUG after a device removal
>>>> 1179e2c xprtrdma: Fix calculation of ri_max_send_sges
>>>> 
>>>>> 
>>>>> It will be one of three patches:
>>>>> 9d5f8c209b3f RDMA/core: Add resource tracking for create and destroy PDs
>>>>> 08f294a1524b RDMA/core: Add resource tracking for create and destroy CQs
>>>>> 78a0cd648a80 RDMA/core: Add resource tracking for create and destroy QPs
>>>> Do you think these could lead to a resource not being freed? Or only
>>>> issues with tracking?
>>>> 
>>>>> 
>>>>>> 
>>>>>> - iWARP, RoCE, or both?
>>>> Only tested over RoCE for now
>>>>>> 
>>>>>> - Have you tried reproducing with a different model of device?
>>>> no
>>>>> 
>>>>> I doubt that it is related to the device; it looks like a resource
>>>>> leak while removing rpcrdma.
>>>>> 
>>>>> We definitely need to add more information to this warning to
>>>>> understand which one of the three tracked resources wasn't freed.
>>>> 
>>>> I missed an output from our driver saying there's a PD not freed. As
>>>> mentioned, due to other issues we're not sure whether we've seen this
>>>> message from our driver in the past.
>>> 
>>> When I've tested device unload with rpcrdma.ko, the unload hangs if
>>> rpcrdma.ko doesn't release all resources.
>>> 
>>> rpcrdma_ia_remove() releases transport resources. It destroys the QP
>>> and CQs, but leaves the ID and PD to be destroyed by the device driver
>>> or the core. The CM event handler returns 1 to signal this is the case.
>>> 
>>> I suspect it could be a driver bug.
>> Our driver doesn't take care of releasing PDs; it counts on the layers
>> above to do so. Why should the PD be treated differently from the
>> CQs/QPs in this case? We will look into this further to understand
>> whether this is newly introduced.
>> Thanks
> 
> Hi Chuck, the PD that is not freed here by rpcrdma is freed if we issue a umount.
> 
> Mount: this is the creation of the PD:
> [ 1162.401116]  ? rpcrdma_create_id+0x20b/0x270 [rpcrdma]
> [ 1162.401124]  rpcrdma_ia_open+0x40/0xe0 [rpcrdma]
> [ 1162.401132]  xprt_setup_rdma+0x110/0x3a0 [rpcrdma]
> [ 1162.401147]  xprt_create_transport+0x7d/0x210 [sunrpc]
> [ 1162.401161]  rpc_create+0xc5/0x1c0 [sunrpc]
> 
> Umount: 
> [ 1011.602701]  qedr_dealloc_pd+0x18/0x90 [qedr]
> [ 1011.602709]  ib_dealloc_pd+0x45/0x80 [ib_core]
> [ 1011.602716]  rpcrdma_ia_close+0x57/0x70 [rpcrdma]
> [ 1011.602719]  xprt_rdma_destroy+0x4d/0xb0 [rpcrdma]

That is by design. Whether that design is correct or not remains to be seen.

It wasn't clear to me that deallocating the PD on device removal was
necessary. At least the ID has to stay around until the core removes it.

No one complained about the missing ib_dealloc_pd during review.

And, since I was able to unload the device driver with the current design,
I thought my assumption about leaving the PD was correct. Under normal
circumstances, with the current kernel, this is still the case, and I don't
see restracker warnings unless the transport is in some pathological state.
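The restrack bookkeeping that fires this warning is, in outline, simple: each PD/CQ/QP is registered at create time and unregistered at destroy time, and rdma_restrack_clean() warns at device teardown if anything is still registered. A minimal Python model of that idea (illustrative only; the class and method names are invented for this sketch, not the kernel API):

```python
# Illustrative model of the rdma_restrack bookkeeping: objects are
# registered on create, unregistered on destroy, and anything still
# registered at device removal is a leak; this is the analogue of the
# WARNING at drivers/infiniband/core/restrack.c:20 in the trace above.
from collections import defaultdict

class ResourceTracker:
    def __init__(self):
        # resource type -> set of live resource ids, e.g. "PD" -> {1, 2}
        self._live = defaultdict(set)

    def add(self, res_type, res_id):
        """Called when a resource is created (cf. the tracking patches)."""
        self._live[res_type].add(res_id)

    def delete(self, res_type, res_id):
        """Called when a resource is destroyed."""
        self._live[res_type].discard(res_id)

    def clean(self):
        """Device teardown: report anything still live as leaked."""
        return {t: sorted(ids) for t, ids in self._live.items() if ids}

tracker = ResourceTracker()
tracker.add("PD", 1)     # transport setup allocates a PD
tracker.add("QP", 7)     # and a QP
tracker.delete("QP", 7)  # device-removal path destroys the QP...
leaks = tracker.clean()  # ...but the PD survives to module unload
print(leaks)             # {'PD': [1]}
```

In this model, having clean() report *which* entries leaked is exactly the extra information Leon says above is missing from the bare 4.16-rc1 WARNING.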


> Why not call rpcrdma_ia_close from rpcrdma_ia_remove?

rpcrdma_ia_close also destroys the ID.

I suppose that since the actual work of tearing things down is done in
another thread, it would be safe for xprtrdma to destroy the ID itself,
rather than having the core do it once the upcall returns. In at least
one of the prototypes, the tear-down was done in the upcall thread,
so the ID had to be left alone. That aspect of the design has stayed
in the code--perhaps unnecessarily?
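The asymmetry Michal is pointing at can be stated compactly: per the description above, rpcrdma_ia_remove() tears down the QP and CQs but leaves the ID (for the core to destroy once the upcall returns 1) and, as it turns out, the PD, while rpcrdma_ia_close() on umount releases everything. A toy sketch of who frees what (resource names are illustrative labels, not xprtrdma code):

```python
# Simplified model of the two xprtrdma teardown paths discussed in this
# thread. Neither function is real kernel code; they only encode which
# resources each path releases, as described above.

def ia_remove(resources):
    """Device-removal path: destroys QP and CQs, leaves ID and PD."""
    return resources - {"QP", "CQ"}

def ia_close(resources):
    """Normal umount path: the transport releases everything it owns."""
    return resources - {"QP", "CQ", "ID", "PD"}

live = {"QP", "CQ", "ID", "PD"}
print(sorted(ia_remove(live)))  # ['ID', 'PD'] -> the leftover PD trips restrack
print(sorted(ia_close(live)))   # [] -> nothing left, no warning
```

The open design question is then whether ia_remove() can safely shrink its leftover set to just the ID (or to nothing, if the ID teardown moves out of the upcall thread).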

Advice on this is welcome!


> Thanks,
> Michal
> 
>> 
>>> 
>>> 
>>>>>>> Thanks,
>>>>>>> Michal
>>>>>>> 
>>>>>>> GAD17990 login: [  300.480137] ib_srpt srpt_remove_one(qedr0): nothing to do.
>>>>>>> [  300.515527] ib_srpt srpt_remove_one(qedr1): nothing to do.
>>>>>>> [  300.542182] rpcrdma: removing device qedr1 for 192.168.110.146:20049
>>>>>>> [  300.573789] WARNING: CPU: 12 PID: 3545 at drivers/infiniband/core/restrack.c:20 rdma_restrack_clean+0x25/0x30 [ib_core]
>>>>>>> [  300.625985] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm 8021q garp mrp qedr(-) ib_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables fuse ip6table_filter ip6_tables iptable_filter dm_mirror dm_region_hash dm_log dm_mod vfat fat dax intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd ipmi_si
>>>>>>> [  300.972993]  iTCO_wdt ipmi_devintf sg pcspkr iTCO_vendor_support hpwdt hpilo lpc_ich ipmi_msghandler pcc_cpufreq ioatdma i2c_i801 mfd_core wmi shpchp dca acpi_power_meter i2c_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod qede qed crc32c_intel tg3 hpsa scsi_transport_sas crc8
>>>>>>> [  301.109036] CPU: 12 PID: 3545 Comm: rmmod Not tainted 4.16.0-rc1 #1
>>>>>>> [  301.139518] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 02/17/2017
>>>>>>> [  301.180411] RIP: 0010:rdma_restrack_clean+0x25/0x30 [ib_core]
>>>>>>> [  301.208350] RSP: 0018:ffffb1820478fe88 EFLAGS: 00010286
>>>>>>> [  301.233241] RAX: 0000000000000000 RBX: ffffa099ed1b4070 RCX: ffffdf02a193c800
>>>>>>> [  301.268001] RDX: ffffa095ed12d7a0 RSI: 0000000000025900 RDI: ffffa099ed1b47d0
>>>>>>> [  301.302530] RBP: ffffa099ed1b4070 R08: ffffa095de9dd000 R09: 0000000180080007
>>>>>>> [  301.337245] R10: 0000000000000001 R11: ffffa095de9dd000 R12: ffffa099ed1b4000
>>>>>>> [  301.372151] R13: ffffa099ed1b405c R14: 0000000000e231c0 R15: 0000000000e23010
>>>>>>> [  301.407384] FS:  00007f2b0c854740(0000) GS:ffffa099ff700000(0000) knlGS:0000000000000000
>>>>>>> [  301.447026] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [  301.475409] CR2: 0000000000e2caf8 CR3: 0000000865c0d006 CR4: 00000000001606e0
>>>>>>> [  301.510892] Call Trace:
>>>>>>> [  301.522715]  ib_unregister_device+0xf5/0x190 [ib_core]
>>>>>>> [  301.547966]  qedr_remove+0x37/0x60 [qedr]
>>>>>>> [  301.568393]  qede_rdma_unregister_driver+0x4b/0x90 [qede]
>>>>>>> [  301.594980]  SyS_delete_module+0x168/0x240
>>>>>>> [  301.615057]  do_syscall_64+0x6f/0x1a0
>>>>>>> [  301.633588]  entry_SYSCALL_64_after_hwframe+0x21/0x86
>>>>>>> [  301.658657] RIP: 0033:0x7f2b0bd33707
>>>>>>> [  301.676005] RSP: 002b:00007ffdefa29d98 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
>>>>>>> [  301.713324] RAX: ffffffffffffffda RBX: 0000000000e231c0 RCX: 00007f2b0bd33707
>>>>>>> [  301.748186] RDX: 00007f2b0bda3a80 RSI: 0000000000000800 RDI: 0000000000e23228
>>>>>>> [  301.782960] RBP: 0000000000000000 R08: 00007f2b0bff8060 R09: 00007f2b0bda3a80
>>>>>>> [  301.818142] R10: 00007ffdefa29b20 R11: 0000000000000202 R12: 00007ffdefa2b70d
>>>>>>> [  301.853290] R13: 0000000000000000 R14: 0000000000e231c0 R15: 0000000000e23010
>>>>>>> [  301.888138] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 83 c7 28 31 c0 eb 0c 48 83 c0 08 48 3d 00 08 00 00 74 0f 48 8d 14 07 48 8b 12 48 85 d2 74 e8 <0f> ff c3 f3 c3 66 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 47 28
>>>>>>> [  301.981140] ---[ end trace 28dec8f15205789a ]---
>>>>>> 
>>>>>> --
>>>>>> Chuck Lever
>>>>>> 
>>>>>> 
>>>>>> 
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
>>>> in the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>> 
>>> --
>>> Chuck Lever
>>> 
>>> 
>> 

--
Chuck Lever


