Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tuesday, September 09/17/19, 2019 at 22:50:27 +0530, Sagi Grimberg wrote:
> 
> >> To: "Krishnamraju Eraparaju" <krishna2@xxxxxxxxxxx>
> >> From: "Jason Gunthorpe" <jgg@xxxxxxxx>
> >> Date: 09/16/2019 06:28PM
> >> Cc: "Steve Wise" <larrystevenwise@xxxxxxxxx>, "Bernard Metzler"
> >> <BMT@xxxxxxxxxxxxxx>, "Sagi Grimberg" <sagi@xxxxxxxxxxx>,
> >> "linux-rdma@xxxxxxxxxxxxxxx" <linux-rdma@xxxxxxxxxxxxxxx>
> >> Subject: [EXTERNAL] Re: Re: [PATCH v3] iwcm: don't hold the irq
> >> disabled lock on iw_rem_ref
> >>
> >> On Wed, Sep 11, 2019 at 09:28:16PM +0530, Krishnamraju Eraparaju
> >> wrote:
> >>> Hi Steve & Bernard,
> >>>
> >>> Thanks for the review comments.
> >>> I will do those formating changes.
> >>
> >> I don't see anything in patchworks, but the consensus is to drop
> >> Sagi's patch pending this future patch?
> >>
> >> Jason
> >>
> > This is my impression as well. But consensus should be
> > explicit...Sagi, what do you think?
> 
> I don't really care, but given the changes from Krishnamraju cause other
> problems I'd ask if my version is also offending his test.
Hi Sagi,
Your version holds good for rdma_destroy_id() path only, but there are
other places where iw_rem_ref() is being called within spinlocks, here is
the call trace when iw_rem_ref() is called in cm_close_handler() path:
[  627.716339] Call Trace:
[  627.716339]  ? load_new_mm_cr3+0xc0/0xc0
[  627.716339]  on_each_cpu+0x23/0x40
[  627.716339]  flush_tlb_kernel_range+0x74/0x80
[  627.716340]  free_unmap_vmap_area+0x2a/0x40
[  627.716340]  remove_vm_area+0x59/0x60
[  627.716340]  __vunmap+0x46/0x210
[  627.716340]  siw_free_qp+0x88/0x120 [siw]
[  627.716340]  cm_work_handler+0x5b8/0xd00  <=====
[  627.716340]  process_one_work+0x155/0x380
[  627.716341]  worker_thread+0x41/0x3b0
[  627.716341]  kthread+0xf3/0x130
[  627.716341]  ? process_one_work+0x380/0x380
[  627.716341]  ? kthread_bind+0x10/0x10
[  627.716341]  ret_from_fork+0x35/0x40

Hence, I moved all the occurances of iw_rem_ref() outside of spinlocks
critical section.
> 
> In general, I do not think that making resources free routines (both
> explict or implicit via ref dec) under a spinlock is not a robust
> design.
> 
> I would first make it clear and documented what cm_id_priv->lock is
> protecting. In my mind, it should protect *its own* mutations of
> cm_id_priv and by design leave all the ops calls outside the lock.
> 
> I don't understand what is causing the Chelsio issue observed, but
> it looks like c4iw_destroy_qp blocks on a completion that depends on a
> refcount that is taken also by iwcm, which means that I cannot call
> ib_destroy_qp if the cm is not destroyed as well?
> 
> If that is madatory, I'd say that instead of blocking on this
> completion, we can simply convert c4iw_qp_rem_ref if use a kref
> which is not order dependent.

I agree,
I'm exploring best possible ways to address this issue, will update my
final resolution by this Friday.



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux