Re: [PATCH v4 2/4] bus: mhi: host: Drop chan lock before queuing buffers

Manivannan Sadhasivam <mani@xxxxxxxxxx> · Thu, 30 Nov 2023 11:01:57 +0530

On Wed, Nov 29, 2023 at 11:29:07AM +0800, Qiang Yu wrote:
> 
> On 11/28/2023 9:32 PM, Manivannan Sadhasivam wrote:
> > On Mon, Nov 27, 2023 at 03:13:55PM +0800, Qiang Yu wrote:
> > > On 11/24/2023 6:04 PM, Manivannan Sadhasivam wrote:
> > > > On Tue, Nov 14, 2023 at 01:27:39PM +0800, Qiang Yu wrote:
> > > > > Ensure read and write locks for the channel are not taken in succession by
> > > > > dropping the read lock from parse_xfer_event() such that a callback given
> > > > > to client can potentially queue buffers and acquire the write lock in that
> > > > > process. Any queueing of buffers should be done without channel read lock
> > > > > acquired as it can result in multiple locks and a soft lockup.
> > > > > 
> > > > Is this patch trying to fix an existing issue in client drivers or a potential
> > > > issue in future drivers?
> > > > 
> > > > Even if you take care of disabled channels, "mhi_event->lock" acquired during
> > > > mhi_mark_stale_events() can cause deadlock, since event lock is already held by
> > > > mhi_ev_task().
> > > > 
> > > > I'd prefer not to open the window unless this patch is fixing a real issue.
> > > > 
> > > > - Mani
> > > In [PATCH v4 1/4] bus: mhi: host: Add spinlock to protect WP access when
> > > queueing TREs, we add
> > > write_lock_bh(&mhi_chan->lock)/write_unlock_bh(&mhi_chan->lock)
> > > in mhi_gen_tre, which may be invoked as part of mhi_queue in client xfer
> > > callback, so we have to use read_unlock_bh(&mhi_chan->lock) here to avoid
> > > acquiring mhi_chan->lock twice.
> > > 
> > > Sorry for confusing you. Do you think we need to squash these two patches
> > > into one?
> > Well, if patch 1 is introducing a potential deadlock, then we should fix patch
> > 1 itself and not introduce a follow up patch.
> > 
> > But there is one more issue that I pointed out in my previous reply.
> Sorry, I cannot understand why "mhi_event->lock" acquired during
> mhi_mark_stale_events() can cause deadlock. In mhi_ev_task(), we will
> not invoke mhi_mark_stale_events(). Can you explain?

Going by your theory, if a channel gets disabled while an event is being
processed, the process trying to disable the channel will try to acquire
"mhi_event->lock", which is already held by the process handling the event.

- Mani

> > 
> > Also, I'm planning to clean up the locking mess within MHI in the coming days.
> > Perhaps we can revisit this series at that point of time. Will that be OK for
> > you?
> Sure, that will be great.
> > 
> > - Mani
> > 
> > > > > Signed-off-by: Qiang Yu <quic_qianyu@xxxxxxxxxxx>
> > > > > ---
> > > > >    drivers/bus/mhi/host/main.c | 4 ++++
> > > > >    1 file changed, 4 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c
> > > > > index 6c6d253..c4215b0 100644
> > > > > --- a/drivers/bus/mhi/host/main.c
> > > > > +++ b/drivers/bus/mhi/host/main.c
> > > > > @@ -642,6 +642,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > > > >    			mhi_del_ring_element(mhi_cntrl, tre_ring);
> > > > >    			local_rp = tre_ring->rp;
> > > > > +			read_unlock_bh(&mhi_chan->lock);
> > > > > +
> > > > >    			/* notify client */
> > > > >    			mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result);
> > > > > @@ -667,6 +669,8 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl,
> > > > >    					kfree(buf_info->cb_buf);
> > > > >    				}
> > > > >    			}
> > > > > +
> > > > > +			read_lock_bh(&mhi_chan->lock);
> > > > >    		}
> > > > >    		break;
> > > > >    	} /* CC_EOT */
> > > > > -- 
> > > > > 2.7.4
> > > > > 
> > > > > 
> 

-- 
மணிவண்ணன் சதாசிவம்