Re: bus: mhi: parse_xfer_event running transfer completion callbacks more than once for a given buffer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2021-08-27 at 16:51 +1200, Paul Davey wrote:
> On Thu, 2021-08-26 at 09:54 -0600, Jeffrey Hugo wrote:
> > On 8/23/2021 12:47 AM, Paul Davey wrote:
> > > Hi Hemant, Jeffery
> > > 
> > > I have some more information after some testing.
> > > 
> > > > > Do you have a log which prints the TRE being processed?
> > > > > Basically i
> > > > > am
> > > > > trying understand this : by the time you get double free
> > > > > issue,
> > > > > is
> > > > > there
> > > > > any pattern with respect to the TRE that is being processed.
> > > > > For
> > > > > example
> > > > > when host processed the given TRE for the first time with
> > > > > RP1,
> > > > > stale
> > > > > TRE
> > > > > was posted by Event RP2 right after RP1
> > > > > 
> > > > > ->RP1 [TRE1]
> > > > > ->RP2 [TRE1]
> > > > > 
> > > > > or occurrence of stale TRE event is random?
> > > 
> > > [...]
> > 
> 
> Secondly we saw the following pattern in completion events:
> 
> mhi mhi0: (IP_HW0_MBIM-Up) Completion Event code: 2 length: 5e2 ptr:
> 7c4004e0
> mhi mhi0: (IP_HW0_MBIM-Up) Completion Event code: 2 length: 5e2 ptr:
> 7c400520
> mhi mhi0: (IP_HW0_MBIM-Up) Completion Event code: 2 length: 5e2 ptr:
> 7c4004c0
> mhi mhi0: (IP_HW0_MBIM-Up) Completion Event code: 2 length: 5e2 ptr:
> 7c4004b0
> mhi mhi0: (IP_HW0_MBIM-Up) Completion Event code: 2 length: 5e2 ptr:
> 7c4004a0
> 
> Here we can see that instead of a completion event for 7c4004d0 we
> have
> one for 7c400520 which is significantly ahead of the other point and
> from the list of TREs I store in mhi_gen_tre I suspect that 7c400520
> is
> the next TRE to be used in the TRE ring at this time, as the other
> information shows it would be the oldest entry in that list.  I am
> not
> sure what could have caused this but this is a different case to the
> modem repeating the same TRE in a completion event.
> 


I have considered the code further, and while I have seen cases of
identical TRE completion events occurring, I do not think these result
in the double free case because if the event is actually the same as
the last one then the new dev_rp which parse_xfer_event will attempt to
advance the local_rp to will already be equal to the local_rp and the
whole loop will be skipped in the first place.  The troublesome
behaviour comes from the case I describe above where a jump to a
"future" TRE's completion event seems to occur followed by a
continuation of the order, this results in the tre_ring rp being
advanced to that future TRE and then the next completion event
following the previous ordered pattern would be before that rp location
and the loop will run through the entire tre_ring to reach the new rp
location. 


I do have another question though, the driver code seems to in some
cases take the mhi_chan->lock when updating the doorbell register, but
not when queueing new transfers.  What is the actual purpose of this
lock and why does it seem so inconsistently used?  Is there any chance
that some of my problems may be the result of queueing new transfers
racing somehow with completion event processing?

Thanks,
Paul




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [Linux for Sparc]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux