Re: Possible regression in 8250_dw driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 26 May 2023, Richard Tresidder wrote:

> On 25/05/2023 5:38 pm, Ilpo Järvinen wrote:
> 
> > On Thu, 25 May 2023, Richard Tresidder wrote:
> > > On 25/05/2023 4:43 pm, Ilpo Järvinen wrote:
> > > > On Thu, 25 May 2023, Richard Tresidder wrote:
> > > > > On 24/05/2023 6:06 pm, Ilpo Järvinen wrote:
> > > > > > On Wed, 24 May 2023, Richard Tresidder wrote:
> > > > > > > So I just started reverting the patches that had been applied to
> > > > > > > the
> > > > > > > 8250
> > > > > > > folder.
> > > > > > > Worked backwards from head.
> > > > > > > 
> > > > > > > After reverting 57e9af7831dcf211c5c689c2a6f209f4abdf0bce
> > > > > > > serial: 8250_dma: Fix DMA Rx rearm race
> > > > > > > 
> > > > > > > Things started to work again.
> > > > > > > 
> > > > > > > I reset everything and then just reverted that individual patch
> > > > > > > and
> > > > > > > things
> > > > > > > work.
> > > > > > > So that looks like the culprit..
> > > > > > Okay, that also worked great, thanks a lot. It gives something
> > > > > > concrete
> > > > > > to
> > > > > > work with to find a fix.
> > > > > > 
> > > > > > The commit itself looks very much correct, however, my guess is that
> > > > > > when
> > > > > > serial8250_rx_dma_flush() does dmaengine_pause() dma_status somehow
> > > > > > does not become DMA_PAUSED which leads to not properly flushing DMA
> > > > > > Rx
> > > > > > transaction. Can you try the following debug patch (if my guess is
> > > > > > correct, besides triggering the WARN_ON_ONCE, it also works around
> > > > > > the
> > > > > > issue):
> > > > > > 
> > > > > > [PATCH] DEBUG: 8250_dma: Detect DMA_PAUSED vs DMA_IN_PROGRESS
> > > > > > inconsistency
> > > > > > 
> > > > > > Detect DMA state not returning DMA_PAUSED after dmaengine_pause()
> > > > > > was
> > > > > > issued.
> > > > > > 
> > > > > > Not intended for upstream.
> > > > > Thanks Ilpo
> > > > >      I can confirm that the patch below works. Got the WARN_ON_ONCE
> > > > > dump so
> > > > > it's
> > > > > taking that path.
> > > > > I think the problem here is that the pl330 DMA controller thats in
> > > > > these
> > > > > SOC's
> > > > > doesn't "really" support pause according to the driver.
> > > > For this flush usecase, it is enough to support pause + read residue +
> > > > terminate which is supported according to the function comment for
> > > > pl330_pause().
> > > Yep agree given the 8250 serial code immediately terminates the dma after
> > > the
> > > pause during the flush anyway..
> > > 
> > > > > It doesn't look like it can ever set "DMA_PAUSED"
> > > > Why not? It pauses the transfer and is even able to allow reading the
> > > > transferred byte count.
> > > > 
> > > > What it is claimed to not be able to do is resume. Note that w/o resume
> > > > serial8250_tx_dma() XON/XOFF processing will not work but that's
> > > > unrelated to this issue (I'll probably need to do another patch to fix
> > > > that on 8250 DMA side).
> > > Yep I just meant it doesn't look capable of reporting DMA_PAUSED
> > > Which your patch below probably fixes.
> > > and it kind of does a stop without a resume capability so must be
> > > terminated
> > > after this.
> > > Though I'm not sure of the implications of reporting paused without resume
> > > capability.
> > > Kind of an odd one..
> > dma_slave_caps has a bool for both pause and resume support (I only found
> > out about the second flag for resume today) so not unheard of it seems.
> 
> I've just tested the pl330 patch and it looks good.
> No longer get the WARN_ON_ONCE triggered in the 8250_dma driver so it looks
> like
> Paused is correctly being set now and we're avoiding that path.

Thanks a lot for testing the fix and also for your invaluable help in 
narrowing down the problem. It would have been very hard to find the 
culprit otherwise.

Can I add your Tested-by tag before sending the official version of the 
fix out?

> Wonder how many others dma drivers are afflicted in the same manner?
> A quick grep seems to infer a lot more drivers set the pause function then
> have
> something in them setting the status to DMA_PAUSED so probably good to keep
> the
> 8250 fix in also for the time being.

There might be others, but then the patch which covered the issue has gone
to stables already months ago and you're the first to report it AFAIK so 
perhaps not that wide-spread.

I'll think how to deal with this on 8250_dma side.


-- 
 i.

[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux