Re: Possible regression in 8250_dw driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 







Richard Tresidder








**






On 24/05/2023 6:06 pm, Ilpo Järvinen wrote:

On Wed, 24 May 2023, Richard Tresidder wrote:
On 23/05/2023 7:13 pm, Ilpo Järvinen wrote:
On Tue, 23 May 2023, Richard Tresidder wrote:

background:
I'm attempting to use 6.3.3 as a new base for one of our systems.
Previously it was using 5.1.7 as a base.
The uart in question is one of the two in the Cyclone V SOC HPS.
And to muddy the waters the linux console TTYS0 is the other Uart from the
same HPS core
Yet the console appears to be working ok.
Maybe some of the DMA related changes triggering some unexpected behavior.

Console doesn't use DMA so that could explain the difference.

Note all other libs and apps are at the same revision and build, it is
only
the kernel that is different.
Both versions of the kernel are also built using the same bitbake bsdk..


As you can see it looks like the frame thats received on the 6.3.3 kernel
is
mangled?
This same message is just being requested over and over again from the GPS
unit.

The offset where the tears occur looks to be pretty similar between each
poll
request.
Usually the 1 at the end of the (75331 is where the first tear occurs.

I'd appreciate some quidance in how to track this down as there appears to
have been a reasonable amount of work done to this driver and the serial
core
between these two versions.
A few ideas:
- try without dma_rx_complete() calling p->dma->rx_dma(p)
- revert 90b8596ac46043e4a782d9111f5b285251b13756
- Try the revert in
https://lore.kernel.org/all/316ab583-d217-a332-d161-8225b0cee227@xxxxxxxxxx/2-0001-Revert-serial-8250-use-THRE-__stop_tx-also-with-DMA.patch
    (for e8ffbb71f783 and f8d6e9d3ca5c)

But finding the culprit with git bisect would be the most helpful here.

Bisect wasn't an easy option as I'd applied a pile of patches after the
interesting commits for our system to run.
I'm not a git master :)

So I just started reverting the patches that had been applied to the 8250
folder.
Worked backwards from head.

After reverting 57e9af7831dcf211c5c689c2a6f209f4abdf0bce
serial: 8250_dma: Fix DMA Rx rearm race

Things started to work again.

I reset everything and then just reverted that individual patch and things
work.
So that looks like the culprit..
Okay, that also worked great, thanks a lot. It gives something concrete to
work with to find a fix.

The commit itself looks very much correct, however, my guess is that when
serial8250_rx_dma_flush() does dmaengine_pause() dma_status somehow
does not become DMA_PAUSED which leads to not properly flushing DMA Rx
transaction. Can you try the following debug patch (if my guess is
correct, besides triggering the WARN_ON_ONCE, it also works around the
issue):

[PATCH] DEBUG: 8250_dma: Detect DMA_PAUSED vs DMA_IN_PROGRESS inconsistency

Detect DMA state not returning DMA_PAUSED after dmaengine_pause() was
issued.

Not intended for upstream.

Thanks Ilpo
   I can confirm that the patch below works. Got the WARN_ON_ONCE dump so it's taking that path.
I think the problem here is that the pl330 DMA controller thats in these SOC's doesn't "really" support pause according to the driver.
It doesn't look like it can ever set "DMA_PAUSED"
I'm not sure of the flow through of that though.
There is some commenting in the pl330 driver about the pause routine.

Cheers
   Richard Tresidder

---
  drivers/tty/serial/8250/8250_dma.c | 14 +++++++++-----
  1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_dma.c b/drivers/tty/serial/8250/8250_dma.c
index 7fa66501792d..3dedacd9f104 100644
--- a/drivers/tty/serial/8250/8250_dma.c
+++ b/drivers/tty/serial/8250/8250_dma.c
@@ -38,7 +38,7 @@ static void __dma_tx_complete(void *param)
  	spin_unlock_irqrestore(&p->port.lock, flags);
  }
-static void __dma_rx_complete(struct uart_8250_port *p)
+static void __dma_rx_complete(struct uart_8250_port *p, bool paused)
  {
  	struct uart_8250_dma	*dma = p->dma;
  	struct tty_port		*tty_port = &p->port.state->port;
@@ -52,8 +52,12 @@ static void __dma_rx_complete(struct uart_8250_port *p)
  	 * anything in such case.
  	 */
  	dma_status = dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
-	if (dma_status == DMA_IN_PROGRESS)
-		return;
+	if (dma_status == DMA_IN_PROGRESS) {
+		if (!paused)
+			return;
+
+		WARN_ON_ONCE(paused);
+	}
count = dma->rx_size - state.residue; @@ -72,7 +76,7 @@ static void dma_rx_complete(void *param) spin_lock_irqsave(&p->port.lock, flags);
  	if (dma->rx_running)
-		__dma_rx_complete(p);
+		__dma_rx_complete(p, false);
/*
  	 * Cannot be combined with the previous check because __dma_rx_complete()
@@ -172,7 +176,7 @@ void serial8250_rx_dma_flush(struct uart_8250_port *p)
if (dma->rx_running) {
  		dmaengine_pause(dma->rxchan);
-		__dma_rx_complete(p);
+		__dma_rx_complete(p, true);
  		dmaengine_terminate_async(dma->rxchan);
  	}
  }




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux