Re: [PATCH RFC] dmaengine: dw: Prevent tx-status calling desc callback (Fix UART deadlock!)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi folks

Any comments or suggestion about the change? The kernel occasionally
_deadlocks_ without it for the DW UART + DW DMAC hardware setup.

-Serge(y)

On Fri, Aug 02, 2024 at 11:07:51AM +0300, Serge Semin wrote:
> The dmaengine_tx_status() method updating the DMA-descriptors state and
> eventually calling the Tx-descriptors callback may potentially cause
> problems. In particular the deadlock was discovered in DW UART 8250 device
> interacting with DW DMA controller channels. The call-trace causing the
> deadlock is:
> 
> serial8250_handle_irq()
>   uart_port_lock_irqsave(port); ----------------------+
>   handle_rx_dma()                                     |
>     serial8250_rx_dma_flush()                         |
>       __dma_rx_complete()                             |
>         dmaengine_tx_status()                         |
>           dwc_scan_descriptors()                      |
>             dwc_complete_all()                        |
>               dwc_descriptor_complete()               |
>                 dmaengine_desc_callback_invoke()      |
>                   cb->callback(cb->callback_param);   |
>                   ||                                  |
>                   dma_rx_complete();                  |
>                     uart_port_lock_irqsave(port); ----+ <- Deadlock!
> 
> So in case if the DMA-engine finished working at some point before the
> serial8250_rx_dma_flush() invocation and the respective tasklet hasn't
> been executed yet, then calling the dmaengine_tx_status() will cause the
> DMA-descriptors status update and the Tx-descriptor callback invocation.
> Generalizing the case up: if the dmaengine_tx_status() method callee and
> the Tx-descriptor callback refer to the related critical section, then
> calling dmaengine_tx_status() from the Tx-descriptor callback will
> inevitably cause a deadlock around the guarding lock as it happens in the
> Serial 8250 DMA implementation above. (Note the deadlock doesn't happen
> very often, but can be eventually discovered if the being received data
> size is greater than the Rx DMA-buffer size defined in the 8250_dma.c
> driver. In my case reducing the Rx DMA-buffer size increased the deadlock
> probability.)
> 
> The easiest way to fix the deadlock was to just remove the Tx-descriptors
> state update from the DW DMA-engine Tx-descriptor status method
> implementation, as the most of the DMA-engine drivers imply. After this
> fix is applied the Tx-descriptors status will be only updated in the
> framework of the dwc_scan_descriptors() method called from the tasklet
> handling the deferred DMA-controller IRQ.
> 
> Signed-off-by: Serge Semin <fancer.lancer@xxxxxxxxx
> 
> ---
> 
> Note I have doubts whether it's the best possible solution of the problem
> since the client-driver deadlock is resolved here by fixing the DMA-engine
> provider code. But I failed to find any reasonable solution in the 8250
> DMA implementation.
> 
> Moreover the suggested fix cause a weird outcome - under the high-speed
> and heavy serial transfers the next error is printed to the log sometimes:
> 
> > dma dma0chan0: BUG: XFER bit set, but channel not idle!
> 
> That's why the patch submitted as RFC. Do you have any better idea in mind
> to prevent the nested lock?
> 
> Cc: Viresh Kumar <vireshk@xxxxxxxxxx>
> Cc: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> CC: Andy Shevchenko <andy@xxxxxxxxxx>
> Cc: Vinod Koul <vkoul@xxxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: Jiri Slaby <jirislaby@xxxxxxxxxx>
> Cc: "Ilpo Järvinen" <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> Cc: dmaengine@xxxxxxxxxxxxxxx
> Cc: linux-serial@xxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> ---
>  drivers/dma/dw/core.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
> index 5f7d690e3dba..4b3402156eae 100644
> --- a/drivers/dma/dw/core.c
> +++ b/drivers/dma/dw/core.c
> @@ -925,12 +925,6 @@ dwc_tx_status(struct dma_chan *chan,
>  	struct dw_dma_chan	*dwc = to_dw_dma_chan(chan);
>  	enum dma_status		ret;
>  
> -	ret = dma_cookie_status(chan, cookie, txstate);
> -	if (ret == DMA_COMPLETE)
> -		return ret;
> -
> -	dwc_scan_descriptors(to_dw_dma(chan->device), dwc);
> -
>  	ret = dma_cookie_status(chan, cookie, txstate);
>  	if (ret == DMA_COMPLETE)
>  		return ret;
> -- 
> 2.43.0
> 




[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux PCI]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux