Re: [PATCH 1/1] tty: serial: fsl_lpuart: optimize the timer based EOP logic

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2 May 2023, Shenwei Wang wrote:

> At low baud rates, the DMA transfer may end prematurely due to the timer,
> even during an active transfer. This does not accurately simulate an EOP
> event as intended. We expect the timer to only complete a DMA transfer
> once the idle period satisfies a specified interval.
> 
> The patch checks the DMA residue count before copying data to the TTY
> buffer. If the residue count remains unchanged since the last interrupt,
> that indicates no new data was received. In this case, the DMA should
> complete as an EOP event. Instead, the timer restarts.

This description is lacking something. It does not explain why the stuff 
in second paragraph is necessary at all as setting a longer timer based on
the (lower) baud rate would avoid the need to do the timer restart.

> Signed-off-by: Shenwei Wang <shenwei.wang@xxxxxxx>
> ---
>  drivers/tty/serial/fsl_lpuart.c | 52 ++++++++++++++++++++++++++++++---
>  1 file changed, 48 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/tty/serial/fsl_lpuart.c b/drivers/tty/serial/fsl_lpuart.c
> index c91916e13648..8d21351fb3bd 100644
> --- a/drivers/tty/serial/fsl_lpuart.c
> +++ b/drivers/tty/serial/fsl_lpuart.c
> @@ -238,6 +238,7 @@
>  
>  /* Rx DMA timeout in ms, which is used to calculate Rx ring buffer size */
>  #define DMA_RX_TIMEOUT		(10)
> +#define DMA_RX_IDLE_CHARS	(8)
>  #define UART_AUTOSUSPEND_TIMEOUT	3000
>  
>  #define DRIVER_NAME	"fsl-lpuart"
> @@ -282,6 +283,7 @@ struct lpuart_port {
>  	struct scatterlist	rx_sgl, tx_sgl[2];
>  	struct circ_buf		rx_ring;
>  	int			rx_dma_rng_buf_len;
> +	int                     last_residue;
>  	unsigned int		dma_tx_nents;
>  	wait_queue_head_t	dma_wait;
>  	bool			is_cs7; /* Set to true when character size is 7 */
> @@ -331,7 +333,7 @@ static struct lpuart_soc_data imx8qxp_data = {
>  	.devtype = IMX8QXP_LPUART,
>  	.iotype = UPIO_MEM32,
>  	.reg_off = IMX_REG_OFF,
> -	.rx_watermark = 31,
> +	.rx_watermark = 8, /* A lower watermark is ideal for low baud rates. */
>  };
>  static struct lpuart_soc_data imxrt1050_data = {
>  	.devtype = IMXRT1050_LPUART,
> @@ -1255,6 +1257,8 @@ static void lpuart_copy_rx_to_tty(struct lpuart_port *sport)
>  		sport->port.icount.rx += copied;
>  	}
>  
> +	sport->last_residue = state.residue;
> +
>  exit:
>  	dma_sync_sg_for_device(chan->device->dev, &sport->rx_sgl, 1,
>  			       DMA_FROM_DEVICE);
> @@ -1272,11 +1276,40 @@ static void lpuart_dma_rx_complete(void *arg)
>  	lpuart_copy_rx_to_tty(sport);
>  }
>  
> +/*
> + * Timer function to simulate the hardware EOP(End Of Package) event.

Missing space

> + * The timer callback is to check for new RX data and copy to TTY buffer.
> + * If no new data since last interrupt, restart timer. Otherwise, copy data
> + * and continue normal logic.
> + */
>  static void lpuart_timer_func(struct timer_list *t)
>  {
>  	struct lpuart_port *sport = from_timer(sport, t, lpuart_timer);
> +	struct dma_chan *chan = sport->dma_rx_chan;
> +	struct circ_buf *ring = &sport->rx_ring;
> +	struct dma_tx_state state;
> +	unsigned long flags;
> +	int count = 0;
>  
> -	lpuart_copy_rx_to_tty(sport);
> +	dmaengine_tx_status(chan, sport->dma_rx_cookie, &state);

> +	ring->head = sport->rx_sgl.length - state.residue;
> +
> +	if (ring->head < ring->tail)
> +		count = sport->rx_sgl.length - ring->tail;
> +	else if (ring->tail < ring->head)
> +		count = ring->head - ring->tail;

linux/circ_buf.h has functions which likely handle what you want to do 
here. They will get you true count across wrap too which this above does 
not do.

Given this is essentially duplicates count calculation some refactor 
would seem more useful here rather than recalculating the count again in 
lpuart_copy_rx_to_tty().

Also lpuart_handle_sysrq() duplicates the same calculations.

> +
> +	/* Check if new data received before copying */
> +	if ((count != 0) && (sport->last_residue == state.residue))

I'm unsure about this condition being right.

What will happen when rx_sgl.length (or -1 of that, I'm not sure which way 
"full size" is here) worth of data has been DMA'ed. Does this condition 
end up delaying copy such that it's done only on every other call here?

Also, should you reset last_residue in lpuart_start_rx_dma() ? I think 
that would solve the "full size" problem.

> +		lpuart_copy_rx_to_tty(sport);
> +	else
> +		mod_timer(&sport->lpuart_timer,
> +				jiffies + sport->dma_rx_timeout);
> +
> +	if (spin_trylock_irqsave(&sport->port.lock, flags)) {
> +		sport->last_residue = state.residue;
> +		spin_unlock_irqrestore(&sport->port.lock, flags);
> +	}
>  }
>  
>  static inline int lpuart_start_rx_dma(struct lpuart_port *sport)
> @@ -1297,9 +1330,19 @@ static inline int lpuart_start_rx_dma(struct lpuart_port *sport)
>  	 */
>  	sport->rx_dma_rng_buf_len = (DMA_RX_TIMEOUT * baud /  bits / 1000) * 2;
>  	sport->rx_dma_rng_buf_len = (1 << fls(sport->rx_dma_rng_buf_len));
> +	if (sport->rx_dma_rng_buf_len < sport->rxfifo_size * 2)
> +		sport->rx_dma_rng_buf_len = sport->rxfifo_size * 2;

max_t()

> +
> +	/*
> +	 * Keep this condition check in case rxfifo_size is unavailable
> +	 * for some SoCs.
> +	 */
>  	if (sport->rx_dma_rng_buf_len < 16)
>  		sport->rx_dma_rng_buf_len = 16;
>  
> +	sport->dma_rx_timeout =
> +		msecs_to_jiffies((1000 * 10 * DMA_RX_IDLE_CHARS) / baud + 1);

There's ->frame_time these days in uart_port which you should base frame 
timing related calculations. I wouldn't mind if that existing ->frame_time 
math that is visible in your patch's context is also converted (in a 
separate patch).

I'm assuming that magic 10 is assumed number of bits and 1000 
MSEC_PER_SEC. That +1 seems odd, did you mean DIV_ROUND_UP() ?

> +
>  	ring->buf = kzalloc(sport->rx_dma_rng_buf_len, GFP_ATOMIC);
>  	if (!ring->buf)
>  		return -ENOMEM;
> @@ -1687,12 +1730,13 @@ static void lpuart_rx_dma_startup(struct lpuart_port *sport)
>  	if (!sport->dma_rx_chan)
>  		goto err;
>  
> +	/* set default Rx DMA timeout */
> +	sport->dma_rx_timeout = msecs_to_jiffies(DMA_RX_TIMEOUT);
> +
>  	ret = lpuart_start_rx_dma(sport);
>  	if (ret)
>  		goto err;
>  
> -	/* set Rx DMA timeout */
> -	sport->dma_rx_timeout = msecs_to_jiffies(DMA_RX_TIMEOUT);
>  	if (!sport->dma_rx_timeout)
>  		sport->dma_rx_timeout = 1;
>  
> 

-- 
 i.




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux PPP]     [Linux FS]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Linmodem]     [Device Mapper]     [Linux Kernel for ARM]

  Powered by Linux