Hi, On 3/17/23 11:30, Ilpo Järvinen wrote: > Hans de Goede reported Bluetooth adapters (HCIs) connected over an UART > connection failed due corrupted Rx payload. The problem was narrowed > down to DMA Rx starting on UART_IIR_THRI interrupt. The problem occurs > despite LSR having DR bit set, which is precondition for attempting to > start DMA Rx in the first place. > > From a debug patch: > [x.807834] 8250irq: iir=cc lsr+saved=60 received=0/15 ier=0f dma_t/rx/err=0/0/0 > [x.808676] 8250irq: iir=c2 lsr+saved=61 received=0/0 ier=0f dma_t/rx/err=0/0/0 > [x.808776] 8250irq: iir=cc lsr+saved=60 received=1/12 ier=0d dma_t/rx/err=0/1/0 > [x.808870] Bluetooth: hci0: Frame reassembly failed (-84) > > In the debug snippet, received field indicates 1 byte was transferred > over DMA and 12 bytes after that with the non-DMA Rx. The sole byte DMA > handled was corrupted (gets zeroed) which leads to the HCI failure. > > This problem became apparent after commit e8ffbb71f783 ("serial: 8250: > use THRE & __stop_tx also with DMA") changed Tx stop behavior. Tx stop > is now triggered from a THRI interrupt. > > Despite that this problem looks like a HW bug, this fix is not adding > UART_BUG_xx flag to the driver beucase it seems useful in general to > avoid starting DMA when there are only a few bytes to transfer. > Skipping DMA for small transfers avoids the extra overhead DMA incurs. > > Thus, don't setup DMA Rx on UART_IIR_THRI but leave it to a subsequent > interrupt which has Rx a related IIR value. > > By returning false from handle_rx_dma(), the DMA vs non-DMA decision is > postponed until either UART_IIR_RDI (FIFO threshold worth of bytes > awaiting) or UART_IIR_TIMEOUT (inter-character timeout) triggers at a > later time which allows better to discern whether the number of bytes > warrants starting DMA or not. > > Reported-by: Hans de Goede <hdegoede@xxxxxxxxxx> > Tested-by: Hans de Goede <hdegoede@xxxxxxxxxx> > Fixes: e8ffbb71f783 ("serial: 8250: use THRE & __stop_tx also with DMA") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> Thanks, patch looks good to me: Acked-by: Hans de Goede <hdegoede@xxxxxxxxxx> Regards, Hans > --- > drivers/tty/serial/8250/8250_port.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c > index fa43df05342b..3ba9c8b93ae6 100644 > --- a/drivers/tty/serial/8250/8250_port.c > +++ b/drivers/tty/serial/8250/8250_port.c > @@ -1903,6 +1903,17 @@ EXPORT_SYMBOL_GPL(serial8250_modem_status); > static bool handle_rx_dma(struct uart_8250_port *up, unsigned int iir) > { > switch (iir & 0x3f) { > + case UART_IIR_THRI: > + /* > + * Postpone DMA or not decision to IIR_RDI or IIR_RX_TIMEOUT > + * because it's impossible to do an informed decision about > + * that with IIR_THRI. > + * > + * This also fixes one known DMA Rx corruption issue where > + * DR is asserted but DMA Rx only gets a corrupted zero byte > + * (too early DR?). > + */ > + return false; > case UART_IIR_RDI: > if (!up->dma->rx_running) > break;