Hi, On Mon, Dec 14, 2020 at 6:29 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote: > > Here's a shortened version: > > CPU0 CPU1 > ---- ---- > setup_fifo_xfer() > geni_se_setup_m_cmd() > <hardware starts transfer> > <transfer completes in hardware> > <hardware sets M_RX_FIFO_WATERMARK_EN in m_irq> > ... > handle_fifo_timeout() > spin_lock_irq(mas->lock) > mas->cur_xfer = NULL > geni_se_cancel_m_cmd() > spin_unlock_irq(mas->lock) > > geni_spi_isr() > spin_lock(mas->lock) > if (m_irq & M_RX_FIFO_WATERMARK_EN) > geni_spi_handle_rx() > mas->cur_xfer NULL dereference! > > Two CPUs also don't really matter but I guess that's fine. OK, replaced it with your version. > > Specifically it should be noted that the RX/TX interrupts are still > > shown asserted even when a CANCEL/ABORT interrupt has asserted. > > Can we have 'TL;DR: Seriously delayed interrupts for RX/TX can lead to > timeout handling setting mas->cur_xfer to NULL.'? Sure, added this. ...but made the super important change that "tl;dr" is more conventionally lower case. :-P > > Let's check for the NULL transfer in the TX and RX cases. > > and reset the watermark or clear out the fifo respectively to put the > hardware back into a sane state. Sure. > > @@ -396,6 +402,17 @@ static void geni_spi_handle_rx(struct spi_geni_master *mas) > > if (rx_last_byte_valid && rx_last_byte_valid < 4) > > rx_bytes -= bytes_per_fifo_word - rx_last_byte_valid; > > } > > + > > + /* Clear out the FIFO and bail if nowhere to put it */ > > + if (mas->cur_xfer == NULL) { > > I think if (!mas->cur_xfer) is more kernel idiomatic, but sure. I've been yelled at both ways, but changed it to your way here. > > + for (i = 0; i < words; i++) > > while (i++ < DIV_ROUND_UP(rx_bytes, bytes_per_fifo_word)) > readl(se->base + SE_GENI_RX_FIFOn); Sure, that's fine. I was marginally worried that the compiler wouldn't know it could optimize the test and would do the divide every time, but I guess that's pretty dang unlikely and also not a place we really care about optimizing a lot. I'm also not a huge fan of relying on loop counters being initted at the start of the function, but I guess it's OK. Changed to your syntax. -Doug