On Fri, Nov 13, 2020 at 10:08 AM Vignesh Raghavendra <vigneshr@xxxxxx> wrote:
>
> Does the transfer resume if you manually updated WCNT field to a
> value > 32 using devmem2 when slave appears to be "stuck"?
>

Unfortunately, no, the slave does not become unstuck. I can see that
the WCNT field typically drops back to zero but nothing appears to
happen.

>
> Could you see if below diff helps? This delays enabling of channel
> until TX DMA is queued so that WCNT does not decrement
>
>
> diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
> index d4c9510af393..bf8c6526bcd7 100644
> --- a/drivers/spi/spi-omap2-mcspi.c
> +++ b/drivers/spi/spi-omap2-mcspi.c
> @@ -426,6 +426,8 @@ static void omap2_mcspi_tx_dma(struct spi_device *spi,
>  	}
>  	dma_async_issue_pending(mcspi_dma->dma_tx);
>  	omap2_mcspi_set_dma_req(spi, 0, 1);
> +	if (spi_controller_is_slave(master))
> +		omap2_mcspi_set_enable(spi, 1);
>  }
>
>  static unsigned
> @@ -1194,7 +1196,9 @@ static int omap2_mcspi_transfer_one(struct spi_master *master,
>  	    master->can_dma(master, spi, t))
>  		omap2_mcspi_set_fifo(spi, t, 1);
>
> -	omap2_mcspi_set_enable(spi, 1);
> +	/* For slave TX, enable after DMA is queued */
> +	if (!spi_controller_is_slave(master) || !t->tx_buf)
> +		omap2_mcspi_set_enable(spi, 1);
>
>  	/* RX_ONLY mode needs dummy data in TX reg */
>  	if (t->tx_buf == NULL)

I made this change and initially thought it improved things but as the
clock speed is increased (> 10000 Hz) it reverts to the prior behavior.
I can almost see the SPI slave stutter/stammer along until it finally
stops responding to the SPI master. At this point WCNT == 0 and poking
in a value > 32 has no effect.

From my testing I can definitely see a dramatic decline in performance
(or susceptibility for the SPI slave to get "stuck") as the clock rate
increases.

For my use case, the SPI slave generates telemetry and thus discards
all incoming data from the MOSI pin. Likewise my SPI master is only
interested in the MISO data.
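
For illustration only, a master-side receive-only transfer through the
Linux spidev userspace interface might look roughly like the sketch
below. This is not the actual spi-pipe change. The function names
make_rx_only_xfer() and spi_read_only() are hypothetical, and the
sketch assumes the caller has already opened a spidev device node and
wants 8-bit words.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/spi/spidev.h>

/*
 * Hypothetical helper: describe a receive-only transfer.
 * tx_buf is left at 0 (NULL), so no TX data is supplied and the
 * kernel shifts out zeros on MOSI while MISO is captured into rx.
 */
struct spi_ioc_transfer make_rx_only_xfer(void *rx, uint32_t len,
					  uint32_t speed_hz)
{
	struct spi_ioc_transfer xfer;

	assert(rx != NULL && len > 0);

	memset(&xfer, 0, sizeof(xfer));
	xfer.tx_buf = 0;                /* no TX buffer */
	xfer.rx_buf = (uintptr_t)rx;    /* MISO data lands here */
	xfer.len = len;
	xfer.speed_hz = speed_hz;
	xfer.bits_per_word = 8;         /* assumption: 8-bit words */
	return xfer;
}

/*
 * Hypothetical helper: run one receive-only transfer on an open
 * spidev file descriptor. Returns the ioctl() result, which is
 * negative on error.
 */
int spi_read_only(int fd, void *rx, uint32_t len, uint32_t speed_hz)
{
	struct spi_ioc_transfer xfer = make_rx_only_xfer(rx, len, speed_hz);

	return ioctl(fd, SPI_IOC_MESSAGE(1), &xfer);
}
```

A caller would open the spidev node for the bus in question, pass the
resulting descriptor to spi_read_only() in a loop, and treat a negative
return value as a failed transfer.
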
I hacked up the spi-pipe application to provide no TX buffer for the
SPI master (thus clocking out zeros to the SPI slave). Likewise, for
the SPI slave, the spi-pipe program does not provide an RX buffer,
since the clock is merely being used to clock "out" telemetry data to
the SPI master. Making this change offers some improvement but it
usually doesn't last long.

It certainly seems there are one or more race conditions. Very rarely,
a test will run indefinitely, but in general it's not repeatable. It
seems that the SPI slave must be able to atomically submit a TX buffer
in order for this to work. Given your patch, it appears it must be
difficult to thread in successive TX buffers, since the DMA must be
scheduled and WCNT set appropriately in order to clock out data from
the slave. Of course, this all has to happen while the SPI master
randomly clocks data to/from the slave. Perhaps if I had a better
understanding of the normal program flow I could see this more clearly.

Do you have any additional suggestions I could investigate? Certainly,
this problem is easy to recreate with two development boards (or
BeagleBone Blacks in my case). Have you ever encountered it in your
testing?

Thanks,
Glenn