On Mon, 2020-11-09 at 22:59 +0300, Serge Semin wrote: > On Sat, Nov 07, 2020 at 05:13:52PM +0900, Damien Le Moal wrote: > > With boards that have slow interrupts context switch, and a fast device > > connected to a spi master, e.g. an SD card through mmc-spi, > > > using > > dw_spi_poll_transfer() intead of the regular interrupt based > > dw_spi_transfer_handler() function is more efficient and > > I can believe in that. But the next part seems questionable: > > > can avoid a lot > > of RX FIFO overflow errors while keeping the device SPI frequency > > reasonnably high (for speed). > > No matter whether it's an IRQ-based or poll-based transfer, as long as a > client SPI-device is connected with a GPIO-based chip-select (or the > DW APB SSI-controller feature of the automatic chip-select toggling is > fixed), the Rx FIFO should never overrun. It's ensured by the transfer > algorithm design by calculating the rxtx_gap in the dw_writer() > method. If the error still happens then there must be some bug in > the code. > > It's also strange to hear that the polling-based transfer helps > to avoid the Rx FIFO overflow errors, while the IRQ-based transfer > causes them. Both of those methods are based on the same dw_writer() > and dw_reader() methods. So basically they both should either work > well or cause the errors at same time. > > So to speak could you more through debug your case? I did. And I have much better results now. Let me explain: 1) The device tree was setting up the SPI controller using the controller internal chip select, not a GPIO-based chip select. Until now, I could never get the GPIO-based chip select to work. I finally found out why: I simply needed to add the "spi-cs-high" property to the mmc-slot node. With that, the CS gpio is correctly driven active-high instead of the default active-low and the SD card works. 2) With this change using the GPIO-based CS, the patch "spi: dw: Fix driving MOSI low while receiving" became completely unnecessary. The SD card works without it. Now for testing, I also removed this polling change. Results are these: 1) With the same SPI frequency as before (4MHz), I can run the SD card at about 300 KB/s (read) but I am still seeing some RX FIFO overflow errors. Not a lot, but enough to be annoying, especially on boot as the partition scan sometimes fails because of these errors. In most cases, the block layer retry of failed read/writes cover and no bad errors happen, but the RX FIFO overflow error messages still pop up. 2) Looking into the code further, I realized that RXFLTR is set to half the fifo size minus 1. That sound reasonable, but as that delays interrupt generation until the RX fifo is almost full, I decided to try a value of 0 to get the interrupt as soon as data is available rather than waiting for a chunk. With that, all RX FIFO overflow errors go away, and I could even double the SPI frequency to 8MHz, getting a solid 650KB/s from the SD card without any error at all. My take: * This controller internal CS seems to be totally broken. * This SoC has really slow interrupts, so generating these earlier rather than later gives time to the IRQ handler to kick in before the FIFO overflows. In the V2 series for SPI DW, I added a DW_SPI_CAP_RXFLTR_CLEAR capability flag to set RXFLTR to 0, always. That works well, but this is may be still similar to the "polling" hack in the sense that it is tuning for this SoC rather than a property of the controller. But I do not see any other simple way of removing these annoying RX FIFO overflow errors. > On the other hand the errors (but not the Rx FIFO overflow) might be > caused by the DW APB SSI nasty feature of the native chip-select > automatic assertion/de-assertion. So if your MMC-spi port is handled > by the native DW APB SSI CS lane, then it won't work well for sure. > No matter whether you use the poll- or IRQ-based transfers. Indeed. The GPIO-based CS does behave much more nicely, and it does not require that "drive MOSI line high" hack. But for reasons that I still do not clearly understand, occasional RX FIFO overflow errors still show up. Thanks for all the useful comments ! -- Damien Le Moal Western Digital