On Thu, Jan 7, 2021 at 2:38 AM Mark Brown <broonie@xxxxxxxxxx> wrote: > I wouldn't have thought that would be something in your DT, unless it > was previously buggy and some issue has been exposed, but then nothing > springs to mind as being potentially relevant either. I've started reading spi-bcm2835.c, and while I cannot claim that I understand everything I'm reading, it raises some flags: - it does not use "spi_finalize_current_transfer(...)" at all, but rather "complete(&...->xfer_completion)". The former only calls the latter, so this code seems technically correct, but this looks like an abstraction layer bust. - while it uses "complete(...)" on its IRQ and DMA transfer codepaths, I do not see it being called on its polled codepath (bcm2835_spi_transfer_one_poll). - ...polled codepath which checks bs->rx_len to tell when it's done, independently of transfer direction. And while tx_len and rx_len are initialised to the same value, only the field actually corresponding to the actual transfer direction will be updated within this polling loop. So maybe some transfers could timeout in the polled codepath and would still end up in the IRQ one which would end up calling "complete", but this looks suspicious. Checking on 5.10.4, I see: root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_dma 2 root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_irq 1 root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_polling 27 root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_irq_after_polling 0 so I am apparently not triggering the poll-then-IRQ case, but am triggering the others on this kernel version. On 5.11, this becomes: root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_dma 2 root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_irq 24 root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_polling 0 root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_irq_after_polling 0 so somehow this is not triggering polling transfers anymore, so the maybe-missing completion call would probably not matter. Also, it sets can_dma with a function pointer, but ends up only checking can_dma as a boolean and then calls the function by name rather than using the value stored in can_dma. Again this looks technically correct (and is very much unrelated to my issue) as can_dma does not take any other value and a valid function address would not evaluate as false, but it is surprising to my naive reading. I'll continue poking around later (especially checking computed timeout values), should I submit patches for s/complete/spi_finalize_current_transfer/ ? Regards, -- Vincent Pelletier