Re: 5.11.0-rc1+: "Division by zero in kernel." when writing to spidev

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 07, 2021 at 09:57:01AM +0900, Vincent Pelletier wrote:

> I've started reading spi-bcm2835.c, and while I cannot claim that I understand
> everything I'm reading, it raises some flags:

Copying in a bunch of people for that driver.

> - it does not use "spi_finalize_current_transfer(...)" at all, but rather
>   "complete(&...->xfer_completion)". The former only calls the latter,
>   so this code seems technically correct, but this looks like an
>   abstraction layer bust.

Yes.

> - while it uses "complete(...)" on its IRQ and DMA transfer codepaths,
>   I do not see it being called on its polled codepath
>   (bcm2835_spi_transfer_one_poll).

I'd not expect this to be required with a polled path, it's only needed
if the transfer function returns a positive value indicating that the
transfer is still in progress which shouldn't be the case when things
are polled.

> - ...polled codepath which checks bs->rx_len to tell when it's done,
>   independently of transfer direction. And while tx_len and rx_len are
> initialised
>   to the same value, only the field actually corresponding to the
> actual transfer
>   direction will be updated within this polling loop.
>   So maybe some transfers could timeout in the polled codepath and would
>   still end up in the IRQ one which would end up calling "complete", but this
>   looks suspicious.
> 
> Checking on 5.10.4, I see:
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_dma
> 2
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_irq
> 1
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat
> count_transfer_polling
> 27
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat
> count_transfer_irq_after_polling
> 0
> 
> so I am apparently not triggering the poll-then-IRQ case, but am
> triggering the others
> on this kernel version.
> 
> On 5.11, this becomes:
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_dma
> 2
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat count_transfer_irq
> 24
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat
> count_transfer_polling
> 0
> root@sushi:/sys/kernel/debug/spi-bcm2835-20204000.spi# cat
> count_transfer_irq_after_polling
> 0
> 
> so somehow this is not triggering polling transfers anymore, so the
> maybe-missing
> completion call would probably not matter.
> 
> Also, it sets can_dma with a function pointer, but ends up only
> checking can_dma as
> a boolean and then calls the function by name rather than using the
> value stored in
> can_dma. Again this looks technically correct (and is very much
> unrelated to my issue)
> as can_dma does not take any other value and a valid function address would not
> evaluate as false, but it is surprising to my naive reading.
> 
> I'll continue poking around later (especially checking computed timeout values),
> should I submit patches for s/complete/spi_finalize_current_transfer/ ?

Probably send the completion patch, yes.

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux