On 01.07.2022 09:46:37, Wouter van Herpen wrote: > m_can_tx_handler() initiates a write by calling can_put_echo_skb() and > m_can_write() to M_CAN_TXBAR. > After that, netif_stop_queue is called depending on the FIFO status. > > Observed with a TCAN45 controller and under high CPU load, the TCAN45 > can already generate an interrupt after the m_can_write to M_CAN_TXBAR, > but before netif_stop_queue is executed. > The m_can_isr() is then executed (performing a netif_wake_queue) before > the netif_stop_queue is executed, leading to a blocking socket. > > Fix this for TX FIFO size 1, where the queue can always be stopped > before initiating a transfer. > > Signed-off-by: Wouter van Herpen <wouter.van.herpen@xxxxxxxxxxxxxxxxxxxxxxxxx> > --- > Possible improvement to this patch would be to predict if the next write > would fill the FIFO, and if so, stop the queue before initiating the > transfer. Then the race condition would also be fixed for TX buffer sizes > larger than 1. However, I do not recognize a proper diagnostic register > in the TCAN45 for that purpose. > > Alternatively a locking mechanism could be introduced, which I did not > investigate further as there can be sleeps involved in the SPI writes. Look on how the mcp251xfd does this: https://elixir.bootlin.com/linux/latest/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c#L178 https://elixir.bootlin.com/linux/latest/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-tx.c#L187 https://elixir.bootlin.com/linux/latest/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c#L251 regards, Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Attachment:
signature.asc
Description: PGP signature