On 11.01.2023 09:17:25, Marc Kleine-Budde wrote:
> On 10.01.2023 23:43:02, Marc Kleine-Budde wrote:
> > On 10.01.2023 23:37:44, Marc Kleine-Budde wrote:
> > > On 10.01.2023 22:50:33, Marc Kleine-Budde wrote:
> > > > On 10.01.2023 21:40:16, Thomas.Kopp@xxxxxxxxxxxxx wrote:
> > > > > > The correct message counter is 0x100, the wrong one 0x120. That's 2x
> > > > > > FIFO size. I'd like to know when the FIFO head is wrong for the first
> > > > > > time, one that results in a data transfer where "old" FIFO contents is
> > > > > > read. I haven't dumped any data yet.
> > > >
> > > > I got a chip-delta == 4 error.
> > >
> > > I have a proof of concept workaround implemented and I'll let it run
> > > over night.
> >
> > \o/ The workaround triggered and Stefan's test program continued without
> > problems \o/
>
> Still running. The workaround triggered more than 45x this night.

I think the same problem occurs with the TEF, too. It's easier to detect
there, as the TEF has sequence numbers. The driver already implements a
workaround: it limits the TEF head to the TX head [1], it refuses to
handle TEF objects with wrong sequence numbers [2], and it UINCs only the
correctly handled ones. However, it doesn't roll back the "wrong"
internal TEF head. The problem occurs (on Linux) only during high TX load
situations, so eventually there will be enough finished TX frames and the
internal TEF head will be correct again.

regards,
Marc

[1] https://elixir.bootlin.com/linux/v6.1.4/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c#L141
[2] https://elixir.bootlin.com/linux/v6.1.4/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c#L106

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
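
For illustration only, here is a rough sketch of the TEF handling described above. It is not the code from mcp251xfd-tef.c: the names (tef_obj, tef_clamp_head, tef_handle, SEQ_MASK) are made up for this mail, and it assumes free-running 32-bit head/tail counters and an 8-bit sequence number, which may not match the real driver or chip configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: width of the sequence number field, chosen for illustration. */
#define SEQ_MASK 0xffu

/* Hypothetical TEF object, reduced to the sequence number. */
struct tef_obj {
	uint8_t seq;
};

/*
 * Limit the TEF head reported by the chip to the TX head: the chip
 * cannot have completed more frames than have been queued for TX.
 * All three values are assumed to be free-running counters.
 */
static uint32_t tef_clamp_head(uint32_t chip_tef_head, uint32_t tx_head,
			       uint32_t tef_tail)
{
	uint32_t avail = chip_tef_head - tef_tail;
	uint32_t max = tx_head - tef_tail;

	return tef_tail + (avail < max ? avail : max);
}

/*
 * Handle TEF objects in order and stop at the first one whose
 * sequence number is not the expected one. Returns the number of
 * objects handled; only that many would be acknowledged via UINC.
 */
static size_t tef_handle(const struct tef_obj *objs, size_t len,
			 uint32_t *tef_tail)
{
	size_t handled = 0;

	assert(objs != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		uint8_t expected = (uint8_t)(*tef_tail & SEQ_MASK);

		if (objs[i].seq != expected)
			break;	/* stale or not yet written object */

		(*tef_tail)++;
		handled++;
	}

	return handled;
}
```

Note that, as in the description above, nothing here rolls back a wrong head. The sketch only avoids consuming objects that fail the sequence check and leaves them for a later pass.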