Re: C_CAN/D_CAN bug and fix

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Joe,

I have some more questions...

Am 20.06.2018 um 19:00 schrieb Joe Burmeister:
> Hi,
> 
> I've bumped into what I think is a chip bug that the C_CAN/D_CAN driver
> isn't handling.
> 
> It can get into a state where the chip status register reports it's bus
> off, but the can driver doesn't know, so the bus never gets restarted.
> 
> Looks like the chip isn't firing the interrupt or is firing with the
> interrupt register as zero. Either is wrong and means "c_can_poll" is
> never called, and thus the driver never picks up the bus off.
> 
> We are turning on/off the can device we are talking to, and we have to
> do this a lot to cause this. But we can get into this state and then the

With on/off you mean "ifconfig up/down"?
Is it always the first bus-off making trouble after you switched on the
device?
Does the "bus-off" condition occur frequently?
May bus-off also occur during the start of the CAN device (ifconfig up)?

> manual fix is to do "ifdown can0 && ifup can0" to sync up the driver and
> the chip. If you don't everything looks fine but nothing you send goes
> out to the bus and you never receive anything.
> 
> When this issue bites, the last messages you see in candump are:
> 
> can0  20000004   [8]  00 04 00 00 00 00 00 79   ERRORFRAME
> can0  20000004   [8]  00 10 00 00 00 00 00 79   ERRORFRAME
> 
> You see this in candump on other iterations of the test, but often see
> the following :
> 
> can0  20000040   [8]  00 00 00 00 00 00 00 00   ERRORFRAME
> can0  20000100   [8]  00 00 00 00 00 00 00 00   ERRORFRAME
> 
> You obviously see a "c_can_platform 481cc000.can can0: bus-off" and
> "c_can_platform 481cc000.can can0: restarted" in dmesg with the above
> can messages. As I understand it, it's the BBB end that is sending these
> two. When you don't see these two following, there isn't a (lasting
> anyway) detected bus off, so the traffic between the device and the BBB
> starts as normal when power comes on.
> 
> What I've done is catch the bus off in "c_can_start_xmit" on a
> "can_send" and if it is an unknown bus off, schedule "c_can_poll" which
> will do what is required. So it self fixes.
> 
> I figured even if it's something odd about the device we are talking to
> causing this, it shouldn't be able to get into this state.
> 
> This was on 4.4 but I see that 4.18 is basically the same code.
> 
> Anyway, this is what we are doing and now I've done due diligence
> passing the information on. :-)
> 
> Patch attached.

OK, an extra napi_schedule() finds the bus-off then. Needs more thoughts...

> 
> Regards,
> 
> Joe
> 
> P.S. Don't know if
> "http://www.keil.com/dd/docs/datashts/silabs/boschcan_ug.pdf"; is an
> acceptable link for the datasheet, but the URL for the datasheet in the
> code is 404'ed.

I also realized some time ago that the link is broken :(. Your link to
Keil looks good, though.

Wolfgang.

--
To unsubscribe from this list: send the line "unsubscribe linux-can" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Automotive Discussions]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [CAN Bus]

  Powered by Linux