Re: C_CAN/D_CAN bug and fix

Hello Joe,

On 21.06.2018 at 11:21, Joe Burmeister wrote:
> On 21/06/18 10:07, Wolfgang Grandegger wrote:
>> Hello Joe,
>>
>> On 21.06.2018 at 10:57, Joe Burmeister wrote:
>>> Hi Wolfgang,
>>>
>>> On 21/06/18 09:25, Wolfgang Grandegger wrote:
>>>> Hello Joe,
>>>>
>>>> On 21.06.2018 at 09:55, Joe Burmeister wrote:
>>>>> Hi Wolfgang
>>>>>
>>>>>
>>>>> On 21/06/18 08:24, Wolfgang Grandegger wrote:
>>>>>> Hello Joe,
>>>>>>
>>>>>> I have some more questions...
>>>>>>
>>>>>> On 20.06.2018 at 19:00, Joe Burmeister wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've bumped into what I think is a chip bug that the C_CAN/D_CAN driver
>>>>>>> isn't handling.
>>>>>>>
>>>>>>> It can get into a state where the chip's status register reports
>>>>>>> bus-off, but the CAN driver doesn't know, so the bus never gets restarted.
>>>>>>>
>>>>>>> It looks like the chip either isn't firing the interrupt or is firing it
>>>>>>> with the interrupt register reading zero. Either is wrong and means
>>>>>>> "c_can_poll" is never called, and thus the driver never picks up the bus-off.
>>>>>>>
>>>>>>> We are turning on/off the CAN device we are talking to, and we have to
>>>>>>> do this a lot to cause this. But we can get into this state and then the
>>>>>> By on/off, do you mean "ifconfig up/down"?
>>>>> No, literally power on and power off to the device we are talking to
>>>>> over CAN.
>>>>> Its power is controlled by a GPIO line on the BBB, and part of the
>>>>> normal operation is to turn it on and off.
>>>>> But in the test, we do that a lot to reproduce this bug, which we
>>>>> otherwise only saw once in a blue moon.
>>>>>
>>>>>> Is it always the first bus-off making trouble after you switched on the
>>>>>> device?
>>>>> No, even in the test, most of the time, the test iteration completes
>>>>> without issue.
>>>>>
>>>>>> Does the "bus-off" condition occur frequently?
>>>>> Even with the test, in which an iteration lasts about 30 seconds, it can
>>>>> take over 5 minutes.
>>>> I mean: do bus-off conditions occur frequently on the bus? At what rate?
>>> It's when the power is going on or off to the device. We have some
>>> contactors to some big power, which probably introduce a fair amount of
>>> noise on connect/disconnect, causing CAN errors. The device we are
>>> talking to gets power from the same circuit. Though it's fine once up,
>>> it is born and dies in a hellfire of noise. But CAN should be OK with that.
>> So you have a bus error storm when the device is switched on (and off).
>> I suspect that the problem occurs while initializing the CAN device.
>>
>> [...snip...]
> 
> The device is designed for this harsh environment (not by us), though it
> too is still in development.
> CAN should be pretty fault tolerant, and it's the c_can/d_can driver that
> has the issue. It's out of sync with its chip's status.
> When I get time on the setup, I'll see if that interrupt change stops
> that happening.

I didn't say that it's your fault ;). I just want to understand what
could cause the problem! I don't think it's hardware, either.

Wolfgang.
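
The mismatch discussed in this thread (the status register reports bus-off
while the driver's own state does not, because the interrupt register read as
zero) can be illustrated with a small stand-alone model. This is only a sketch,
not the c_can driver code: the enum, `irq_path()` and `bus_off_missed()` are
hypothetical names, and the bus-off flag is assumed to be bit 7 of the status
register as in the Bosch C_CAN user's manual, which should be checked against
the actual chip documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bus-off flag is bit 7 of the status register
 * (per the Bosch C_CAN user's manual); verify for your chip. */
#define STATUS_BOFF (1u << 7)

/* Hypothetical, simplified view of the driver's notion of bus state. */
enum drv_state {
    DRV_ERROR_ACTIVE,
    DRV_ERROR_WARNING,
    DRV_ERROR_PASSIVE,
    DRV_BUS_OFF
};

/* Models the normal path: the driver state is refreshed only when the
 * interrupt register is non-zero. With int_reg == 0 nothing is polled,
 * so a bus-off in status_reg goes unnoticed. */
static enum drv_state irq_path(uint16_t int_reg, uint16_t status_reg,
                               enum drv_state cur)
{
    if (int_reg == 0)
        return cur;
    return (status_reg & STATUS_BOFF) ? DRV_BUS_OFF : cur;
}

/* Hypothetical consistency check: true when the chip reports bus-off
 * but the driver state has not followed. */
static bool bus_off_missed(uint16_t status_reg, enum drv_state cur)
{
    return (status_reg & STATUS_BOFF) != 0 && cur != DRV_BUS_OFF;
}
```

Calling `irq_path(0, STATUS_BOFF, DRV_ERROR_ACTIVE)` leaves the state at
`DRV_ERROR_ACTIVE`, and `bus_off_missed()` then returns true for that state,
which is the out-of-sync condition Joe describes. Whether a periodic check of
this kind is an appropriate fix for the real driver is a separate question.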


