Re: CM-ITC, pch_can/c_can_pci, sendto() returning ENOBUFS

Hi Jacob,

> On 20/09/2022 01:24 Jacob Kroon <jacob.kroon@xxxxxxxxx> wrote:
> 
>  
> Hi Marc and Dario,
> 
> On 9/16/22 06:14, Jacob Kroon wrote:
> ...
> > What I do know is that if I revert commit:
> > 
> > "can: c_can: cache frames to operate as a true FIFO"
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=387da6bc7a826cc6d532b1c0002b7c7513238d5f
> > 
> > then everything looks good. I don't get any BUG messages, and the host 
> > has been running overnight without problems, so it seems to have fixed 
> > the network interface lockup as well.

Here's what I think:
If one or more messages are cached, the controller has to transmit more frames
per unit of time once they are finally released for transmission (IF_COMM_TXRQST
set) than it does when each frame is transmitted directly on request from user
space. So I think the controller is more heavily loaded when it sends cached
frames. Could this shift the balance?

> 
> I ran the kernel *with* the commit above, and also with the following patch:
> 
> > diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
> > index 52671d1ea17d..4375dc70e21f 100644
> > --- a/drivers/net/can/c_can/c_can_main.c
> > +++ b/drivers/net/can/c_can/c_can_main.c
> > @@ -1,3 +1,4 @@
> > +#define DEBUG
> >  /*
> >   * CAN bus driver for Bosch C_CAN controller
> >   *
> > @@ -469,8 +470,15 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
> >  	if (c_can_get_tx_free(tx_ring) == 0)
> >  		netif_stop_queue(dev);
> >  
> > -	if (idx < c_can_get_tx_tail(tx_ring))
> > +	netdev_dbg(dev, "JAKR:%d:%d:%d:%d\n", idx,
> > +	                                      c_can_get_tx_head(tx_ring),
> > +	                                      c_can_get_tx_tail(tx_ring),
> > +	                                      c_can_get_tx_free(tx_ring));
> > +
> > +	if (idx < c_can_get_tx_tail(tx_ring)) {
> >  		cmd &= ~IF_COMM_TXRQST; /* Cache the message */
> > +		netdev_dbg(dev, "JAKR:Caching messages\n");
> > +	}
> >  
> >  	/* Store the message in the interface so we can call
> >  	 * can_put_echo_skb(). We must do this before we enable
> 
> and I've uploaded the entire log I could capture from /dev/kmsg, right 
> up to the hang, here:
> 
> https://pastebin.com/6hvAcPc9
> 
> What looks odd to me right from the start is that sometimes when idx 
> rolls over to 0, and *only* when it rolls over to 0, the CAN frame gets 
> cached because "idx < c_can_get_tx_tail(tx_ring)".

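If that message were transmitted immediately instead of being cached, the
transmission order would not be preserved: as far as I know, the controller
sends lower-numbered message objects first, so a frame placed in object 0 would
overtake frames still pending in higher-numbered objects.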
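
To make the condition concrete, here is a minimal user-space sketch of the index
arithmetic behind "idx < c_can_get_tx_tail(tx_ring)". It is only a model under
stated assumptions, not the driver code: the struct and the model_* helpers are
hypothetical names, the ring size of 8 is picked for illustration, and head and
tail are assumed to be free-running counters masked by (obj_num - 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the TX ring; not the driver's struct. */
struct tx_ring_model {
	uint32_t head;    /* free-running count of frames queued */
	uint32_t tail;    /* free-running count of frames completed */
	uint32_t obj_num; /* number of TX message objects, a power of two */
};

/* Index of the next message object to fill. */
static uint32_t model_tx_head(const struct tx_ring_model *r)
{
	return r->head & (r->obj_num - 1);
}

/* Index of the oldest message object still pending. */
static uint32_t model_tx_tail(const struct tx_ring_model *r)
{
	return r->tail & (r->obj_num - 1);
}

/* Number of message objects not currently in use. */
static uint32_t model_tx_free(const struct tx_ring_model *r)
{
	return r->obj_num - (r->head - r->tail);
}

/*
 * Queue one frame in the model. Returns true when the frame would be
 * cached (TXRQST withheld) because its index is below the tail index,
 * i.e. the head has wrapped while older frames are still pending in
 * higher-numbered objects.
 */
static bool model_queue_frame(struct tx_ring_model *r)
{
	uint32_t idx;

	assert(r->obj_num != 0 && (r->obj_num & (r->obj_num - 1)) == 0);
	assert(model_tx_free(r) > 0);

	idx = model_tx_head(r);
	r->head++;

	return idx < model_tx_tail(r);
}
```

With obj_num = 8, head = 8 and tail = 5 (objects 5 to 7 still pending), the next
frame gets idx 0 and is cached. With head = 8 and tail = 8 (nothing pending), it
also gets idx 0 but is sent directly. That would be consistent with the frame
being cached only on some of the rollovers to 0.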

> 
> Is it possible there is some difference between c_can and d_can in how 
> the HW buffers are working, which breaks the driver on my particular HW 
> setup ?
> 

I tested the patch on a BeagleBone board without encountering any problems.
There is also a version of the driver that I submitted to Xenomai, running on a
custom board without problems. But your setup and context are surely different.

Which compatible string are you using in your device tree?
I used "ti,am3352-d_can".

Thanks and regards,
Dario

> Regards,
> Jacob


