Re: [PATCH 5/5 v2] can: xilinx_can: fix TX FIFO accounting getting out of sync

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/21/2017 05:07 PM, Anssi Hannula wrote:
> On 21.2.2017 17:14, Marc Kleine-Budde wrote:
>> On 02/21/2017 03:57 PM, Anssi Hannula wrote:
>>> On 20.2.2017 12:27, Anssi Hannula wrote:
>>>> On 20.2.2017 11:42, Marc Kleine-Budde wrote:
>>>>> On 02/13/2017 04:39 PM, Anssi Hannula wrote:
>>>>>> The xilinx_can driver assumes that the TXOK interrupt only clears after
>>>>>> it has been acknowledged as many times as there have been successfully
>>>>>> sent frames.
>>>>>>
>>>>>> However, the documentation does not mention such behavior, instead
>>>>>> saying just that the interrupt is cleared when the clear bit is set.
>>>>>>
>>>>>> Similarly, testing seems to also suggest that it is immediately cleared
>>>>>> regardless of the amount of frames having been sent. Performing some
>>>>>> heavy TX load and then going back to idle has the tx_head drifting
>>>>>> further away from tx_tail over time, steadily reducing the amount of
>>>>>> frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
>>>>>> interrupt always frees up space for 1 frame from the driver's
>>>>>> perspective, so frames continue to be sent) and delaying the local echo
>>>>>> frames.
>>>>>>
>>>>>> The TX FIFO tracking is also otherwise buggy as it does not account for
>>>>>> TX FIFO being cleared after software resets, causing
>>>>>>   BUG!, TX FIFO full when queue awake!
>>>>>> messages to be output.
>>>>>>
>>>>>> Since there does not seem to be any way to accurately track the state of
>>>>>> the TX FIFO, drop the code trying to do that and perform
>>>>>> netif_stop_queue() based on whether the HW reports the FIFO being full.
>>>>>> Driver-side local echo support is also removed as it is not possible to
>>>>>> do accurately (relying instead on the CAN core fallback support).
>>>>>>
>>>>>> Tested with the integrated CAN on Zynq-7000 SoC.
>>>>>>
>>>>>> v2: Add FIFO space check before TX queue wake with locking to
>>>>>> synchronize with queue stop. This avoids waking the queue when xmit()
>>>>>> had just filled it.
>>>>> Removing the echo on TX-complete should be avoided if possible. Is there
>>>>> a way to get a indication that an individual package has been send?
>>>> Looking at the reference manual [1, chapter 18 and appendix B.5] I can't
>>>> see any. The TX FIFO interrupts and status registers available are TXOK
>>>> (1 or more messages sent), FIFO empty, FIFO full, FIFO under watermark
>>>> (watermark not programmable while on-bus). There is no FIFO free space
>>>> counter that I can see.
>>>>
>>>> One way would be to only put a single frame into the FIFO at a time, of
>>>> course, but at least to me that seems worse than losing echo-on-TX-complete.
>>> A correction to my last statement above, we could probably put up to 3
>>> frames in TX FIFO at a time and still keep track of them using the TX
>>> FIFO watermark, allowing echo-on-TX-complete to still be supported, with
>>> something like this:
>>> if (tx fifo empty)
>>>    all frames sent;
>>> else if (tx fifo under watermark 2)
>>>    all except 1 frame sent;
>>> else if (txok)
>>>   1 frame sent; (all except 2)
>>>
>>> Though 3 is still much less than 64, the current TX FIFO max size.
>> Sounds good, can you work on a patch?
> 
> Unfortunately I've just noticed that the soft-IP version of the
> controller (compatible-string "xlnx,axi-can-1.00.a" instead of
> "xlnx,zynq-can-1.0" found on Zynq SoCs) does not have the TX FIFO empty
> bit nor the watermark programming feature:
> https://www.xilinx.com/support/documentation/ip_documentation/can/v5_0/pg096-can.pdf
> 
> So the above idea for tracking more than 1 frame in FIFO would not work
> on those.
> 
> What do you think should be done here?

Depending on the compatible string we know that max usable FIFO size.
You can upgrade the driver to cope with this.

The xmit function doesn't change at all:

> 	/* Check if the TX buffer is full */
> 	if ((priv->tx_head - priv->tx_tail) == priv->tx_max)
> 		netif_stop_queue(ndev);

Just the calculation of priv->tx_max changes depending on the compatible.

The loop in xcan_tx_interrupt() get more complicated.

If this works, we can discuss on a property (maybe DT) to switch the
driver into "non"-IFF_ECHO mode, as you proposed in your original patch.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]