On 9/23/22 22:27, Jacob Kroon wrote:
Hi Dario,
On 9/23/22 21:45, dariobin@xxxxxxxxx wrote:
Hi Jacob,
Il 23/09/2022 21:21 Jacob Kroon <jacob.kroon@xxxxxxxxx> ha scritto:
On 9/23/22 21:03, Jacob Kroon wrote:
Hi Dario,
On 9/23/22 19:55, dariobin@xxxxxxxxx wrote:
Hi Michael,
Il 23/09/2022 13:36 Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> ha
scritto:
On 16.09.2022 06:14:58, Jacob Kroon wrote:
What I do know is that if I revert commit:
"can: c_can: cache frames to operate as a true FIFO"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=387da6bc7a826cc6d532b1c0002b7c7513238d5f
then everything looks good. I don't get any BUG messages, and the
host
has been running overnight without problems, so it seems to have
fixed
the network interface lockup as well.
Jacob, after this mail, I'll send 2 patches. One tries to disable the
cache feature for C_CAN cores, the other shuts a potential race
window.
About the "can: c_can: don't cache TX messages for C_CAN cores" patch:
Could it make sense to change the c_can_start_xmit in this way
- if (idx < c_can_get_tx_tail(tx_ring))
- cmd &= ~IF_COMM_TXRQST; /* Cache the message */
+ if (idx < c_can_get_tx_tail(tx_ring)) {
+ if (priv->type == BOSCH_D_CAN) {
+ cmd &= ~IF_COMM_TXRQST; /* Cache the
message */
+ } else {
+ netif_stop_queue(priv->dev);
+ return NETDEV_TX_BUSY;
+ }
+ }
without changing the c_can_get_tx_{free,busy} routines ?
I can test, so you suggest only doing the above patch, or were there
other parts from Marc's patch you wanted in ?
Well I tried only the above, and the patch below
diff --git a/drivers/net/can/c_can/c_can_main.c
b/drivers/net/can/c_can/c_can_main.c
index a7362af0babb..21d0a55ebbb3 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -468,8 +468,14 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff
*skb,
if (c_can_get_tx_free(tx_ring) == 0)
netif_stop_queue(dev);
- if (idx < c_can_get_tx_tail(tx_ring))
- cmd &= ~IF_COMM_TXRQST; /* Cache the message */
+ if (idx < c_can_get_tx_tail(tx_ring)) {
+ if (priv->type == BOSCH_D_CAN) {
+ cmd &= ~IF_COMM_TXRQST; /* Cache the message */
+ } else {
+ netif_stop_queue(priv->dev);
+ return NETDEV_TX_BUSY;
+ }
+ }
/* Store the message in the interface so we can call
* can_put_echo_skb(). We must do this before we enable
@@ -761,7 +767,7 @@ static void c_can_do_tx(struct net_device *dev)
tail = c_can_get_tx_tail(tx_ring);
- if (tail == 0) {
+ if (priv->type == BOSCH_D_CAN && tail == 0) {
u8 head = c_can_get_tx_head(tx_ring);
/* Start transmission for all cached messages */
but they both seem to lockup the network.
I didn't understand if you applied two patches separately or not.
This was the only patch I had thought of. Which was similar to Marc's
for the interrupt part but differed in the c_can_start_xmit part.
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -464,13 +464,19 @@ static netdev_tx_t c_can_start_xmit(struct
sk_buff *skb,
return NETDEV_TX_BUSY;
idx = c_can_get_tx_head(tx_ring);
+ if (idx < c_can_get_tx_tail(tx_ring)) {
+ if (priv->type == BOSCH_D_CAN) {
+ cmd &= ~IF_COMM_TXRQST; /* Cache the message */
+ } else {
+ netif_stop_queue(priv->dev);
+ return NETDEV_TX_BUSY;
+ }
+ }
+
tx_ring->head++;
if (c_can_get_tx_free(tx_ring) == 0)
netif_stop_queue(dev);
- if (idx < c_can_get_tx_tail(tx_ring))
- cmd &= ~IF_COMM_TXRQST; /* Cache the message */
-
/* Store the message in the interface so we can call
* can_put_echo_skb(). We must do this before we enable
* transmit as we might race against do_tx().
@@ -723,7 +729,6 @@ static void c_can_do_tx(struct net_device *dev)
struct c_can_tx_ring *tx_ring = &priv->tx;
struct net_device_stats *stats = &dev->stats;
u32 idx, obj, pkts = 0, bytes = 0, pend;
- u8 tail;
if (priv->msg_obj_tx_last > 32)
pend = priv->read_reg32(priv, C_CAN_INTPND3_REG);
@@ -759,15 +764,20 @@ static void c_can_do_tx(struct net_device *dev)
stats->tx_bytes += bytes;
stats->tx_packets += pkts;
- tail = c_can_get_tx_tail(tx_ring);
+ if (priv->type == BOSCH_D_CAN) {
+ u8 tail;
+
+ tail = c_can_get_tx_tail(tx_ring);
- if (tail == 0) {
- u8 head = c_can_get_tx_head(tx_ring);
+ if (tail == 0) {
+ u8 head = c_can_get_tx_head(tx_ring);
- /* Start transmission for all cached messages */
- for (idx = tail; idx < head; idx++) {
- obj = idx + priv->msg_obj_tx_first;
- c_can_object_put(dev, IF_NAPI, obj,
IF_COMM_TXRQST);
+ /* Start transmission for all cached messages */
+ for (idx = tail; idx < head; idx++) {
+ obj = idx + priv->msg_obj_tx_first;
+ c_can_object_put(dev, IF_NAPI, obj,
+ IF_COMM_TXRQST);
+ }
}
}
}
I changed it a little from the previous email because I noticed a
problem.
Okay, the patch above seems to be working better. I'm pretty sure
https://marc.info/?l=linux-can&m=166393304023574&w=2
is good, so I'll keep the machine running overnight with Darios patch
above instead.
Machine is still running with CAN network traffic working, so both
patches at
https://marc.info/?l=linux-can&m=166393304023574&w=2
https://marc.info/?l=linux-can&m=166396200108947&w=2
are working for me.
Jacob