[PATCH] can: m_can: fix netif_stop/wake_queue race condition between m_can_tx_handler() and m_can_isr().

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



m_can_tx_handler() initiates a write by calling can_put_echo_skb() and
m_can_write() to M_CAN_TXBAR.
After that, netif_stop_queue is called depending on the FIFO status.

Observed with a TCAN45 controller and under high CPU load, the TCAN45
can already generate an interrupt after the m_can_write to M_CAN_TXBAR,
but before netif_stop_queue is executed.
The m_can_isr() is then executed (performing a netif_wake_queue) before
the netif_stop_queue is executed, leading to a blocking socket.

Fix this for TX FIFO size 1, where the queue can always be stopped
before initiating a transfer.

Signed-off-by: Wouter van Herpen <wouter.van.herpen@xxxxxxxxxxxxxxxxxxxxxxxxx>
---
Possible improvement to this patch would be to predict if the next write
would fill the FIFO, and if so, stop the queue before initiating the
transfer. Then the race condition would also be fixed for TX buffer sizes
larger than 1. However, I do not recognize a proper diagnostic register
in the TCAN45 for that purpose.

Alternatively a locking mechanism could be introduced, which I did not
investigate further as there can be sleeps involved in the SPI writes.

 drivers/net/can/m_can/m_can.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 5d0c82d8b9a9..2817f8e83206 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1684,6 +1684,15 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
 		if (err)
 			goto out_fail;
 
+		/* A race can occur between netif_stop_queue here and
+		 * netif_wake_queue in m_can_isr(), if the queue is stopped
+		 * after initiating a transfer. If the TX buffer has size 1,
+		 * it is allowed to always stop the queue and only then
+		 * initiate the transfer, thus avoiding any race condition.
+		 */
+		if (cdev->mcfg[MRAM_TXB].num == 1)
+			netif_stop_queue(dev);
+
 		/* Push loopback echo.
 		 * Will be looped back on TX interrupt based on message marker
 		 */
@@ -1692,10 +1701,12 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
 		/* Enable TX FIFO element to start transfer  */
 		m_can_write(cdev, M_CAN_TXBAR, (1 << putidx));
 
-		/* stop network queue if fifo full */
-		if (m_can_tx_fifo_full(cdev) ||
-		    m_can_next_echo_skb_occupied(dev, putidx))
-			netif_stop_queue(dev);
+		if (cdev->mcfg[MRAM_TXB].num != 1) {
+			/* stop network queue if fifo full */
+			if (m_can_tx_fifo_full(cdev) ||
+			    m_can_next_echo_skb_occupied(dev, putidx))
+				netif_stop_queue(dev);
+		}
 	}
 
 	return NETDEV_TX_OK;
-- 
2.30.2




[Index of Archives]     [Automotive Discussions]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [CAN Bus]

  Powered by Linux