Patch "can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()" has been added to the 6.5-stable tree

This is a note to let you know that I've just added the patch titled

    can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()

to the 6.5-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     can-dev-can_restart-fix-race-condition-between-contr.patch
and it can be found in the queue-6.5 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 7b05c6996f811807282efcbf34b07ea348620678
Author: Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx>
Date:   Fri Sep 29 10:25:11 2023 +0200

    can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()
    
    [ Upstream commit 6841cab8c4504835e4011689cbdb3351dec693fd ]
    
    This race condition was discovered while updating the at91_can driver
    to use can_bus_off(). The following scenario describes how the
    converted at91_can driver would behave.
    
    When a CAN device goes into BUS-OFF state, the driver usually
    stops/resets the CAN device and calls can_bus_off().
    
    This function sets the netif carrier to off, and (if configured by
    user space) schedules a delayed work that calls can_restart() to
    restart the CAN device.
    
    The can_restart() function first checks if the carrier is off and
    triggers an error message if the carrier is OK.
    
    Then it calls the driver's do_set_mode() function to restart the
    device, then it sets the netif carrier to on. There is a race window
    between these two calls.
    
    The at91 CAN controller (observed on the sama5d3, a single-core 32-bit
    ARM CPU) has a hardware limitation. If the device goes into bus-off
    while sending a CAN frame, there is no way to abort the sending of
    this frame. After the controller is enabled again, another attempt is
    made to send it.
    
    If the bus is still faulty, the device immediately goes back to the
    bus-off state. The driver calls can_bus_off(), the netif carrier is
    switched off and another can_restart() is scheduled. This occurs within
    the race window before the original can_restart() handler marks the
    netif carrier as OK. This would cause the 2nd can_restart() to be
    called with an OK netif carrier, resulting in an error message.
    
    The flow of the 1st can_restart() looks like this:
    
    can_restart()
        // bail out if netif_carrier is OK
    
        netif_carrier_ok(dev)
        priv->do_set_mode(dev, CAN_MODE_START)
            // enable CAN controller
            // sama5d3 restarts sending old message
    
            // CAN device goes into BUS-OFF, triggers IRQ
    
    // IRQ handler start
        at91_irq()
            at91_irq_err_line()
                can_bus_off()
                    netif_carrier_off()
                    schedule_delayed_work()
    // IRQ handler end
    
        netif_carrier_on()
    
    The 2nd can_restart() will be called with an OK netif carrier and the
    error message will be printed.
    
    To close the race window, first set the netif carrier to on, then
    restart the controller. In case the restart fails with an error code,
    roll back the netif carrier to off.
    
    Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface")
    Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-2-91b5c1fd922c@xxxxxxxxxxxxxx
    Reviewed-by: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx>
    Signed-off-by: Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c
index a5bbdfa9a2693..735d5de3caa0e 100644
--- a/drivers/net/can/dev/dev.c
+++ b/drivers/net/can/dev/dev.c
@@ -154,11 +154,12 @@ static void can_restart(struct net_device *dev)
 	priv->can_stats.restarts++;
 
 	/* Now restart the device */
-	err = priv->do_set_mode(dev, CAN_MODE_START);
-
 	netif_carrier_on(dev);
-	if (err)
+	err = priv->do_set_mode(dev, CAN_MODE_START);
+	if (err) {
 		netdev_err(dev, "Error %d during restart", err);
+		netif_carrier_off(dev);
+	}
 }
 
 static void can_restart_work(struct work_struct *work)


