Patch "openvswitch: fix lockup on tx to unregistering netdev with carrier" has been added to the 6.6-stable tree

This is a note to let you know that I've just added the patch titled

    openvswitch: fix lockup on tx to unregistering netdev with carrier

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     openvswitch-fix-lockup-on-tx-to-unregistering-netdev.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 366becb8a9e98772d634248f1ca03b5ff7ffdb8d
Author: Ilya Maximets <i.maximets@xxxxxxx>
Date:   Thu Jan 9 13:21:24 2025 +0100

    openvswitch: fix lockup on tx to unregistering netdev with carrier
    
    [ Upstream commit 47e55e4b410f7d552e43011baa5be1aab4093990 ]
    
    The commit in the Fixes tag attempted to fix an issue in the
    following sequence of calls:
    
        do_output
        -> ovs_vport_send
           -> dev_queue_xmit
              -> __dev_queue_xmit
                 -> netdev_core_pick_tx
                    -> skb_tx_hash
    
    When a device is unregistering, 'dev->real_num_tx_queues' goes to
    zero and the 'while (unlikely(hash >= qcount))' loop inside
    skb_tx_hash() becomes infinite, locking up the core forever.
    
    But unfortunately, checking just the carrier status is not enough to
    fix the issue, because some devices may still be in unregistering
    state while reporting carrier status OK.
    
    One example of such a device is net/dummy.  It sets the carrier ON
    on start, but it doesn't implement .ndo_stop to set the carrier off.
    And that makes sense, because dummy doesn't really have a carrier.
    Therefore, while this device is unregistering, it's still easy to hit
    the infinite loop in skb_tx_hash() from the OVS datapath.  There
    might be other drivers that do the same, but dummy by itself is
    important for the OVS ecosystem, because it is frequently used as a
    packet sink for tcpdump while debugging OVS deployments.  And when the
    issue is hit, the only way to recover is to reboot.
    
    Fix that by also checking if the device is running.  The running
    state is handled by the net core during unregistering, so it covers
    unregistering case better, and we don't really need to send packets
    to devices that are not running anyway.
    
    While only checking the running state might be enough, the carrier
    check is preserved.  The running and carrier states seem to be
    independent of each other across the core code and the various
    drivers.  And other core functions like __dev_direct_xmit() check
    both before attempting to transmit a packet.  So, it seems safer to
    check both flags in OVS as well.
    
    Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
    Reported-by: Friedrich Weber <f.weber@xxxxxxxxxxx>
    Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
    Signed-off-by: Ilya Maximets <i.maximets@xxxxxxx>
    Tested-by: Friedrich Weber <f.weber@xxxxxxxxxxx>
    Reviewed-by: Aaron Conole <aconole@xxxxxxxxxx>
    Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@xxxxxxx
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
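
As an illustration of the lockup described in the commit message above,
here is a minimal user-space sketch of a loop with the quoted condition
'while (hash >= qcount)'.  It is not the kernel code: the function name
reduce_hash(), the 'hash -= qcount' loop body and the iteration cap are
assumptions made for the example.  The cap exists only so that the
sketch can report the lack of progress instead of hanging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the reduction loop, NOT the kernel function.
 * Repeatedly subtracts qcount from hash until hash < qcount.
 * Returns the number of iterations taken, or -1 if max_iters was
 * reached without the loop condition becoming false.
 */
long reduce_hash(uint16_t hash, uint16_t qcount, long max_iters,
		 uint16_t *out)
{
	long iters = 0;

	assert(out != NULL);

	while (hash >= qcount) {
		if (iters == max_iters)
			return -1;	/* no termination within the cap */
		hash -= qcount;		/* a no-op when qcount == 0 */
		iters++;
	}

	*out = hash;
	return iters;
}
```

With a non-zero qcount the loop terminates after a few subtractions.
With qcount == 0 the condition 'hash >= 0' is always true for an
unsigned value and the subtraction changes nothing, so only the
artificial cap ends the loop.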

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 4f5cbcaa38386..9445ca97163b4 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -918,7 +918,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
 {
 	struct vport *vport = ovs_vport_rcu(dp, out_port);
 
-	if (likely(vport && netif_carrier_ok(vport->dev))) {
+	if (likely(vport &&
+		   netif_running(vport->dev) &&
+		   netif_carrier_ok(vport->dev))) {
 		u16 mru = OVS_CB(skb)->mru;
 		u32 cutlen = OVS_CB(skb)->cutlen;
 

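The effect of the changed condition in do_output() can be sketched in
user space as follows.  This is a simplified model under stated
assumptions: struct model_dev and both helper functions are
hypothetical, and the two boolean fields merely stand in for what
netif_running() and netif_carrier_ok() would report for the device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state relevant to the check. */
struct model_dev {
	bool running;		/* models netif_running() */
	bool carrier_ok;	/* models netif_carrier_ok() */
};

/* Condition before the patch: carrier status only. */
bool old_can_output(const struct model_dev *dev)
{
	return dev != NULL && dev->carrier_ok;
}

/* Condition after the patch: device must also be running. */
bool new_can_output(const struct model_dev *dev)
{
	return dev != NULL && dev->running && dev->carrier_ok;
}
```

For a dummy-like device that is being unregistered (not running, but
still reporting carrier OK), old_can_output() returns true while
new_can_output() returns false, which is the case the patch is meant to
catch.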


