Patch "net: don't let netpoll invoke NAPI if in xmit context" has been added to the 5.15-stable tree

This is a note to let you know that I've just added the patch titled

    net: don't let netpoll invoke NAPI if in xmit context

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     net-don-t-let-netpoll-invoke-napi-if-in-xmit-context.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 404c72ae7e64159dbcf80b5c0423ca2f5e31ee52
Author: Jakub Kicinski <kuba@xxxxxxxxxx>
Date:   Thu Mar 30 19:21:44 2023 -0700

    net: don't let netpoll invoke NAPI if in xmit context
    
    [ Upstream commit 275b471e3d2daf1472ae8fa70dc1b50c9e0b9e75 ]
    
    Commit 0db3dc73f7a3 ("[NETPOLL]: tx lock deadlock fix") narrowed
    down the region under netif_tx_trylock() inside netpoll_send_skb().
    (At that point in time netif_tx_trylock() would lock all queues of
    the device.) Taking the tx lock was problematic because the driver's
    cleanup method may take the same lock. So the change made us hold
    the xmit lock only around xmit, and expected the driver to take
    care of locking within ->ndo_poll_controller().
    
    Unfortunately this only works if netpoll isn't itself called with
    the xmit lock already held. Netpoll code is careful and uses
    trylock(). The drivers, however, may be using plain lock().
    Printing while holding the xmit lock is going to result in rare
    deadlocks.
    
    Luckily we record the xmit lock owners, so we can scan all the queues,
    the same way we scan NAPI owners. If any of the xmit locks is held
    by the local CPU, we had better not attempt any polling.
    
    It would be nice if we could narrow down the check to only the NAPIs
    and the queue we're trying to use. I don't see a way to do that now.
    
    Reported-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>
    Fixes: 0db3dc73f7a3 ("[NETPOLL]: tx lock deadlock fix")
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx>
    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index edfc0f8011f88..bd750863959f2 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -137,6 +137,20 @@ static void queue_process(struct work_struct *work)
 	}
 }
 
+static int netif_local_xmit_active(struct net_device *dev)
+{
+	int i;
+
+	for (i = 0; i < dev->num_tx_queues; i++) {
+		struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
+
+		if (READ_ONCE(txq->xmit_lock_owner) == smp_processor_id())
+			return 1;
+	}
+
+	return 0;
+}
+
 static void poll_one_napi(struct napi_struct *napi)
 {
 	int work;
@@ -183,7 +197,10 @@ void netpoll_poll_dev(struct net_device *dev)
 	if (!ni || down_trylock(&ni->dev_lock))
 		return;
 
-	if (!netif_running(dev)) {
+	/* Some drivers will take the same locks in poll and xmit,
+	 * we can't poll if local CPU is already in xmit.
+	 */
+	if (!netif_running(dev) || netif_local_xmit_active(dev)) {
 		up(&ni->dev_lock);
 		return;
 	}
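
For readers who want to try the check outside the kernel, here is a
minimal user-space sketch of the logic the patch adds. It is not the
kernel code: the sim_* names, the fixed-size queue array, and the
explicit this_cpu parameter (standing in for smp_processor_id()) are
all hypothetical and exist only for illustration. The sketch marks an
unowned queue with -1.

```c
#include <assert.h>

#define SIM_MAX_TX_QUEUES 8   /* hypothetical upper bound for the sketch */
#define SIM_NO_OWNER      (-1) /* no CPU currently holds the xmit lock */

/* Hypothetical stand-in for struct netdev_queue. */
struct sim_txq {
	int xmit_lock_owner; /* CPU id holding the xmit lock, or SIM_NO_OWNER */
};

/* Hypothetical stand-in for struct net_device. */
struct sim_dev {
	unsigned int num_tx_queues;
	struct sim_txq txq[SIM_MAX_TX_QUEUES];
};

/* Set up a device with nq queues, none of them locked. */
static void sim_dev_init(struct sim_dev *dev, unsigned int nq)
{
	unsigned int i;

	assert(nq <= SIM_MAX_TX_QUEUES);
	dev->num_tx_queues = nq;
	for (i = 0; i < nq; i++)
		dev->txq[i].xmit_lock_owner = SIM_NO_OWNER;
}

/* Mirrors the shape of netif_local_xmit_active(): scan every tx queue
 * and report whether this_cpu owns any of the xmit locks.
 */
static int sim_local_xmit_active(const struct sim_dev *dev, int this_cpu)
{
	unsigned int i;

	for (i = 0; i < dev->num_tx_queues; i++) {
		if (dev->txq[i].xmit_lock_owner == this_cpu)
			return 1;
	}

	return 0;
}

/* Mirrors the new condition in netpoll_poll_dev(): polling is skipped
 * if the device is not running or this_cpu is already in xmit.
 * Returns 1 if polling would proceed, 0 if it would be skipped.
 */
static int sim_poll_allowed(const struct sim_dev *dev, int running,
			    int this_cpu)
{
	if (!running || sim_local_xmit_active(dev, this_cpu))
		return 0;

	return 1;
}
```

In this sketch, setting the owner of any one queue to the calling CPU
makes sim_poll_allowed() return 0 for that CPU only; other CPUs are
still allowed to poll, which matches the per-CPU intent described in
the commit message.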


