Patch "ibmvnic: Do not reset dql stats on NON_FATAL err" has been added to the 5.15-stable tree

This is a note to let you know that I've just added the patch titled

    ibmvnic: Do not reset dql stats on NON_FATAL err

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     ibmvnic-do-not-reset-dql-stats-on-non_fatal-err.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 34e66c39add5b4fddd3244731c5e62488a1488d6
Author: Nick Child <nnac123@xxxxxxxxxxxxx>
Date:   Wed Jun 28 13:22:44 2023 -0500

    ibmvnic: Do not reset dql stats on NON_FATAL err
    
    [ Upstream commit 48538ccb825b05544ec308a509e2cc9c013402db ]
    
    All ibmvnic resets make a call to netdev_tx_reset_queue() when
    re-opening the device. netdev_tx_reset_queue() resets the num_queued
    and num_completed byte counters. These stats are used in Byte Queue
    Limit (BQL) algorithms. The difference between these two stats tracks
    the number of bytes currently sitting on the physical NIC. ibmvnic
    increases the number of queued bytes through calls to
    netdev_tx_sent_queue() in the driver's xmit function. When VIOS reports
    that it is done transmitting bytes, the ibmvnic device increases the
    number of completed bytes through calls to netdev_tx_completed_queue().
    It is important to note that the driver batches its transmit calls and
    num_queued is increased every time that an skb is added to the next
    batch, not necessarily when the batch is sent to VIOS for transmission.
    
    Unlike other reset types, a NON_FATAL reset will not flush the sub crq
    tx buffers. Therefore, it is possible for the batched skb array to be
    partially full. So if there is a call to netdev_tx_reset_queue() when
    re-opening the device, the value of num_queued (0) would not account
    for the skbs that are currently batched. Eventually, when the batch
    is sent to VIOS, the call to netdev_tx_completed_queue() would increase
    num_completed to a value greater than num_queued. This causes a
    BUG_ON crash:
    
    ibmvnic 30000002: Firmware reports error, cause: adapter problem.
    Starting recovery...
    ibmvnic 30000002: tx error 600
    ibmvnic 30000002: tx error 600
    ibmvnic 30000002: tx error 600
    ibmvnic 30000002: tx error 600
    ------------[ cut here ]------------
    kernel BUG at lib/dynamic_queue_limits.c:27!
    Oops: Exception in kernel mode, sig: 5
    [....]
    NIP dql_completed+0x28/0x1c0
    LR ibmvnic_complete_tx.isra.0+0x23c/0x420 [ibmvnic]
    Call Trace:
    ibmvnic_complete_tx.isra.0+0x3f8/0x420 [ibmvnic] (unreliable)
    ibmvnic_interrupt_tx+0x40/0x70 [ibmvnic]
    __handle_irq_event_percpu+0x98/0x270
    ---[ end trace ]---
    
    Therefore, do not reset the dql stats when performing a NON_FATAL reset.
    
    Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls")
    Signed-off-by: Nick Child <nnac123@xxxxxxxxxxxxx>
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 8a92c6a6e764f..765dee2e4882e 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1240,7 +1240,14 @@ static int __ibmvnic_open(struct net_device *netdev)
 		if (prev_state == VNIC_CLOSED)
 			enable_irq(adapter->tx_scrq[i]->irq);
 		enable_scrq_irq(adapter, adapter->tx_scrq[i]);
-		netdev_tx_reset_queue(netdev_get_tx_queue(netdev, i));
+		/* netdev_tx_reset_queue will reset dql stats. During NON_FATAL
+		 * resets, don't reset the stats because there could be batched
+		 * skb's waiting to be sent. If we reset dql stats, we risk
+		 * num_completed being greater than num_queued. This will cause
+		 * a BUG_ON in dql_completed().
+		 */
+		if (adapter->reset_reason != VNIC_RESET_NON_FATAL)
+			netdev_tx_reset_queue(netdev_get_tx_queue(netdev, i));
 	}
 
 	rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_UP);
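
The counter accounting described in the commit message can be replayed in
a small user-space model. This is only an illustrative sketch, not kernel
code: struct dql_model, the model_*() helpers and replay_reset() are
hypothetical names, the 1500-byte skb size is arbitrary, and the check in
model_completed() is assumed to mirror the "cannot complete more than was
queued" condition that trips the BUG_ON in dql_completed().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two BQL byte counters named in the
 * commit message; this is not the kernel's struct dql.
 */
struct dql_model {
	unsigned int num_queued;    /* bumped when an skb is added to a batch */
	unsigned int num_completed; /* bumped when VIOS reports completion */
};

/* Models netdev_tx_sent_queue(): account bytes as queued. */
static void model_sent(struct dql_model *d, unsigned int bytes)
{
	d->num_queued += bytes;
}

/* Models netdev_tx_reset_queue(): zero both counters. */
static void model_reset(struct dql_model *d)
{
	d->num_queued = 0;
	d->num_completed = 0;
}

/* Models netdev_tx_completed_queue(). Returns false where the kernel
 * is assumed to hit the BUG_ON, i.e. when more bytes are completed
 * than are still outstanding.
 */
static bool model_completed(struct dql_model *d, unsigned int bytes)
{
	if (bytes > d->num_queued - d->num_completed)
		return false;
	d->num_completed += bytes;
	return true;
}

/* Replay one reset cycle: an skb is batched, the device is reset and
 * re-opened, and the completion for the batch arrives afterwards.
 * apply_fix selects the patched behaviour (skip the counter reset on
 * NON_FATAL). Returns true if the completion is accounted cleanly.
 */
bool replay_reset(bool non_fatal, bool apply_fix)
{
	struct dql_model d = { 0, 0 };
	unsigned int batched = 1500; /* arbitrary skb size for the example */

	model_sent(&d, batched);     /* counted at batch time, not send time */

	if (!non_fatal)
		batched = 0;         /* other reset types flush the tx buffers */

	if (!(apply_fix && non_fatal))
		model_reset(&d);     /* unpatched path always resets counters */

	return model_completed(&d, batched);
}
```

In this model replay_reset(true, false) returns false, which corresponds
to the crash above, while replay_reset(true, true) and both non-NON_FATAL
cases return true.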


