Patch "net/mlx5: Fix SF health recovery flow" has been added to the 5.15-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    net/mlx5: Fix SF health recovery flow

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     net-mlx5-fix-sf-health-recovery-flow.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 663b1c674df8f0094d7b23fbd8980d509f3da3a6
Author: Moshe Shemesh <moshe@xxxxxxxxxx>
Date:   Tue Nov 23 20:08:13 2021 +0200

    net/mlx5: Fix SF health recovery flow
    
    [ Upstream commit 33de865f7bce3968676e43b0182af0a2dd359dae ]
    
    SF do not directly control the PCI device. During recovery flow SF
    should not be allowed to do pci disable or pci reset, its PF will do it.
    
    It fixes the following kernel trace:
    mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow
    mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
    mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations
    mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532
    mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow
    mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called
    mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations
    mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532
    mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed,
    status bad resource state(0x9), syndrome (0x658908)
    mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed
    mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed
    
    Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
    Signed-off-by: Moshe Shemesh <moshe@xxxxxxxxxx>
    Signed-off-by: Saeed Mahameed <saeedm@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 92b08fa07efae..92b01858d7f3e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1775,12 +1775,13 @@ void mlx5_disable_device(struct mlx5_core_dev *dev)
 
 int mlx5_recover_device(struct mlx5_core_dev *dev)
 {
-	int ret = -EIO;
+	if (!mlx5_core_is_sf(dev)) {
+		mlx5_pci_disable_device(dev);
+		if (mlx5_pci_slot_reset(dev->pdev) != PCI_ERS_RESULT_RECOVERED)
+			return -EIO;
+	}
 
-	mlx5_pci_disable_device(dev);
-	if (mlx5_pci_slot_reset(dev->pdev) == PCI_ERS_RESULT_RECOVERED)
-		ret = mlx5_load_one(dev);
-	return ret;
+	return mlx5_load_one(dev);
 }
 
 static struct pci_driver mlx5_core_driver = {



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux