This is a note to let you know that I've just added the patch titled net/mlx5: Stop waiting for PCI if pci channel is offline to the 6.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: net-mlx5-stop-waiting-for-pci-if-pci-channel-is-offl.patch and it can be found in the queue-6.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 4bfecfe30b552d640ec75dc5e5ad070a35ea545e Author: Moshe Shemesh <moshe@xxxxxxxxxx> Date: Tue Jun 4 00:04:42 2024 +0300 net/mlx5: Stop waiting for PCI if pci channel is offline [ Upstream commit 33afbfcc105a572159750f2ebee834a8a70fdd96 ] In case pci channel becomes offline the driver should not wait for PCI reads during health dump and recovery flow. The driver has timeout for each of these loops trying to read PCI, so it would fail anyway. However, in case of recovery waiting till timeout may cause the pci error_detected() callback fail to meet pci_dpc_recovered() wait timeout. Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues") Signed-off-by: Moshe Shemesh <moshe@xxxxxxxxxx> Reviewed-by: Shay Drori <shayd@xxxxxxxxxx> Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c index e7faf7e73ca48..6c7f2471fe629 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c @@ -373,6 +373,10 @@ int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev) do { if (mlx5_get_nic_state(dev) == MLX5_INITIAL_SEG_NIC_INTERFACE_DISABLED) break; + if (pci_channel_offline(dev->pdev)) { + mlx5_core_err(dev, "PCI channel offline, stop waiting for NIC IFC\n"); + return -EACCES; + } cond_resched(); } while (!time_after(jiffies, end)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c index ad38e31822df1..a6329ca2d9bff 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/health.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c @@ -248,6 +248,10 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev) do { if (mlx5_get_nic_state(dev) == MLX5_INITIAL_SEG_NIC_INTERFACE_DISABLED) break; + if (pci_channel_offline(dev->pdev)) { + mlx5_core_err(dev, "PCI channel offline, stop waiting for NIC IFC\n"); + goto unlock; + } msleep(20); } while (!time_after(jiffies, end)); @@ -317,6 +321,10 @@ int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev) mlx5_core_warn(dev, "device is being removed, stop waiting for PCI\n"); return -ENODEV; } + if (pci_channel_offline(dev->pdev)) { + mlx5_core_err(dev, "PCI channel offline, stop waiting for PCI\n"); + return -EACCES; + } msleep(100); } return 0; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c index 6b774e0c27665..d0b595ba61101 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c @@ -74,6 +74,10 @@ int mlx5_vsc_gw_lock(struct mlx5_core_dev *dev) ret = -EBUSY; goto pci_unlock; } + if (pci_channel_offline(dev->pdev)) { + ret = -EACCES; + goto pci_unlock; + } /* Check if semaphore is already locked */ ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val);