This is a note to let you know that I've just added the patch titled net/mlx5: Always drain health in shutdown callback to the 6.10-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: net-mlx5-always-drain-health-in-shutdown-callback.patch and it can be found in the queue-6.10 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 3d8d31da5e3cec6406ed8a567a7e5a44836d2a91 Author: Shay Drory <shayd@xxxxxxxxxx> Date: Tue Jul 30 09:16:30 2024 +0300 net/mlx5: Always drain health in shutdown callback [ Upstream commit 1b75da22ed1e6171e261bc9265370162553d5393 ] There is no point in recovery during device shutdown. if health work started need to wait for it to avoid races and NULL pointer access. Hence, drain health WQ on shutdown callback. Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver") Fixes: d2aa060d40fa ("net/mlx5: Cancel health poll before sending panic teardown command") Signed-off-by: Shay Drory <shayd@xxxxxxxxxx> Reviewed-by: Moshe Shemesh <moshe@xxxxxxxxxx> Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx> Reviewed-by: Wojciech Drewek <wojciech.drewek@xxxxxxxxx> Link: https://patch.msgid.link/20240730061638.1831002-2-tariqt@xxxxxxxxxx Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 459a836a5d9c1..3e55a6c6a7c9b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -2140,7 +2140,6 @@ static int mlx5_try_fast_unload(struct mlx5_core_dev *dev) /* Panic tear down fw command will stop the PCI bus communication * with the HCA, so the health poll is no longer needed. */ - mlx5_drain_health_wq(dev); mlx5_stop_health_poll(dev, false); ret = mlx5_cmd_fast_teardown_hca(dev); @@ -2175,6 +2174,7 @@ static void shutdown(struct pci_dev *pdev) mlx5_core_info(dev, "Shutdown was called\n"); set_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state); + mlx5_drain_health_wq(dev); err = mlx5_try_fast_unload(dev); if (err) mlx5_unload_one(dev, false); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c index b2986175d9afe..b706f1486504a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c @@ -112,6 +112,7 @@ static void mlx5_sf_dev_shutdown(struct auxiliary_device *adev) struct mlx5_core_dev *mdev = sf_dev->mdev; set_bit(MLX5_BREAK_FW_WAIT, &mdev->intf_state); + mlx5_drain_health_wq(mdev); mlx5_unload_one(mdev, false); }