This is a note to let you know that I've just added the patch titled net/mlx5: ECPF, wait for VF pages only after disabling host PFs to the 6.1-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: net-mlx5-ecpf-wait-for-vf-pages-only-after-disabling.patch and it can be found in the queue-6.1 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit f5e49d7472b4690478678031e0ee5bd155618444 Author: Maher Sanalla <msanalla@xxxxxxxxxx> Date: Wed Feb 15 20:12:05 2023 +0200 net/mlx5: ECPF, wait for VF pages only after disabling host PFs [ Upstream commit e1ed30c8c09abc85a01c897845bdbd08c0333353 ] Currently, during the early stages of their unloading, particularly during SRIOV disablement, PFs/ECPFs wait on the release of all of their VFs memory pages. Furthermore, ECPFs are considered the page supplier for host VFs, hence the host VFs memory pages are freed only during ECPF cleanup when host interfaces get disabled. Thus, disabling SRIOV early in unload timeline causes the DPU ECPF to stall on driver unload while waiting on the release of host VF pages that won't be freed before host interfaces get disabled later on. Therefore, for ECPFs, wait on the release of VFs pages only after the disablement of host PFs during ECPF cleanup flow. Then, host PFs and VFs are disabled and their memory shall be freed accordingly. Fixes: 143a41d7623d ("net/mlx5: Disable SRIOV before PF removal") Signed-off-by: Maher Sanalla <msanalla@xxxxxxxxxx> Reviewed-by: Moshe Shemesh <moshe@xxxxxxxxxx> Signed-off-by: Saeed Mahameed <saeedm@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ecpf.c b/drivers/net/ethernet/mellanox/mlx5/core/ecpf.c index cdc87ecae5d39..d000236ddbac5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/ecpf.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/ecpf.c @@ -90,4 +90,8 @@ void mlx5_ec_cleanup(struct mlx5_core_dev *dev) err = mlx5_wait_for_pages(dev, &dev->priv.page_counters[MLX5_HOST_PF]); if (err) mlx5_core_warn(dev, "Timeout reclaiming external host PF pages err(%d)\n", err); + + err = mlx5_wait_for_pages(dev, &dev->priv.page_counters[MLX5_VF]); + if (err) + mlx5_core_warn(dev, "Timeout reclaiming external host VFs pages err(%d)\n", err); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c index 3008e9ce2bbff..20d7662c10fb6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c @@ -147,6 +147,10 @@ mlx5_device_disable_sriov(struct mlx5_core_dev *dev, int num_vfs, bool clear_vf) mlx5_eswitch_disable_sriov(dev->priv.eswitch, clear_vf); + /* For ECPFs, skip waiting for host VF pages until ECPF is destroyed */ + if (mlx5_core_is_ecpf(dev)) + return; + if (mlx5_wait_for_pages(dev, &dev->priv.page_counters[MLX5_VF])) mlx5_core_warn(dev, "timeout reclaiming VFs pages\n"); }