From: Alaa Hleihel <alaa@xxxxxxxxxxxx> Prior to reloading a device we must first verify that it was not already removed. Otherwise, the attempt to remove the device will do nothing, and in that case we will end up proceeding with adding an new device that no one was expecting to remove, leaving behind used resources such as EQs that causes a failure to destroy comp EQs and syndrome (0x30f433). Fix that by making sure that we try to remove and add a device (based on a protocol) only if the device is already added. Fixes: c5447c70594b ("net/mlx5: E-Switch, Reload IB interface when switching devlink modes") Signed-off-by: Alaa Hleihel <alaa@xxxxxxxxxxxx> Signed-off-by: Saeed Mahameed <saeedm@xxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/net/ethernet/mellanox/mlx5/core/dev.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) --- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c @@ -342,11 +342,32 @@ void mlx5_unregister_interface(struct ml } EXPORT_SYMBOL(mlx5_unregister_interface); +/* Must be called with intf_mutex held */ +static bool mlx5_has_added_dev_by_protocol(struct mlx5_core_dev *mdev, int protocol) +{ + struct mlx5_device_context *dev_ctx; + struct mlx5_interface *intf; + bool found = false; + + list_for_each_entry(intf, &intf_list, list) { + if (intf->protocol == protocol) { + dev_ctx = mlx5_get_device(intf, &mdev->priv); + if (dev_ctx && test_bit(MLX5_INTERFACE_ADDED, &dev_ctx->state)) + found = true; + break; + } + } + + return found; +} + void mlx5_reload_interface(struct mlx5_core_dev *mdev, int protocol) { mutex_lock(&mlx5_intf_mutex); - mlx5_remove_dev_by_protocol(mdev, protocol); - mlx5_add_dev_by_protocol(mdev, protocol); + if (mlx5_has_added_dev_by_protocol(mdev, protocol)) { + mlx5_remove_dev_by_protocol(mdev, protocol); + mlx5_add_dev_by_protocol(mdev, protocol); + } mutex_unlock(&mlx5_intf_mutex); }