Patch "net/mlx5: Fix missing lock on sync reset reload" has been added to the 6.6-stable tree

This is a note to let you know that I've just added the patch titled

    net/mlx5: Fix missing lock on sync reset reload

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     net-mlx5-fix-missing-lock-on-sync-reset-reload.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit ce87eab7bd0e07dee18ff45626a01b934c3c1362
Author: Moshe Shemesh <moshe@xxxxxxxxxx>
Date:   Tue Jul 30 09:16:34 2024 +0300

    net/mlx5: Fix missing lock on sync reset reload
    
    [ Upstream commit 572f9caa9e7295f8c8822e4122c7ae8f1c412ff9 ]
    
    On sync reset reload work, when remote host updates devlink on reload
    actions performed on that host, it misses taking devlink lock before
    calling devlink_remote_reload_actions_performed() which results in
    triggering lock assert like the following:
    
    WARNING: CPU: 4 PID: 1164 at net/devlink/core.c:261 devl_assert_locked+0x3e/0x50
    …
     CPU: 4 PID: 1164 Comm: kworker/u96:6 Tainted: G S      W          6.10.0-rc2+ #116
     Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0 12/18/2015
     Workqueue: mlx5_fw_reset_events mlx5_sync_reset_reload_work [mlx5_core]
     RIP: 0010:devl_assert_locked+0x3e/0x50
    …
     Call Trace:
      <TASK>
      ? __warn+0xa4/0x210
      ? devl_assert_locked+0x3e/0x50
      ? report_bug+0x160/0x280
      ? handle_bug+0x3f/0x80
      ? exc_invalid_op+0x17/0x40
      ? asm_exc_invalid_op+0x1a/0x20
      ? devl_assert_locked+0x3e/0x50
      devlink_notify+0x88/0x2b0
      ? mlx5_attach_device+0x20c/0x230 [mlx5_core]
      ? __pfx_devlink_notify+0x10/0x10
      ? process_one_work+0x4b6/0xbb0
      process_one_work+0x4b6/0xbb0
    […]
    
    Fixes: 84a433a40d0e ("net/mlx5: Lock mlx5 devlink reload callbacks")
    Signed-off-by: Moshe Shemesh <moshe@xxxxxxxxxx>
    Reviewed-by: Maor Gottlieb <maorg@xxxxxxxxxx>
    Signed-off-by: Tariq Toukan <tariqt@xxxxxxxxxx>
    Reviewed-by: Wojciech Drewek <wojciech.drewek@xxxxxxxxx>
    Link: https://patch.msgid.link/20240730061638.1831002-6-tariqt@xxxxxxxxxx
    Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 3a9cdf79403ae..6b17346aa4cef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -206,6 +206,7 @@ int mlx5_fw_reset_set_live_patch(struct mlx5_core_dev *dev)
 static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unloaded)
 {
 	struct mlx5_fw_reset *fw_reset = dev->priv.fw_reset;
+	struct devlink *devlink = priv_to_devlink(dev);
 
 	/* if this is the driver that initiated the fw reset, devlink completed the reload */
 	if (test_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags)) {
@@ -217,9 +218,11 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev, bool unload
 			mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
 		else
 			mlx5_load_one(dev, true);
-		devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
+		devl_lock(devlink);
+		devlink_remote_reload_actions_performed(devlink, 0,
 							BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
 							BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
+		devl_unlock(devlink);
 	}
 }
 
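For readers unfamiliar with the pattern the patch restores, here is a minimal user-space sketch of a callee that asserts its caller holds a lock. It is not the mlx5 or devlink code. Every demo_* name is hypothetical, a pthread mutex stands in for the devlink instance lock, and a plain flag stands in for the kernel's lock assertion, so it is only meaningful single-threaded.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a devlink instance (not the kernel struct). */
struct demo_devlink {
	pthread_mutex_t lock;
	bool locked;                  /* true while the lock is held (single-threaded demo only) */
	int warnings;                 /* number of failed lock assertions */
	unsigned long remote_actions; /* bitmask of actions reported so far */
};

static void demo_lock(struct demo_devlink *dl)
{
	pthread_mutex_lock(&dl->lock);
	dl->locked = true;
}

static void demo_unlock(struct demo_devlink *dl)
{
	dl->locked = false;
	pthread_mutex_unlock(&dl->lock);
}

/* Assumed analogue of a lock assertion: warn and count instead of crashing. */
static void demo_assert_locked(struct demo_devlink *dl)
{
	if (!dl->locked) {
		dl->warnings++;
		fprintf(stderr, "WARNING: demo_assert_locked: lock not held\n");
	}
}

/* The callee expects the caller to already hold the lock. */
static void demo_remote_actions_performed(struct demo_devlink *dl,
					  unsigned long actions)
{
	demo_assert_locked(dl);
	dl->remote_actions |= actions;
}

/*
 * Run the reporting step with or without taking the lock first and
 * return how many lock-assert warnings were triggered.
 */
static int demo_run(bool take_lock)
{
	struct demo_devlink dl = { .locked = false };
	int warnings;

	pthread_mutex_init(&dl.lock, NULL);

	if (take_lock)
		demo_lock(&dl);
	demo_remote_actions_performed(&dl, 1UL << 0 | 1UL << 1);
	if (take_lock)
		demo_unlock(&dl);

	warnings = dl.warnings;
	pthread_mutex_destroy(&dl.lock);
	return warnings;
}
```

Calling demo_run(false) mirrors the pre-patch path, where the callee's assertion fires. Calling demo_run(true) mirrors the patched path, where the call is bracketed by lock and unlock and no warning is raised.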



