On 1/6/2025 11:14 AM, Chenguang Zhao wrote:
The cmd_work_handler function returns from the child function
cmd_alloc_index because the allocate command entry fails,
Before returning, there is no complete ent->slotted.
nit : s/Before/before
The patch fixes it.
Trace:
mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to
allocate command entry
INFO: task kworker/13:2:4055883 blocked for more than 120 seconds.
Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
kworker/13:2 D 0 4055883 2 0x00000228
Workqueue: events mlx5e_tx_dim_work [mlx5_core]
Call trace:
__switch_to+0xe8/0x150
__schedule+0x2a8/0x9b8
schedule+0x2c/0x88
schedule_timeout+0x204/0x478
wait_for_common+0x154/0x250
wait_for_completion+0x28/0x38
cmd_exec+0x7a0/0xa00 [mlx5_core]
mlx5_cmd_exec+0x54/0x80 [mlx5_core]
mlx5_core_modify_cq+0x6c/0x80 [mlx5_core]
mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core]
mlx5e_tx_dim_work+0x54/0x68 [mlx5_core]
process_one_work+0x1b0/0x448
worker_thread+0x54/0x468
kthread+0x134/0x138
ret_from_fork+0x10/0x18
Signed-off-by: Chenguang Zhao <zhaochenguang@xxxxxxxxxx>
Thanks for your fix!
Please add the following fixes tag:
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command
queue semaphore")
Reviewed-by: Moshe Shemesh <moshe@xxxxxxxxxx>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 6bd8a18e3af3..e733b81e18a2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1013,6 +1013,7 @@ static void cmd_work_handler(struct work_struct *work)
complete(&ent->done);
}
up(&cmd->vars.sem);
+ complete(&ent->slotted);
return;
}
} else {