Patch "net/mlx5e: Wrap the tx reporter dump callback to extract the sq" has been added to the 5.15-stable tree

This is a note to let you know that I've just added the patch titled

    net/mlx5e: Wrap the tx reporter dump callback to extract the sq

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     net-mlx5e-wrap-the-tx-reporter-dump-callback-to-extr.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit e09777ed4c1e36dadf6cd8ea95a2f4f1e4612f3e
Author: Amir Tzin <amirtz@xxxxxxxxxx>
Date:   Tue Nov 30 16:05:44 2021 +0200

    net/mlx5e: Wrap the tx reporter dump callback to extract the sq
    
    [ Upstream commit 918fc3855a6507a200e9cf22c20be852c0982687 ]
    
    Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
    mlx5e_txqsq *, but in the TX-timeout-recovery flow the argument is
    actually of type struct mlx5e_tx_timeout_ctx *.
    
     mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
     mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
     BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
     kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
     CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
     Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
     Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
     RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 [mlx5_core]
     Call Trace:
     mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
     devlink_health_do_dump.part.91+0x71/0xd0
     devlink_health_report+0x157/0x1b0
     mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
     ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0 [mlx5_core]
     ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
     ? update_load_avg+0x19b/0x550
     ? set_next_entity+0x72/0x80
     ? pick_next_task_fair+0x227/0x340
     ? finish_task_switch+0xa2/0x280
       mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
       process_one_work+0x1de/0x3a0
       worker_thread+0x2d/0x3c0
     ? process_one_work+0x3a0/0x3a0
       kthread+0x115/0x130
     ? kthread_park+0x90/0x90
       ret_from_fork+0x1f/0x30
     ---[ end trace 51ccabea504edaff ]---
     RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
     PKRU: 55555554
     Kernel panic - not syncing: Fatal exception
     Kernel Offset: disabled
     end Kernel panic - not syncing: Fatal exception
    
    To fix this bug, add a wrapper for mlx5e_tx_reporter_dump_sq() that
    extracts the sq from struct mlx5e_tx_timeout_ctx, and set the wrapper
    as the TX-timeout-recovery flow dump callback.
    
    Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter")
    Signed-off-by: Aya Levin <ayal@xxxxxxxxxx>
    Signed-off-by: Amir Tzin <amirtz@xxxxxxxxxx>
    Signed-off-by: Saeed Mahameed <saeedm@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
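
The wrapper pattern described in the commit message can be illustrated with a
minimal user-space sketch. Everything here is a simplified stand-in and not
the driver's code: the *_model types, dump_cb, report() and
dump_via_timeout() are hypothetical names invented for the illustration, and
the struct layouts are assumptions. The point is only that a callback taking
void * must be paired with a context of the type it expects, so the
timeout path registers a wrapper that unwraps the context first.

```c
#include <stddef.h>

/* Simplified stand-ins for the driver types (assumed layouts). */
struct sq_model {
	unsigned int sqn;
};

struct timeout_ctx_model {
	struct sq_model *sq;
	int status;
};

/* Generic dump callback: the context type is erased to void *. */
typedef int (*dump_cb)(void *ctx, unsigned int *out_sqn);

struct err_ctx_model {
	dump_cb dump;
	void *ctx;
};

/* Expects ctx to point at a struct sq_model. */
static int dump_sq(void *ctx, unsigned int *out_sqn)
{
	struct sq_model *sq = ctx;

	*out_sqn = sq->sqn;
	return 0;
}

/* Wrapper: ctx is a timeout context, so extract the sq before dumping. */
static int timeout_dump(void *ctx, unsigned int *out_sqn)
{
	struct timeout_ctx_model *to_ctx = ctx;

	return dump_sq(to_ctx->sq, out_sqn);
}

/* Stand-in for the health-report path that invokes the callback. */
static int report(struct err_ctx_model *err_ctx, unsigned int *out_sqn)
{
	return err_ctx->dump(err_ctx->ctx, out_sqn);
}

/* Timeout flow: the context is a timeout_ctx_model, so use the wrapper. */
static unsigned int dump_via_timeout(unsigned int sqn)
{
	struct sq_model sq = { .sqn = sqn };
	struct timeout_ctx_model to_ctx = { .sq = &sq, .status = 0 };
	struct err_ctx_model err_ctx = { .dump = timeout_dump, .ctx = &to_ctx };
	unsigned int out = 0;

	report(&err_ctx, &out);
	return out;
}
```

Registering dump_sq directly in dump_via_timeout() would reinterpret the
timeout context as an sq, which is the kind of type confusion the commit
message describes; the wrapper keeps the callback signature unchanged while
restoring the correct type.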

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index bb682fd751c98..8024599994642 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -463,6 +463,14 @@ static int mlx5e_tx_reporter_dump_sq(struct mlx5e_priv *priv, struct devlink_fms
 	return mlx5e_health_fmsg_named_obj_nest_end(fmsg);
 }
 
+static int mlx5e_tx_reporter_timeout_dump(struct mlx5e_priv *priv, struct devlink_fmsg *fmsg,
+					  void *ctx)
+{
+	struct mlx5e_tx_timeout_ctx *to_ctx = ctx;
+
+	return mlx5e_tx_reporter_dump_sq(priv, fmsg, to_ctx->sq);
+}
+
 static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv,
 					  struct devlink_fmsg *fmsg)
 {
@@ -558,7 +566,7 @@ int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq)
 	to_ctx.sq = sq;
 	err_ctx.ctx = &to_ctx;
 	err_ctx.recover = mlx5e_tx_reporter_timeout_recover;
-	err_ctx.dump = mlx5e_tx_reporter_dump_sq;
+	err_ctx.dump = mlx5e_tx_reporter_timeout_dump;
 	snprintf(err_str, sizeof(err_str),
 		 "TX timeout on queue: %d, SQ: 0x%x, CQ: 0x%x, SQ Cons: 0x%x SQ Prod: 0x%x, usecs since last trans: %u",
 		 sq->ch_ix, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,



