This is a note to let you know that I've just added the patch titled

    mlxsw: core: Use variable timeout for EMAD retries

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     mlxsw-core-use-variable-timeout-for-emad-retries.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From foo@baz Sun Nov 22 12:12:04 PM CET 2020
From: Ido Schimmel <idosch@xxxxxxxxxx>
Date: Tue, 17 Nov 2020 19:33:52 +0200
Subject: mlxsw: core: Use variable timeout for EMAD retries

From: Ido Schimmel <idosch@xxxxxxxxxx>

[ Upstream commit 1f492eab67bced119a0ac7db75ef2047e29a30c6 ]

The driver sends Ethernet Management Datagram (EMAD) packets to the
device for configuration purposes and waits for up to 200ms for a reply.
A request is retried up to 5 times.

When the system is under heavy load, replies are not always processed in
time and EMAD transactions fail.

Make the process more robust to such delays by using exponential
backoff. First wait for up to 200ms, then retransmit and wait for up to
400ms and so on.

Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Reported-by: Denis Yulevich <denisyu@xxxxxxxxxx>
Tested-by: Denis Yulevich <denisyu@xxxxxxxxxx>
Signed-off-by: Ido Schimmel <idosch@xxxxxxxxxx>
Reviewed-by: Jiri Pirko <jiri@xxxxxxxxxx>
Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -436,7 +436,7 @@ static void mlxsw_emad_trans_timeout_sch
 {
 	unsigned long timeout = msecs_to_jiffies(MLXSW_EMAD_TIMEOUT_MS);
 
-	mlxsw_core_schedule_dw(&trans->timeout_dw, timeout);
+	mlxsw_core_schedule_dw(&trans->timeout_dw, timeout << trans->retries);
 }
 
 static int mlxsw_emad_transmit(struct mlxsw_core *mlxsw_core,


Patches currently in stable-queue which might be from idosch@xxxxxxxxxx are

queue-4.9/mlxsw-core-use-variable-timeout-for-emad-retries.patch
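
For readers who want to see what `timeout << trans->retries` works out to, here is a minimal userspace sketch of the backoff schedule. It is not driver code. The names `EMAD_TIMEOUT_MS`, `EMAD_MAX_RETRY`, `emad_timeout_ms()` and `emad_worst_case_ms()` are made up for illustration. The 200ms base and the limit of 5 retries come from the commit message. The sketch assumes the retry counter is 0 for the initial transmission and that one wait follows each transmission. It works in milliseconds, whereas the patch shifts a jiffies value, so real timeouts may differ slightly due to rounding.

```c
#include <assert.h>

/* Illustrative constants; values taken from the commit message. */
#define EMAD_TIMEOUT_MS 200u	/* base wait for a reply */
#define EMAD_MAX_RETRY  5u	/* a request is retried up to 5 times */

/*
 * Wait (in ms) for the reply to transmission number `retries`,
 * where 0 is the initial transmission. Mirrors the shift in the
 * patch: timeout << trans->retries.
 */
static unsigned long emad_timeout_ms(unsigned int retries)
{
	/* Guard against shifting beyond the assumed retry limit. */
	assert(retries <= EMAD_MAX_RETRY);
	return (unsigned long)EMAD_TIMEOUT_MS << retries;
}

/*
 * Total wait if the initial transmission and every retry time out
 * (assumption: one wait per transmission, retries counted 0..5).
 */
static unsigned long emad_worst_case_ms(void)
{
	unsigned long total = 0;
	unsigned int retries;

	for (retries = 0; retries <= EMAD_MAX_RETRY; retries++)
		total += emad_timeout_ms(retries);
	return total;
}
```

Under these assumptions the successive waits are 200, 400, 800, 1600, 3200 and 6400 ms, for a worst case of 12600 ms before the transaction fails, compared with 1200 ms when every wait is a fixed 200 ms.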