On Mon, Nov 20, 2023 at 05:38:01PM +0200, Vlad Buslov wrote:
On Fri 03 Nov 2023 at 23:03, Sasha Levin <sashal@xxxxxxxxxx> wrote:
This is a note to let you know that I've just added the patch titled
net/mlx5: Bridge, fix peer entry ageing in LAG mode
to the 6.5-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
The filename of the patch is:
net-mlx5-bridge-fix-peer-entry-ageing-in-lag-mode.patch
and it can be found in the queue-6.5 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.
commit 5fcb4295eb48140cffce968caf6fcd754f1696e9
Author: Vlad Buslov <vladbu@xxxxxxxxxx>
Date: Wed Aug 9 11:10:57 2023 +0200
net/mlx5: Bridge, fix peer entry ageing in LAG mode
[ Upstream commit 7a3ce8074878a68a75ceacec93d9ae05906eec86 ]
With current implementation in single FDB LAG mode all packets are
processed by eswitch 0 rules. As such, 'peer' FDB entries receive the
packets for rules of other eswitches and are responsible for updating the
main entry by sending SWITCHDEV_FDB_ADD_TO_BRIDGE notification from their
background update wq task. However, this introduces a race condition when
non-zero eswitch instance decides to delete a FDB entry, sends
SWITCHDEV_FDB_DEL_TO_BRIDGE notification, but another eswitch's update task
refreshes the same entry concurrently while its async delete work is still
pending on the workqueue. In such case another SWITCHDEV_FDB_ADD_TO_BRIDGE
event may be generated and entry will remain stuck in FDB marked as
'offloaded' since no more SWITCHDEV_FDB_DEL_TO_BRIDGE notifications are
sent for deleting the peer entries.
Fix the issue by synchronously marking deleted entries with
MLX5_ESW_BRIDGE_FLAG_DELETED flag and skipping them in background update
job.
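
To illustrate the idea behind the fix, here is a minimal user-space sketch.
It is not the mlx5 driver code: the struct, the flag values and both function
names are hypothetical, and only the notion of a "deleted" flag that the
background update job checks comes from the commit message above. It is
single-threaded, so it shows the skip logic only; real code would also need
whatever locking or ordering the driver uses between the delete path and the
update work.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the driver's real flag is MLX5_ESW_BRIDGE_FLAG_DELETED. */
#define ENTRY_FLAG_PEER    (1u << 0)
#define ENTRY_FLAG_DELETED (1u << 1)

/* Hypothetical stand-in for a bridge FDB entry. */
struct fdb_entry {
	unsigned int flags;
	unsigned long lastuse; /* last time traffic was seen on the entry */
	bool used;             /* stand-in for a hardware counter query */
};

/*
 * Called synchronously on the delete path, before any asynchronous
 * cleanup work is queued, so later update passes can see the mark.
 */
static void fdb_entry_mark_deleted(struct fdb_entry *e)
{
	e->flags |= ENTRY_FLAG_DELETED;
}

/*
 * Background update pass. Returns how many entries were refreshed,
 * i.e. how many "add to bridge" notifications would have been sent.
 * Entries already marked as deleted are skipped and never refreshed.
 */
static size_t fdb_update_pass(struct fdb_entry *entries, size_t n,
			      unsigned long now)
{
	size_t refreshed = 0;

	for (size_t i = 0; i < n; i++) {
		struct fdb_entry *e = &entries[i];

		if (e->flags & ENTRY_FLAG_DELETED)
			continue; /* delete is pending: do not resurrect */

		if (e->used) {
			e->lastuse = now;
			refreshed++;
		}
	}
	return refreshed;
}
```

With two in-use peer entries, marking one with fdb_entry_mark_deleted() before
calling fdb_update_pass() means only the other entry is refreshed, so no new
add notification can be generated for the entry whose deletion is pending.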
Signed-off-by: Vlad Buslov <vladbu@xxxxxxxxxx>
Reviewed-by: Jianbo Liu <jianbol@xxxxxxxxxx>
Signed-off-by: Saeed Mahameed <saeedm@xxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
Hi Sasha,
Could you also take this to 5.15 and 6.1?
Happily, but I see a build error when cherry picked on top of those
trees. Please send a backport and we'll add it in.
--
Thanks,
Sasha