This is a note to let you know that I've just added the patch titled net/mlx5: Disable RoCE on the e-switch management port under switchdev mode to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: net-mlx5-disable-roce-on-the-e-switch-management-port-under-switchdev-mode.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From foo@baz Fri Aug 4 13:32:40 PDT 2017 From: Or Gerlitz <ogerlitz@xxxxxxxxxxxx> Date: Wed, 28 Dec 2016 14:58:31 +0200 Subject: net/mlx5: Disable RoCE on the e-switch management port under switchdev mode From: Or Gerlitz <ogerlitz@xxxxxxxxxxxx> [ Upstream commit 9da34cd34e85aacc55af8774b81b1f23e86014f9 ] Under the switchdev/offloads mode, packets that don't match any e-switch steering rule are sent towards the e-switch management port. We use a NIC HW steering rule set per vport (uplink and VFs) to make them be received into the host OS through the respective vport representor netdevice. Currnetly such missed RoCE packets will not get to this NIC steering rule, and hence VF RoCE will not work over the slow path of the offloads mode. This is b/c these packets will be matched by a steering rule added by the firmware that serves RoCE traffic set on the PF NIC vport which is also the e-switch management port under SRIOV. Disabling RoCE on the e-switch management vport when we are in the offloads mode, will signal to the firmware to remove their RoCE rule, and then the missed RoCE packets will be matched by the representor NIC steering rule as any other missed packets. To achieve that, we disable RoCE on the PF vport. We do that by removing (hot-unplugging) the IB device instance associated with the PF. This is also required by our current model where the PF serves as the uplink representor and hence only SW switching (TC, bridge, OVS) applications and slow path vport mlx5e net-device should be running over that vport. Fixes: c930a3ad7453 ('net/mlx5e: Add devlink based SRIOV mode changes') Signed-off-by: Or Gerlitz <ogerlitz@xxxxxxxxxxxx> Reviewed-by: Hadar Hen Zion <hadarh@xxxxxxxxxxxx> Signed-off-by: Saeed Mahameed <saeedm@xxxxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 11 +++++++++++ 1 file changed, 11 insertions(+) --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -672,6 +672,12 @@ int esw_offloads_init(struct mlx5_eswitc if (err) goto err_reps; } + + /* disable PF RoCE so missed packets don't go through RoCE steering */ + mlx5_dev_list_lock(); + mlx5_remove_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB); + mlx5_dev_list_unlock(); + return 0; err_reps: @@ -695,6 +701,11 @@ static int esw_offloads_stop(struct mlx5 { int err, err1, num_vfs = esw->dev->priv.sriov.num_vfs; + /* enable back PF RoCE */ + mlx5_dev_list_lock(); + mlx5_add_dev_by_protocol(esw->dev, MLX5_INTERFACE_PROTOCOL_IB); + mlx5_dev_list_unlock(); + mlx5_eswitch_disable_sriov(esw); err = mlx5_eswitch_enable_sriov(esw, num_vfs, SRIOV_LEGACY); if (err) { Patches currently in stable-queue which might be from ogerlitz@xxxxxxxxxxxx are queue-4.9/net-mlx5-disable-roce-on-the-e-switch-management-port-under-switchdev-mode.patch