On Mon, Feb 05, 2024 at 05:09:09PM -0800, Rahul Rameshbabu wrote:
> On Tue, 06 Feb, 2024 01:03:11 +0000 Joe Damato <jdamato@xxxxxxxxxx> wrote:
> > Make mlx5 compatible with the newly added netlink queue GET APIs.
> >
> > Signed-off-by: Joe Damato <jdamato@xxxxxxxxxx>
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/en.h      | 1 +
> >  drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 8 ++++++++
> >  2 files changed, 9 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > index 55c6ace0acd5..3f86ee1831a8 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
> > @@ -768,6 +768,7 @@ struct mlx5e_channel {
> >  	u16 qos_sqs_size;
> >  	u8 num_tc;
> >  	u8 lag_port;
> > +	unsigned int irq;
> >
> >  	/* XDP_REDIRECT */
> >  	struct mlx5e_xdpsq xdpsq;
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > index c8e8f512803e..e1bfff1fb328 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> > @@ -2473,6 +2473,9 @@ static void mlx5e_close_queues(struct mlx5e_channel *c)
> >  	mlx5e_close_tx_cqs(c);
> >  	mlx5e_close_cq(&c->icosq.cq);
> >  	mlx5e_close_cq(&c->async_icosq.cq);
> > +
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, NULL);
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, NULL);
>
> This should be set to NULL *before* actually closing the rqs, sqs, and
> related cqs right? I would expect these two lines to be the first ones
> called in mlx5e_close_queues. Btw, I think this should be done in
> mlx5e_deactivate_channel where the NAPI is disabled.
>
> >  }
> >
> >  static u8 mlx5e_enumerate_lag_port(struct mlx5_core_dev *mdev, int ix)
> > @@ -2558,6 +2561,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
> >  	c->stats = &priv->channel_stats[ix]->ch;
> >  	c->aff_mask = irq_get_effective_affinity_mask(irq);
> >  	c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix);
> > +	c->irq = irq;
> >
> >  	netif_napi_add(netdev, &c->napi, mlx5e_napi_poll);
> >
> > @@ -2602,6 +2606,10 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
> >  		mlx5e_activate_xsk(c);
> >  	else
> >  		mlx5e_activate_rq(&c->rq);
> > +
> > +	netif_napi_set_irq(&c->napi, c->irq);
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_TX, &c->napi);
> > +	netif_queue_set_napi(c->netdev, c->ix, NETDEV_QUEUE_TYPE_RX, &c->napi);
>
> It's weird that netlink queue API is being configured in
> mlx5e_activate_channel and deconfigured in mlx5e_close_queues. This
> leads to a problem where the napi will be falsely referred to even when
> we deactivate the channels in mlx5e_switch_priv_channels and may not
> necessarily get to closing the channels due to an error.
>
> Typically, we use the following clean up patterns.
>
>   mlx5e_activate_channel -> mlx5e_deactivate_channel
>   mlx5e_open_queues -> mlx5e_close_queues

OK, I'll move it to mlx5e_deactivate_channel before the NAPI is disabled.
That makes sense to me.
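
To illustrate the ordering being agreed on, here is a minimal userspace
sketch, not driver code. Every model_* name below is a hypothetical
stand-in invented for this example; the pointer fields only imitate the
queue-to-NAPI association that netif_queue_set_napi() records, and the
boolean only imitates the NAPI enabled state. The point it tries to show
is that linking in activate and unlinking in deactivate (before the NAPI
is disabled) leaves no window where a queue refers to a disabled NAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel or mlx5 definitions. */
struct model_napi {
	bool enabled;
};

struct model_channel {
	struct model_napi napi;
	struct model_napi *rx_napi;	/* models the RX queue -> NAPI link */
	struct model_napi *tx_napi;	/* models the TX queue -> NAPI link */
};

/* Activate: enable the NAPI, then associate both queues with it. */
static void model_activate_channel(struct model_channel *c)
{
	c->napi.enabled = true;
	c->tx_napi = &c->napi;
	c->rx_napi = &c->napi;
}

/*
 * Deactivate: mirror of activate. The links are cleared first, and only
 * then is the NAPI disabled, so the links never outlive the enabled NAPI.
 */
static void model_deactivate_channel(struct model_channel *c)
{
	c->tx_napi = NULL;
	c->rx_napi = NULL;
	c->napi.enabled = false;
}

/* True if a queue still refers to a NAPI that has been disabled. */
static bool model_has_stale_link(const struct model_channel *c)
{
	return !c->napi.enabled && (c->rx_napi != NULL || c->tx_napi != NULL);
}
```

With this pairing, a caller that deactivates a channel and then bails out
on an error before reaching any close step still leaves the model with no
stale link, which is the situation raised for mlx5e_switch_priv_channels.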