> -----Original Message----- > From: Saeed Mahameed <saeed@xxxxxxxxxx> > Sent: Friday, December 9, 2022 11:28 AM > To: Ertman, David M <david.m.ertman@xxxxxxxxx> > Cc: Nguyen, Anthony L <anthony.l.nguyen@xxxxxxxxx>; > davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; > edumazet@xxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Saleem, Shiraz > <shiraz.saleem@xxxxxxxxx>; Ismail, Mustafa <mustafa.ismail@xxxxxxxxx>; > jgg@xxxxxxxxxx; leonro@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; G, > GurucharanX <gurucharanx.g@xxxxxxxxx> > Subject: Re: [PATCH net 2/4] ice: Correctly handle aux device when num > channels change > > On 09 Dec 17:21, Ertman, David M wrote: > >> -----Original Message----- > >> From: Saeed Mahameed <saeed@xxxxxxxxxx> > >> Sent: Wednesday, December 7, 2022 2:26 PM > >> To: Nguyen, Anthony L <anthony.l.nguyen@xxxxxxxxx> > >> Cc: davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; > >> edumazet@xxxxxxxxxx; Ertman, David M <david.m.ertman@xxxxxxxxx>; > >> netdev@xxxxxxxxxxxxxxx; Saleem, Shiraz <shiraz.saleem@xxxxxxxxx>; > Ismail, > >> Mustafa <mustafa.ismail@xxxxxxxxx>; jgg@xxxxxxxxxx; > leonro@xxxxxxxxxx; > >> linux-rdma@xxxxxxxxxxxxxxx; G, GurucharanX > <gurucharanx.g@xxxxxxxxx> > >> Subject: Re: [PATCH net 2/4] ice: Correctly handle aux device when num > >> channels change > >> > >> On 07 Dec 13:10, Tony Nguyen wrote: > >> >From: Dave Ertman <david.m.ertman@xxxxxxxxx> > >> > > >> >When the number of channels/queues changes on an interface, it is > >> necessary > >> >to change how those resources are distributed to the auxiliary device for > >> >maintaining RDMA functionality. To do this, the best way is to unplug, > and > >> > >> Can you please explain how an ethtool can affect RDMA functionality ? > >> don't you have full bifurcation between the two eth and rdma interfaces > .. > >> > >This patch is to address a bug where the number of queues for the > interface was > >changed and the RDMA lost functionality due to queues being re-assigned. > > > > sounds like an rdma or PF bug. > > >The PF is managing and setting aside resources for the RDMA aux dev. Then > the > >RDMA aux driver will request resources from the PF driver. Changes in > >the total number of resources make it so that resources previously > >allocated to RDMA aux driver may not be available any more. A re- > allocation > >is necessary to ensure that RDMA has all of the queues that it thinks it does. > > > > IMO it's wrong to re-initialize a parallel subsystems due to an ethtool, > ethtool is meant to control the netdev interface, not rdma. Either > statically allocate these resources on boot or just store them in a free > pool in PF until next time rdma reloads by an rdma user command outside of > netdev. > > >> >then re-plug the auxiliary device. This will cause all current resource > >> >allocation to be released, and then re-requested under the new state. > >> > > >> > >> I find this really disruptive, changing number of netdev queues to cause > >> full aux devs restart ! > >> > > > >Changing the number of queues available to the interface *is* a disruptive > action. > > yes to the netdev, not to rdma or usb, or the pci bus, you get my point. > > >The netdev and VSI have to be re-configured for queues per TC and the > RDMA aux > >driver has to re-allocate qsets to attach queue-pairs to. > > > > sure for netdev, but it doesn't make sense to reload rdma, if rdma was > using X queues, it should continue using X queues.. if you can't support > dynamic netdev changes without disturbing rdma, then block ethtool until > user unloads rdma. > > >> >Since the set_channel command from ethtool comes in while holding > the > >> RTNL > >> >lock, it is necessary to offset the plugging and unplugging of auxiliary > >> >device to another context. For this purpose, set the flags for UNPLUG > and > >> >PLUG in the PF state, then respond to them in the service task. > >> > > >> >Also, since the auxiliary device will be unplugged/plugged at the end of > >> >the flow, it is better to not send the event for TCs changing in the > >> >middle of the flow. This will prevent a timing issue between the events > >> >and the probe/release calls conflicting. > >> > > >> >Fixes: 348048e724a0 ("ice: Implement iidc operations") > >> >Signed-off-by: Dave Ertman <david.m.ertman@xxxxxxxxx> > >> >Tested-by: Gurucharan G <gurucharanx.g@xxxxxxxxx> (A Contingent > >> worker at Intel) > >> >Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx> > >> >--- > >> > drivers/net/ethernet/intel/ice/ice.h | 2 ++ > >> > drivers/net/ethernet/intel/ice/ice_ethtool.c | 6 ++++++ > >> > drivers/net/ethernet/intel/ice/ice_idc.c | 3 +++ > >> > drivers/net/ethernet/intel/ice/ice_main.c | 3 +++ > >> > 4 files changed, 14 insertions(+) > >> > > >> >diff --git a/drivers/net/ethernet/intel/ice/ice.h > >> b/drivers/net/ethernet/intel/ice/ice.h > >> >index 001500afc4a6..092e572768fe 100644 > >> >--- a/drivers/net/ethernet/intel/ice/ice.h > >> >+++ b/drivers/net/ethernet/intel/ice/ice.h > >> >@@ -281,6 +281,7 @@ enum ice_pf_state { > >> > ICE_FLTR_OVERFLOW_PROMISC, > >> > ICE_VF_DIS, > >> > ICE_CFG_BUSY, > >> >+ ICE_SET_CHANNELS, > >> > ICE_SERVICE_SCHED, > >> > ICE_SERVICE_DIS, > >> > ICE_FD_FLUSH_REQ, > >> >@@ -485,6 +486,7 @@ enum ice_pf_flags { > >> > ICE_FLAG_VF_VLAN_PRUNING, > >> > ICE_FLAG_LINK_LENIENT_MODE_ENA, > >> > ICE_FLAG_PLUG_AUX_DEV, > >> >+ ICE_FLAG_UNPLUG_AUX_DEV, > >> > ICE_FLAG_MTU_CHANGED, > >> > ICE_FLAG_GNSS, /* GNSS successfully > >> initialized */ > >> > ICE_PF_FLAGS_NBITS /* must be last */ > >> >diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c > >> b/drivers/net/ethernet/intel/ice/ice_ethtool.c > >> >index b7be84bbe72d..37e174a19860 100644 > >> >--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c > >> >+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c > >> >@@ -3536,6 +3536,8 @@ static int ice_set_channels(struct net_device > >> *dev, struct ethtool_channels *ch) > >> > return -EINVAL; > >> > } > >> > > >> >+ set_bit(ICE_SET_CHANNELS, pf->state); > >> >+ > >> > ice_vsi_recfg_qs(vsi, new_rx, new_tx); > >> > > >> > if (!netif_is_rxfh_configured(dev)) > >> >@@ -3543,6 +3545,10 @@ static int ice_set_channels(struct net_device > >> *dev, struct ethtool_channels *ch) > >> > > >> > /* Update rss_size due to change in Rx queues */ > >> > vsi->rss_size = ice_get_valid_rss_size(&pf->hw, new_rx); > >> >+ clear_bit(ICE_SET_CHANNELS, pf->state); > >> >+ > >> > >> you just set this new state a few lines ago, clearing the bit in the same > >> function few lines later seems to be an abuse of the pf state machine, > >> couldn't you just pass a parameter to the functions which needed this > >> information ? > >> > > > >How is this abusing the PF state machine? There is a 3 deep function call > that needs > >the information that this is a set_channel context, and each of those > functions is called > >from several locations - how is changing all of those functions to include a > parameter > > this is exactly the abuse i was talking about, setting a flag on a global > state field because the function call is too deep. > > >(that will be false for all of them but this instance) be less abusive than > setting and > >clearing a bit? > > this is not future sustainable and not reviewer friendly, it's a local > parameter and shouldn't be a global flag. > > Anyway this is your driver, i am not going to force you to do something you > don't like. > > but for the reloading of rdma due to an ethtool is a no for me.. > let's find a common place for all vendors to express such limitations, > e.g. devlink .. The set_channels function will cause the VSI controlling the pool of resources that RDMA uses to be re-allocated ad reconfigured. So, based on this, I will submit a new patch to prevent set_channels when RDMA is probed as you suggested. Please abandon this patch, new one will be submitted. DaveE > > > > >> >+ set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags); > >> >+ set_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags); > >> > > >> > return 0; > >> > } > >