On Tue, Feb 14, 2023 at 10:24:04PM +0000, Ertman, David M wrote: > > -----Original Message----- > > From: Leon Romanovsky <leonro@xxxxxxxxxx> > > Sent: Wednesday, February 1, 2023 1:50 AM > > Subject: Re: [PATCH net 1/6] ice: avoid bonding causing auxiliary plug/unplug > > under RTNL lock > > > > On Tue, Jan 31, 2023 at 01:36:58PM -0800, Tony Nguyen wrote: > > > From: Dave Ertman <david.m.ertman@xxxxxxxxx> > > > > > > RDMA is not supported in ice on a PF that has been added to a bonded > > > interface. To enforce this, when an interface enters a bond, we unplug > > > the auxiliary device that supports RDMA functionality. This unplug > > > currently happens in the context of handling the netdev bonding event. > > > This event is sent to the ice driver under RTNL context. This is causing > > > a deadlock where the RDMA driver is waiting for the RTNL lock to complete > > > the removal. > > > > > > Defer the unplugging/re-plugging of the auxiliary device to the service > > > task so that it is not performed under the RTNL lock context. > > > > > > Reported-by: Jaroslav Pulchart <jaroslav.pulchart@xxxxxxxxxxxx> > > > Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb- > > 0487c95e395d@xxxxxxxxxxxxx/ > > > Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave") > > > Fixes: 4eace75e0853 ("RDMA/irdma: Report the correct link speed") > > > Signed-off-by: Dave Ertman <david.m.ertman@xxxxxxxxx> > > > Reviewed-by: Michal Swiatkowski <michal.swiatkowski@xxxxxxxxxxxxxxx> > > > Tested-by: Gurucharan G <gurucharanx.g@xxxxxxxxx> (A Contingent > > worker at Intel) > > > Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx> > > > --- > > > drivers/net/ethernet/intel/ice/ice.h | 14 +++++--------- > > > drivers/net/ethernet/intel/ice/ice_main.c | 17 +++++++---------- > > > 2 files changed, 12 insertions(+), 19 deletions(-) > > > > <...> > > > > > index 5f86e4111fa9..055494dbcce0 100644 > > > --- a/drivers/net/ethernet/intel/ice/ice_main.c > > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c > > > @@ -2290,18 +2290,15 @@ static void ice_service_task(struct work_struct > > *work) > > > } > > > } > > > > > > - if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) { > > > - /* Plug aux device per request */ > > > + /* Plug aux device per request */ > > > + if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) > > > > Very interesting pattern. You are not holding any locks while running > > ice_service_task() and clear bits before you actually performed requested > > operation. > > > > How do you protect from races while testing bits in other places of ice > > driver? > > Leon, > > Thanks for the review and sorry for the late reply, got sidetracked into another project. > > Your review caused us to re-evaluate the plug/unplug flow, and since these bits are only set/cleared in > the bonding event flow, and the UNPLUG bit set clears the PLUG bit, we attain the desired outcome > in all cases if we swap the order that we evaluate the bits in the service task. I afraid that it won't make ice state machine more understandable. :) Thanks > > Any multi-event situation that happens between or during service task will be handled in the expected way. > > DaveE > > > > > Thanks > > > > > ice_plug_aux_dev(pf); > > > > > > - /* Mark plugging as done but check whether unplug was > > > - * requested during ice_plug_aux_dev() call > > > - * (e.g. from ice_clear_rdma_cap()) and if so then > > > - * plug aux device. > > > - */ > > > - if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf- > > >flags)) > > > - ice_unplug_aux_dev(pf); > > > - } > > > + /* unplug aux dev per request, if an unplug request came in > > > + * while processing a plug request, this will handle it > > > + */ > > > + if (test_and_clear_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags)) > > > + ice_unplug_aux_dev(pf); > > > > > > if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) { > > > struct iidc_event *event; > > > -- > > > 2.38.1 > > >