RE: [PATCH 1/2] ice: Fix race conditions between virtchnl handling and VF ndo ops

> -----Original Message-----
> From: Greg KH <greg@xxxxxxxxx>
> Sent: Saturday, March 05, 2022 5:40 AM
> To: Keller, Jacob E <jacob.e.keller@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 1/2] ice: Fix race conditions between virtchnl handling and VF ndo ops
> 
> On Mon, Feb 28, 2022 at 12:46:59PM -0800, Jacob Keller wrote:
> > From: Brett Creeley <brett.creeley@xxxxxxxxx>
> >
> > commit e6ba5273d4ede03d075d7a116b8edad1f6115f4d upstream.
> >
> > [I had to fix the cherry-pick manually as the patch added a line around
> > some context that was missing.]
> >
> > The VF can be configured via the PF's ndo ops at the same time the PF is
> > receiving/handling virtchnl messages. This has many issues, with
> > one of them being the ndo op could be actively resetting a VF (i.e.
> > resetting it to the default state and deleting/re-adding the VF's VSI)
> > while a virtchnl message is being handled. The following error was seen
> > because a VF ndo op was used to change a VF's trust setting while the
> > VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing:
> >
> > [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
> > [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
> > [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
> >
> > Fix this by making sure the virtchnl handling and VF ndo ops that
> > trigger VF resets cannot run concurrently. This is done by adding a
> > struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex
> > will be locked around the critical operations and VFR. Since the ndo ops
> > will trigger a VFR, the virtchnl thread will use mutex_trylock(). This
> > is done because if any other thread (i.e. VF ndo op) has the mutex, then
> > that means the current VF message being handled is no longer valid, so
> > just ignore it.
> >
> > This issue can be seen using the following commands:
> >
> > for i in {0..50}; do
> >         rmmod ice
> >         modprobe ice
> >
> >         sleep 1
> >
> >         echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
> >         echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
> >
> >         ip link set ens785f1 vf 0 trust on
> >         ip link set ens785f0 vf 0 trust on
> >
> >         sleep 2
> >
> >         echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
> >         echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
> >         sleep 1
> >         echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
> >         echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
> >
> >         ip link set ens785f1 vf 0 trust on
> >         ip link set ens785f0 vf 0 trust on
> > done
> >
> > Fixes: 7c710869d64e ("ice: Add handlers for VF netdevice operations")
> > Cc: <stable@xxxxxxxxxxxxxxx> # 5.14.x
> > Signed-off-by: Brett Creeley <brett.creeley@xxxxxxxxx>
> > Tested-by: Konrad Jankowski <konrad0.jankowski@xxxxxxxxx>
> > Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx>
> > Signed-off-by: Jacob Keller <jacob.e.keller@xxxxxxxxx>
> > ---
> > This should apply to 5.14.x
> 
> 5.14 is long end-of-life, always look at the kernel.org page if you are
> curious what the "active" kernel trees are.
> 
> thanks,
> 
> greg k-h

My mistake. I apologize for the wasted time here.

Thanks,
Jake
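
For reference, the locking scheme described in the quoted commit message can be
illustrated with a minimal userspace sketch: the configuration path takes the
per-VF lock with a blocking acquire, while the message path only tries the lock
and drops the message if it is busy. This is Python, not the ice driver code.
Only the name cfg_lock and the trylock idea come from the commit message; the
VF class, ndo_set_trust() and handle_virtchnl_msg() are hypothetical names
chosen for illustration.

```python
import threading


class VF:
    """Hypothetical stand-in for a per-VF structure; only cfg_lock mirrors the patch."""

    def __init__(self):
        self.cfg_lock = threading.Lock()
        self.trusted = False
        self.handled = []  # opcodes the message handler actually processed


def ndo_set_trust(vf, trusted):
    """Configuration path: block until the lock is free, then change state and 'reset'."""
    with vf.cfg_lock:
        vf.trusted = trusted
        vf.handled.clear()  # stands in for the VF reset discarding prior state


def handle_virtchnl_msg(vf, opcode):
    """Message path: try the lock once; if a config op holds it, drop the message."""
    if not vf.cfg_lock.acquire(blocking=False):
        return False  # a reset is in progress, so this message is treated as stale
    try:
        vf.handled.append(opcode)
        return True
    finally:
        vf.cfg_lock.release()
```

In this sketch, a message that arrives while the lock is held is rejected
rather than queued, which matches the "just ignore it" behaviour the commit
message describes for messages that race with a VF reset.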



