Re: [PATCH 10/10] vfio/qat: Add vfio_pci driver for Intel QAT VF devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 09, 2024 at 08:23:32AM +0000, Zeng, Xin wrote:
> Thanks for your comments, Jason.
> On Tuesday, February 6, 2024 8:55 PM, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> > > +
> > > +	ops = mdev->ops;
> > > +	if (!ops || !ops->init || !ops->cleanup ||
> > > +	    !ops->open || !ops->close ||
> > > +	    !ops->save_state || !ops->load_state ||
> > > +	    !ops->suspend || !ops->resume) {
> > > +		ret = -EIO;
> > > +		dev_err(&parent->dev, "Incomplete device migration ops
> > structure!");
> > > +		goto err_destroy;
> > > +	}
> > 
> > Why are there ops pointers here? I would expect this should just be
> > direct function calls to the PF QAT driver.
> 
> I indeed had a version where the direct function calls are Implemented in
> QAT driver, while when I look at the functions, most of them 
> only translate the interface to the ops pointer. That's why I put
> ops pointers directly into vfio variant driver.

But why is there an ops indirection at all? Are there more than one
ops?

> > > +static void qat_vf_pci_aer_reset_done(struct pci_dev *pdev)
> > > +{
> > > +	struct qat_vf_core_device *qat_vdev = qat_vf_drvdata(pdev);
> > > +
> > > +	if (!qat_vdev->core_device.vdev.mig_ops)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * As the higher VFIO layers are holding locks across reset and using
> > > +	 * those same locks with the mm_lock we need to prevent ABBA
> > deadlock
> > > +	 * with the state_mutex and mm_lock.
> > > +	 * In case the state_mutex was taken already we defer the cleanup work
> > > +	 * to the unlock flow of the other running context.
> > > +	 */
> > > +	spin_lock(&qat_vdev->reset_lock);
> > > +	qat_vdev->deferred_reset = true;
> > > +	if (!mutex_trylock(&qat_vdev->state_mutex)) {
> > > +		spin_unlock(&qat_vdev->reset_lock);
> > > +		return;
> > > +	}
> > > +	spin_unlock(&qat_vdev->reset_lock);
> > > +	qat_vf_state_mutex_unlock(qat_vdev);
> > > +}
> > 
> > Do you really need this? I thought this ugly thing was going to be a
> > uniquely mlx5 thing..
> 
> I think that's still required to make the migration state synchronized
> if the VF is reset by other VFIO emulation paths. Is it the case? 
> BTW, this implementation is not only in mlx5 driver, but also in other
> Vfio pci variant drivers such as hisilicon acc driver and pds
> driver.

It had to specifically do with the mm lock interaction that, I
thought, was going to be unique to the mlx driver. Otherwise you could
just directly hold the state_mutex here.

Yishai do you remember the exact trace for the mmlock entanglement?

Jason




[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]
  Powered by Linux