On Thu, 9 May 2019 18:26:59 +0200
Pierre Morel <pmorel@xxxxxxxxxxxxx> wrote:

> On 09/05/2019 11:06, Cornelia Huck wrote:
> > [vfio-ap folks: find a question regarding removal further down]
> >
> > On Wed, 8 May 2019 22:06:48 +0000
> > Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >
> >>> -----Original Message-----
> >>> From: Cornelia Huck <cohuck@xxxxxxxxxx>
> >>> Sent: Wednesday, May 8, 2019 12:10 PM
> >>> To: Parav Pandit <parav@xxxxxxxxxxxx>
> >>> Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >>> kwankhede@xxxxxxxxxx; alex.williamson@xxxxxxxxxx; cjia@xxxxxxxxxx
> >>> Subject: Re: [PATCHv2 08/10] vfio/mdev: Improve the create/remove
> >>> sequence
> >>>
> >>> On Tue, 30 Apr 2019 17:49:35 -0500
> >>> Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >>>
>
> ...snip...
>
> >>>> @@ -373,16 +330,15 @@ int mdev_device_remove(struct device *dev,
> >>> bool force_remove)
> >>>>  	mutex_unlock(&mdev_list_lock);
> >>>>
> >>>>  	type = to_mdev_type(mdev->type_kobj);
> >>>> +	mdev_remove_sysfs_files(dev, type);
> >>>> +	device_del(&mdev->dev);
> >>>>  	parent = mdev->parent;
> >>>> +	ret = parent->ops->remove(mdev);
> >>>> +	if (ret)
> >>>> +		dev_err(&mdev->dev, "Remove failed: err=%d\n", ret);
> >>>
> >>> I think carrying on with removal regardless of the return code of the
> >>> ->remove callback makes sense, as it simply matches usual practice.
> >>> However, are we sure that every vendor driver works well with that? I think
> >>> it should, as removal from bus unregistration (vs. from the sysfs
> >>> file) was always something it could not veto, but have you looked at the
> >>> individual drivers?
> >>>
> >> I looked at following drivers a little while back.
> >> Looked again now.
> >>
> >> drivers/gpu/drm/i915/gvt/kvmgt.c which clears the handle valid in intel_vgpu_release(), which should finish first before remove() is invoked.
> >>
> >> s390 vfio_ccw_mdev_remove() driver drivers/s390/cio/vfio_ccw_ops.c remove() always returns 0.
> >> s390 crypto fails the remove() once vfio_ap_mdev_release marks kvm null, which should finish before remove() is invoked.
> >
> > That one is giving me a bit of a headache (the ->kvm reference is
> > supposed to keep us from detaching while a vm is running), so let's cc:
> > the vfio-ap maintainers to see whether they have any concerns.
> >
>
> We are aware of this race and we did correct this in the IRQ patches for
> which it would have become a real issue.
> We now increment/decrement the KVM reference counter inside open and
> release.
> Should be right after this.
>

Tony, what is your take on this? I don't have the bandwidth to think
this through properly, but my intuition tells me: this might be more
complicated than what Pierre's response suggests.

Regards,
Halil