> From: Wang, Zhi A <zhi.a.wang@xxxxxxxxx>
> Sent: Wednesday, October 19, 2022 5:41 PM
>
> On 10/6/22 18:31, Alex Williamson wrote:
> > On Thu, 6 Oct 2022 08:37:09 -0300
> > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> >> On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
> >>> We can't have a .remove callback that does nothing, this breaks
> >>> removing the device while it's in use. Once we have the
> >>> vfio_unregister_group_dev() fix below, we'll block until the device is
> >>> unused, at which point vgpu->attached becomes false. Unless I'm
> >>> missing something, I think we should also follow-up with a patch to
> >>> remove that bogus warn-on branch, right? Thanks,
> >>
> >> Yes, looks right to me.
> >>
> >> I question all the logic around attached, where is the locking?
> >
> > Zhenyu, Zhi, Kevin,
> >
> > Could someone please take a look at use of vgpu->attached in the GVT-g
> > driver? Its use in intel_vgpu_remove() is bogus, the .release
> > callback needs to use vfio_unregister_group_dev() to wait for the
> > device to be unused. The WARN_ON/return here breaks all future use of
> > the device. I assume @attached has something to do with the page table
> > interface with KVM, but it all looks racy anyway.
> >
> Thanks for pointing this out.
>
> It was introduced in the GVT-g refactor patch series and Christoph might
> not want to touch the vgpu->released while he needed a new state.
>
> I dug into it a bit. vgpu->attached would be used for preventing multiple
> opens on a single vGPU and to indicate that kvm_get_kvm() has been done.

vfio core already ensures that .open_device() is called only once:

vfio_device_open()
{
	...
	mutex_lock(&device->dev_set->lock);
	device->open_count++;
	if (device->open_count == 1) {
		...
		if (device->ops->open_device) {
			ret = device->ops->open_device(device);
	...
}

> vgpu->released was to prevent the release before close, which is now
> handled by the vfio_device_*.
>
> What I would like to do:
> 1) Remove the vgpu->released.
> 2) Use a lock to protect vgpu->attached.
>
> After those are solved, the WARN_ON/return in intel_vgpu_remove()
> can be safely removed, as the .release will be called after .close_device
> decreases the vfio_device->refcnt to zero.
>
> Thanks,
> Zhi.
>
> > Also, whatever purpose vgpu->released served looks unnecessary now.
> > Thanks,
> >
> > Alex
> >
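
For illustration only, here is a minimal userspace sketch of the open-count
gating quoted from vfio_device_open() above. It is not kernel code:
struct model_device, model_init(), model_open() and model_close() are
hypothetical names, and a pthread mutex stands in for dev_set->lock. It also
assumes the close path mirrors the open path, so that the driver's close
callback runs only when the last opener goes away.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_device {
	pthread_mutex_t lock;     /* stands in for device->dev_set->lock */
	unsigned int open_count;  /* number of current openers */
	int open_calls;           /* times the driver open callback ran */
	int close_calls;          /* times the driver close callback ran */
};

static void model_init(struct model_device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->open_count = 0;
	dev->open_calls = 0;
	dev->close_calls = 0;
}

static void model_open(struct model_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->open_count++;
	/* Only the first opener triggers the driver callback. */
	if (dev->open_count == 1)
		dev->open_calls++;
	pthread_mutex_unlock(&dev->lock);
}

static void model_close(struct model_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	assert(dev->open_count > 0);
	dev->open_count--;
	/* Assumed mirror of open: only the last closer triggers the callback. */
	if (dev->open_count == 0)
		dev->close_calls++;
	pthread_mutex_unlock(&dev->lock);
}
```

Because the counter is only ever read and modified under the lock, a
driver-private flag is not needed just to reject a second open. In this
model, two opens followed by two closes run each callback exactly once.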