Re: Locking between vfio hot-remove and pci sysfs sriov_numvfs

On Fri, 8 Dec 2023 14:01:57 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Fri, Dec 08, 2023 at 05:59:17PM +0000, Jim Harris wrote:
> > On Fri, Dec 08, 2023 at 01:41:09PM -0400, Jason Gunthorpe wrote:  
> > > On Fri, Dec 08, 2023 at 05:38:51PM +0000, Jim Harris wrote:  
> > > > On Thu, Dec 07, 2023 at 07:48:10PM -0400, Jason Gunthorpe wrote:  
> > > > > 
> > > > > The mechanism of waiting in remove for userspace is inherently flawed,
> > > > > it can never work fully correctly. :( I've hit this many times.
> > > > > 
> > > > > Upon remove VFIO should immediately remove itself and leave behind a
> > > > > non-functional file descriptor. Userspace should catch up eventually
> > > > > and see it is toast.  
> > > > 
> > > > One nice aspect of the current design is that vfio will leave the BARs
> > > > mapped until userspace releases the vfio handle. It avoids some rather
> > > > nasty hacks for handling SIGBUS errors in the fast path (i.e. writing
> > > > NVMe doorbells) where we cannot try to check for device removal on
> > > > every MMIO write. Would your proposal immediately yank the BARs, without
> > > > waiting for userspace to respond? This is mostly for my curiosity - SPDK
> > > > already has these hacks implemented, so I don't think it would be
> > > > affected by this kind of change in behavior.  
> > > 
> > > What we did in RDMA was map a dummy page to the BARs so the sigbus was
> > > avoided. But in that case RDMA knows the BAR memory is used only for
> > > doorbell write so this is a reasonable thing to do.  
> > 
> > Yeah, this is exactly what SPDK (and DPDK) does today.  
> 
> To be clear, I mean we did it in the kernel.
> 
> When the device driver is removed we zap all the VMAs and install a
> fault handler that installs the dummy page instead of SIGBUS
> 
> The application doesn't do anything, and this is how SPDK already will
> be supporting device hot unplug of the RDMA drivers.

But I think you can only do that in the kernel because you know the
device uses those pages only for doorbells; it's not a general-purpose
solution, right?

Perhaps a variant driver could do something similar for NVMe device
doorbell pages, but a device-agnostic driver like vfio-pci would need
to SIGBUS on access or else we risk significant data integrity issues.
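
For illustration only, here is a minimal userspace sketch of the
dummy-page idea discussed in this thread. It is not SPDK, DPDK, RDMA or
vfio code. As assumptions: a temporary file stands in for a BAR mapping,
truncating the file stands in for hot-remove, and the names
install_sigbus_handler() and demo_dummy_page_on_sigbus() are
hypothetical. Calling mmap() from a signal handler is not guaranteed to
be async-signal-safe by POSIX, so treat that as a Linux-specific
shortcut.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;
static volatile sig_atomic_t sigbus_count;

/* On SIGBUS, map an anonymous "dummy" page over the faulting page so the
 * interrupted access can be retried and will land harmlessly. */
static void sigbus_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    void *p = mmap((void *)addr, (size_t)page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        _exit(1);
    sigbus_count++;
}

/* Hypothetical helper: register the handler above for SIGBUS. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}

/* Hypothetical demo: returns 0 if a write after the simulated removal was
 * absorbed by the dummy page instead of killing the process. */
int demo_dummy_page_on_sigbus(void)
{
    char path[] = "/tmp/fake_bar_XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    if (install_sigbus_handler() != 0 || ftruncate(fd, page_size) != 0) {
        close(fd);
        return -1;
    }

    /* Stand-in for an mmap()ed doorbell BAR. */
    volatile uint32_t *doorbell = mmap(NULL, (size_t)page_size,
                                       PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0);
    if (doorbell == MAP_FAILED) {
        close(fd);
        return -1;
    }

    doorbell[0] = 1;              /* normal write, backing still present */
    if (ftruncate(fd, 0) != 0) {  /* simulated removal: backing goes away */
        munmap((void *)doorbell, (size_t)page_size);
        close(fd);
        return -1;
    }
    doorbell[0] = 2;              /* raises SIGBUS, handler remaps, retried */

    int ok = (sigbus_count == 1 && doorbell[0] == 2);
    munmap((void *)doorbell, (size_t)page_size);
    close(fd);
    return ok ? 0 : -1;
}
```

Note that the second write is silently swallowed by the dummy page.
That is harmless for a write-only doorbell, but for arbitrary device
memory it would hide the removal from the application, which is the
data integrity concern raised above.
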
Thanks,

Alex