Re: Need to scale down the time to release multiple vfio devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 6 Nov 2018 11:22:46 +0100
Christian Ehrhardt <christian.ehrhardt@xxxxxxxxxxxxx> wrote:

> [...]
> 
> >
> > One more try, lightly tested, hopefully the best of both worlds with a
> > mutex per bus/slot so that multiple buses/slots can be reset in
> > parallel, but the devices affected by those resets, per bus/slot, are
> > serialized.  Now my multi-function device gets a bus reset with the
> > parallel unit tests.  Thanks,
> >
> > Alex
> >
> > commit dc05ffc2f5a666a68b547f326e4c6061abf9f89a
> > Author: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Date:   Wed Oct 31 13:22:33 2018 -0600
> >
> >     vfio/pci: Parallelize device open and release  
> 
> Hi Alex,
> thanks for your thoughts and patches on this.
> I'm glad you were able to come up avoiding the regression for
> functions in the same group.
> I built and tested your second patch and it works fine with my initial case.
> 
> As with my RFC the threaded and concurrent process based close actions
> are now fast - giving userspace a chance to make it faster.
> Kill signal/Process exit still is serialized by the __fput cleanup
> from the work queue - as expected.
> Your patch is even slightly faster (about half a second) than my
> initial suggestion.

I'd guess that's because your patch only released the global lock
around reset, so large parts of the device release were still
serialized while now they run entirely in parallel.

> Further for some more light tests I ran some general vfio/passthrough
> tests which all still worked as they did before.
> 
> Tested-by: Christian Ehrhardt <christian.ehrhardt@xxxxxxxxxxxxx>

Thanks, I'll do more testing myself an formally post it with your T-b.
 
> You also added some debug code to track the retval of vfio_pci_try_bus_reset.
> Therefore I thought it might be worth to mention that in my case this
> reports for every vfio detach:
>    [  327.544291] vfio_pci_try_bus_reset: -22
> That would be EINVAL right?
> I didn't track down where that came from, please let me know if that
> would be important to you.

I think I understand it, your device is reset via
pci_try_reset_function() because it's a singleton device on the bus
(your patch only parallelized across this call) while
vfio_pci_try_bus_reset() is looking for cases where releasing a device
allows a bus reset to occur when there are other "dirty" devices on the
bus.  You have no such devices on the bus, so we exit the function with
the initial errno value.  It's not an error in your case, but for my
test case with multiple devices on the bus, it allows me to verify the
per-bus serialization and that the last device through does trigger a
bus reset.  Thanks,

Alex



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux