On Fri, 18 Jun 2021 10:43:07 -0400 Laine Stump <laine@xxxxxxxxxx> wrote: > On 6/16/21 4:15 PM, Daniel Henrique Barboza wrote: > > > > > > On 6/9/21 4:38 PM, Manish Mishra wrote: > >> Hi Everyone, > >> > >> We want to add extra options to device xml to skip reattach of pci > >> passthrough devices. Following is xml format for pci passthrough > >> devices added to domain as of now. > >> > >> <hostdev mode='subsystem' type='pci' managed='yes'> > >> > >> <source> > >> > >> <address domain='0x0000' bus='0x00' slot='0x1a' function='0x7'/> > >> > >> </source> > >> > >> </hostdev> > >> > >> When we pass managed=’yes’ flag through xml, libvirt takes > >> responsibility of detaching device on domain(guest VM) start and > >> reattaching on domain shutdown. We observed some issues where guest VM > >> shutdown may take long time, blocked for reattach operation on pci > >> passthrough device. As domain lock is held during this time it also > >> makes libvirt mostly inactive as it blocks even basic operations like > >> (virsh list). Reattaching of device to host can block due to reasons > >> like buggy driver or initialization of device itself can take long > >> time in some cases. > > > > I am more interested in hearing about the problem with this faulty buggy > > driver holding domain lock during device reattach and compromising 'virsh' > > operations, and see if there's something to do to mitigate that, instead > > of creating a XML workaround for a driver problem. > > > >> > >> We want to pass following extra options to resolve this: > >> > >> 1. *skipReAttach*(optional flag) > >> > >> In some cases we do not need to reattach device to host as it may be > >> reserved only for guests, with this flag we can skip reattach > >> operation on host. We do not want to modify managed flag to avoid > >> regression, so thinking of adding new optional flag. > >> > >> 2. *reAttachDriverName*(optional flag) > >> > >> Name of driver to which we want to attach instead of default, to avoid > >> reattaching to buggy driver. Currently libvirt asks host to auto > >> selects driver for device. > >> > >> Yes we can use managed=’no’ but in that case user has to take > >> responsibility of detaching device before starting domain which we do > >> not want. Please let us know your views on this. > > > > The case you mentioned above, "we do not need to reattach device to host > > as it may be reserved only for guests", is one of the most common uses > > we have for managed='no' AFAIK. The user/sysadm must detach the device > > from the host, but it's only one time. After that the device can remain > > detached from the host, and guests can use it freely as long as you > > don't reboot the host (or reattach the device back). This scenario > > you described fit the managed='no' mechanics fine IMO. > > > > If you want to automate the detach process, you can use a Libvirt QEMU > > hook (/etc/libvirt/hooks/qemu) to make the device detach when starting > > the domain, in case the device isn't already detached. Note that > > this has the same effect of the "skipReAttach" option you proposed. > > > > Making a design around faulty drivers isn't ideal. If the driver you're > > using starts to have problems with the detach operation as well, > > 'skipReAttach' > > will do you no good. You'll have to fall back to 'managed=no' to circumvent > > that. > > > > Even if we discard the motivation, I'm not sure about the utility of having > > more forms of PCI assignment management (e.g > > managed=yes|no|detach|reattach). > > managed=yes|no seems to cover most use cases where the device driver works > > properly. > > > > > > Laine, what do you think? > > > I have a vague memory of someone (may even have been me) proposing > something similar several years ago, and the idea was shot down. I don't > remember the exact context or naming, but the general idea was to have > something like managed='detach-only' in order to have the advantage of > all configuration being within libvirt, but eliminating the potential > bad behavior associated with repeated re-binding of devices to drivers. > I unfortunately also don't recall the reason the idea was nixed. Dan or > Alex - do either of you have any memory of this? Sounds very vaguely familiar, but not enough to remember the arguments. > As for myself, 1) I agree with Daniel's suggestion that it is important > to find the root cause of the long delay rather than just covering it up > with another obscure option that will need to be re-discovered by anyone > who encounters the problem, and 2) every new bit that we add in makes > the code more complex and so more prone to errors, and also makes > configuration more complex and also more prone to errors. So while new > options like this could be useful, they could also be a net loss (or > not, it's hard to know without actually doing it, but once it's done it > can't be "un-done" :-)) AFAIK, the generally recommendation for "devices used only for assignment" is to use driverctl to automatically attach these devices to vfio-pci on boot/attach and managed="no" in the xml. It might not be a bad idea if libvirt were to fork a thread for rebinding a device to the host, but xml workarounds don't seem well justified here. Thanks, Alex