On 10/28/16 13:28, Henning Schild wrote:
> Hey,
>
> i am running an unusual setup where i assign pci devices behind the
> back of libvirt. I have two options to do that:
> 1. a wrapper script for qemu that takes care of suid-root and appends
> arguments for pci-assign
> 2. virsh qemu-monitor-command ... 'device_add pci-assign...'
>
> I know i should probably not be doing this, it is a workaround to
> introduce fine-grained pci-assignment in an openstack setup, where
> vendor and device id are not enough to pick the right device for a vm.

(1) The libvirt domain XML identifies the host PCI device to assign by
full PCI address (see the <source> element:
<http://libvirt.org/formatdomain.html#elementsHostDev>); it does not
filter with vendor/device ID.

So, I believe your comment refers to the pci-stub host kernel driver not
being flexible enough for binding vs. not binding different instances of
the same vendor/device ID.

If that's the case, would you be helped by the following host kernel
patch?

[PATCH] PCI: pci-stub: accept exceptions to the ID- and class-based
matching
<http://www.spinics.net/lists/linux-pci/msg55497.html>

(2) Is there any reason (other than (1)) that you are using the legacy /
deprecated pci-assign method, rather than VFIO?

I suggest to evaluate whether the "pci-stub.except=..." kernel parameter
helped your use case, and if (consequently) you could move to a fully
libvirt + VFIO based config.

Thanks
Laszlo

>
> In both cases qemu will crash with the following output:
>
>> qemu: hardware error: pci read failed, ret = 0 errno = 22
>
> followed by the usual machine state dump. With strace i found it to be
> a failing read on the config space file of my device.
> /sys/bus/pci/devices/0000:xx:xx.x/config
> A few reads out of that file succeeded, as well as accesses on vendor
> etc.
>
> Manually launching a qemu with the pci-assign works without a problem,
> so i "blame" libvirt and the cgroup environment the qemu ends up in.
> So i put a bash into the exact same cgroup setup - next to a running
> qemu, expecting a dd or hexdump on the config-space file to fail. But
> from that bash i can read the file without a problem.
>
> Has anyone seen that problem before? Right now i do not know what i
> am missing, maybe qemu is hitting some limits configured for the
> cgroups or whatever. I can not use pci-assign from libvirt, but if i
> did would it configure cgroups in a different way or relax some limits?
>
> What would be a good next step to debug that? Right now i am looking at
> kernel event traces, but the machine is pretty big and so is the trace.
>
> That assignment used to work and i do not know how it broke, i have
> tried combinations of several kernels, versions of libvirt and qemu.
> (kernel 3.18 and 4.4, libvirt 1.3.2 and 2.0.0, and qemu 2.2.1 and 2.7)
> All combinations show the same problem, even the ones that work on
> other machines. So when it comes to software versions the problem could
> well be caused by a software update of another component, that i
> got with the package manager and did not compile myself. It is a debian
> 8.6 with all recent updates installed. My guess would be that systemd
> could have an influence on cgroups or limits causing such a problem.
>
> regards,
> Henning
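
As a possible aid for the "good next step to debug" question above, here
is a minimal Python sketch that walks a config-space file with small
positional reads and reports the offsets where a read comes back short
or raises an error. The helper name probe_config_reads, the 4-byte chunk
size and the 256-byte limit (the size of conventional PCI config space)
are my own assumptions for illustration; nothing here is taken from qemu
or libvirt. The quoted message reports "ret = 0", so the sketch records
short reads separately from reads that fail with an errno.

```python
import errno
import os


def probe_config_reads(path, chunk=4, limit=256):
    """Return the problematic reads found while walking `path`.

    The file is read with pread() in `chunk`-byte steps up to `limit`
    bytes. Each entry in the result is (offset, length, errno_name):
      - length is the number of bytes returned for a short read,
        or -1 if the read raised OSError;
      - errno_name is None for a short read, else e.g. "EINVAL".
    Reads that return exactly `chunk` bytes are not reported.
    """
    problems = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset in range(0, limit, chunk):
            try:
                data = os.pread(fd, chunk, offset)
            except OSError as exc:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                problems.append((offset, -1, name))
                continue
            if len(data) != chunk:
                # Short read (possibly 0 bytes): no errno is involved.
                problems.append((offset, len(data), None))
    finally:
        os.close(fd)
    return problems
```

Calling probe_config_reads("/sys/bus/pci/devices/0000:xx:xx.x/config")
once from the shell placed in the cgroup and once as the same user and
with the same privileges as the libvirt-started qemu might show whether
the two environments differ at a particular offset, which is something a
plain dd or hexdump run from a root shell would not necessarily reveal.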