On Mon, 2013-05-20 at 23:11 +0200, Knut Omang wrote:
> On Sun, 2013-05-19 at 22:15 -0600, Alex Williamson wrote:
> > On Sun, 2013-05-19 at 17:35 +0200, Knut Omang wrote:
> > > On Mon, 2013-05-13 at 16:23 -0600, Alex Williamson wrote:
> > > > On Mon, 2013-05-13 at 22:55 +0200, Knut Omang wrote:
> > > > > Hi all,
> > > > >
> > > > > Perfect timing from my perspective, thanks Alex!
> > > > >
> > > > > I spent the better part of the weekend testing your branches on a new system
> > > > > I just put together for this purpose, results below..
> > > > >
> > > > > On Fri, 2013-05-03 at 16:56 -0600, Alex Williamson wrote:
> > > > > ...
> > > > > > git://github.com/awilliam/linux-vfio.git vfio-vga-reset
> > > > > > git://github.com/awilliam/qemu-vfio.git vfio-vga-reset
> > > > >
> > > > > System setup:
> > > > >
> > > > > - Fedora 18 on
> > > > > - Gigabyte Z77X-UD5H motherboard
> > > > > - Intel Core i7 3770 (Ivy Bridge w/integrated graphics)
> > > > > - 2 discrete graphics cards:
> > > > >
> > > > > lspci | egrep 'VGA|Audio'
> > > > > 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor Graphics Controller (rev 09)
> > > > > 00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
> > > > > 01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Caicos [Radeon HD 6450]
> > > > > 01:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Caicos HDMI Audio [Radeon HD 6400 Series]
> > > > > 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Cape Verde PRO [Radeon HD 7700 Series]
> > > > > 02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> > > > >
> > > > > Short summary:
> > > > >
> > > > > - Once I got past a few time-consuming obstacles explained below
> > > > > - the graphics part of the graphics/hdmi audio passthrough seems to work perfectly
> > > > > on both discrete
> > > > > graphics cards
> > > > > (though so far only one at a time and with some minor issues, see below)
> > > > > - no success with the hdmi audio yet (ideas for further investigation appreciated!)
> > > > >
> > > > I've had hdmi audio working with an HD7850, but only in Windows (7) and
> > > > it was using legacy interrupts for some reason instead of MSI. I wonder
> > > > if Linux guests might work with snd_hda_intel.enable_msi=0. I'm not sure
> > > > what's wrong with MSI, but it seems to be new with the PCI bus reset
> > > > support.
> > > >
> > > In my first tries, Windows was just using a generic
> > > VGA driver, which still seems to work perfectly with reboots and everything
> > > and in full screen resolution (1920x1200).
> > > However, after installing the Catalyst AMD driver stack, upon boot
> > > Windows 7 now consistently gets a BSOD from the graphics driver
> > > with the message:
> > >
> > > "Attempt to reset the display driver and recover from timeout failed"
> > > - a picture of the BSOD screen attached.
> > >
> > I've seen that BSOD before, but I don't know how to reproduce it. It
> > seems like I haven't seen it with the PCI bus reset code. I'm running
> > version 13.1 of the catalyst driver, you?
> >
> I first tried with the install CD that came with the card - v.13-045
> then upgraded to the latest from AMD, catalyst v.13.4 which appears to
> be driver v.12.104 - similar behaviour for both. This was with a plain
> Windows 7 install from my SP1 DVD.
>
> With most recommended windows updates and the latest catalyst driver,
> the BSOD is gone but instead I see the initial VGA boot screen and the
> windows logo, then syncs but no display and then reboot into recovery
> mode.
> (If I try all updates, Windows seems never to be able to recover
> from the last reboot)
>
> I have tried without kvm and also with vnc or spice graphics in addition
> but in those cases it seems Windows is not able to allocate MMIO
> resources for both adapters so I haven't been able to test the catalyst
> driver as a secondary windows display.
>
> > > I attach the corresponding vfio log where I added some timing code to
> > > make it easier to see when the BSOD happens (with 2 seconds of silence
> > > in the log before the VM reboots; I believe this is at 09:28:32-34 in
> > > the log).
> >
> > Yep, looks like that's where windows starts the BSOD.
> >
> > > Similar behaviour both just after reboot/power cycle of the host and
> > > subsequent VM boot attempts.
> > >
> > > This is still with the HD7700 as passed through device, but after a
> > > motherboard firmware upgrade (to F14) which did not seem to affect the
> > > observed behaviour on Windows prior to Catalyst install or with Linux
> > > guest, neither did it fix the bug in selecting primary devices as I
> > > was hoping for.
> > >
> > > Let me know if you have ideas for further debugging this,
> > >
> > I don't have any great ideas since I don't know how to reproduce the
> > timeout. Double/triple check that you're using the correct
> > vfio-vga-reset branches in both qemu and kernel
> >
> > # grep VFIO_DEVICE_PCI_BUS_RESET qemu.git/hw/misc/vfio.c
> > # grep VFIO_DEVICE_PCI_BUS_RESET linux.git/drivers/vfio/pci/vfio_pci.c
>
> [Matches in both..]
> I do believe I have used the right branches all along.
>
> > I didn't see anything telling in your DMAR either. The system seems to
> > have just one DRHD that includes everything, so I'm not sure why you saw
> > any behavior change from igfx_off.
> > Thanks,
>
> After the firmware upgrade, I tried again with the integrated graphics
> enabled, this time with more success - I am now able to get a GUI Fedora
> console on the integrated graphics, but see some colorful artifacts
> there during the VGA startup on one of the Radeon cards, which go away
> with a toggle to another console and back.
>
> Seems I have slightly misled you with the DMAR table - sorry about that
> - the table I posted was with the igfx disabled; with the igfx enabled I
> see one more hardware unit dedicated to the igfx if I am able to
> interpret it right (attached)

I noticed this warning in the host log - I suppose it is unrelated but
thought I'd mention it just in case there is some side effect I do not
understand here:

[    0.538124] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[    0.538619] PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
[    0.538676] ------------[ cut here ]------------
[    0.538681] WARNING: at drivers/pci/search.c:46 pci_find_upstream_pcie_bridge+0x58/0x80()
[    0.538683] Hardware name: To be filled by O.E.M.
[    0.538685] Modules linked in:
[    0.538687] Pid: 1, comm: swapper/0 Not tainted 3.9.0+ #1
[    0.538689] Call Trace:
[    0.538694]  [<ffffffff8105ed2f>] warn_slowpath_common+0x7f/0xc0
[    0.538697]  [<ffffffff8105ed8a>] warn_slowpath_null+0x1a/0x20
[    0.538699]  [<ffffffff8132dc28>] pci_find_upstream_pcie_bridge+0x58/0x80
[    0.538703]  [<ffffffff8152e26b>] intel_iommu_add_device+0x4b/0x1f0
[    0.538706]  [<ffffffff81525b30>] ? bus_set_iommu+0x60/0x60
[    0.538708]  [<ffffffff81525b63>] add_iommu_group+0x33/0x60
[    0.538712]  [<ffffffff813f38fd>] bus_for_each_dev+0x5d/0xa0
[    0.538714]  [<ffffffff81525b1b>] bus_set_iommu+0x4b/0x60
[    0.538718]  [<ffffffff81d47d61>] intel_iommu_init+0xa72/0xb9a
[    0.538722]  [<ffffffff81d0db94>] ?
memblock_find_dma_reserve+0x13d/0x13d
[    0.538724]  [<ffffffff81d0dba7>] pci_iommu_init+0x13/0x3e
[    0.538727]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[    0.538730]  [<ffffffff81d0603b>] kernel_init_freeable+0x150/0x1df
[    0.538732]  [<ffffffff81d0588d>] ? do_early_param+0x8c/0x8c
[    0.538736]  [<ffffffff81646580>] ? rest_init+0x80/0x80
[    0.538738]  [<ffffffff8164658e>] kernel_init+0xe/0xf0
[    0.538742]  [<ffffffff8166af6c>] ret_from_fork+0x7c/0xb0
[    0.538744]  [<ffffffff81646580>] ? rest_init+0x80/0x80
[    0.538749] ---[ end trace f4e8b5168095f9c1 ]---

> Both the HD7700 and the HD6450 behave very similarly and both still start
> and display Windows fine if I disable the Catalyst driver.
>
> Knut
>
> > Alex
> >
> > > > > - Contrary to deniv@xxxxxxxxxxx I had no success with using pci-assign for VGA
> > > > > with a standard fedora 18 kernel and fairly recent qemu, nor with your branches,
> > > > >
> > > > > Details:
> > > > >
> > > > > - I started off with the required kernel parameter 'intel_iommu=on' + necessary parameters for disabling radeon
> > > > > (radeon.modeset=0 rd.driver.blacklist=radeon) using the integrated graphics as primary display
> > > > > - this caused the system to freeze (with color artifacts on the console)
> > > > >
> > > > > - In my naivety and because of the "i" in igfx I tried both with
> > > > > 'intel_iommu=igfx_off' and then 'intel_iommu=on,igfx_off'
> > > > > and a full set of combinations of vfio, cards, kernels and pci-assign before I suspected
> > > > > that iommu support was turned off for **all** graphics cards with igfx_off
> > > > >
> > > > I'm not sure why this is, looks like the code only tries to turn it off
> > > > when only graphics is under the remapping device. We'd probably need to
> > > > see the DMAR to know more (/sys/firmware/acpi/tables/DMAR).
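
For anyone wanting to check the "one DRHD that includes everything" question on their own box, here is a minimal Python sketch that lists the DRHD (remapping hardware unit) entries in a raw DMAR table and reports whether each has the INCLUDE_PCI_ALL flag set. The offsets are my reading of the DMAR layout in the VT-d specification, the function name parse_drhd_units is made up for illustration, and device scope entries are deliberately not decoded.

```python
import struct

DMAR_FIXED_LEN = 48   # 36-byte ACPI header + host address width, flags, 10 reserved bytes
DRHD_TYPE = 0         # remapping structure type 0 is a DRHD


def parse_drhd_units(table: bytes):
    """Return one dict per DRHD entry found in a raw DMAR table.

    Hypothetical helper for illustration; device scopes are skipped.
    """
    if table[:4] != b"DMAR":
        raise ValueError("not a DMAR table")
    (total_len,) = struct.unpack_from("<I", table, 4)
    end = min(total_len, len(table))
    units = []
    offset = DMAR_FIXED_LEN
    while offset + 4 <= end:
        entry_type, entry_len = struct.unpack_from("<HH", table, offset)
        if entry_len < 4:
            break  # malformed entry; stop rather than loop forever
        if entry_type == DRHD_TYPE and entry_len >= 16:
            # flags (1 byte), reserved (1), PCI segment (2), register base (8)
            flags, _reserved, segment, base = struct.unpack_from(
                "<BBHQ", table, offset + 4)
            units.append({
                "segment": segment,
                "register_base": base,
                "include_pci_all": bool(flags & 1),
            })
        offset += entry_len
    return units
```

Typical use would be to read /sys/firmware/acpi/tables/DMAR as root in binary mode and pass the bytes to parse_drhd_units(). A unit without INCLUDE_PCI_ALL covers only the devices named in its device scope, which is what one would expect for a unit dedicated to the igfx.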
> > > >
> > > > > - The solution was to have integrated graphics turned off in the BIOS, and 'intel_iommu=on':
> > > > >
> > > > > - iommu groups:
> > > > >
> > > > > ls -l /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices
> > > > > total 0
> > > > > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:00:01.0 -> ../../../../devices/pci0000:00/0000:00:01.0
> > > > > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:00:01.1 -> ../../../../devices/pci0000:00/0000:00:01.1
> > > > > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:01:00.0 -> ../../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
> > > > > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:01:00.1 -> ../../../../devices/pci0000:00/0000:00:01.0/0000:01:00.1
> > > > > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:02:00.0 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.0
> > > > > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:02:00.1 -> ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.1
> > > > >
> > > > > - i.e. both the VGA/HDMI Audio pairs + the two root ports they are plugged into are in the same group:
> > > > >
> > > > Ick. Intel has been pretty good about advertising ACS support on their
> > > > root ports. I wonder if this is an oversight or if they are actually
> > > > not isolated from each other.
> > > >
> > > > > # lspci -n
> > > > > ...
> > > > > 01:00.0 0300: 1002:683f
> > > > > 01:00.1 0403: 1002:aab0
> > > > > 02:00.0 0300: 1002:6779
> > > > > 02:00.1 0403: 1002:aa98
> > > > > ...
> > > > >
> > > > > modprobe vfio_pci
> > > > > echo 0000:01:00.1 > /sys/bus/pci/devices/0000\:01\:00.1/driver/unbind
> > > > > echo 0000:02:00.1 > /sys/bus/pci/devices/0000\:02\:00.1/driver/unbind
> > > > > echo 1002 683f > /sys/bus/pci/drivers/vfio-pci/new_id
> > > > > echo 1002 aab0 > /sys/bus/pci/drivers/vfio-pci/new_id
> > > > > echo 1002 6779 > /sys/bus/pci/drivers/vfio-pci/new_id
> > > > > echo 1002 aa98 > /sys/bus/pci/drivers/vfio-pci/new_id
> > > > >
> > > > > # lsusb
> > > > > ...
> > > > > Bus 001 Device 008: ID 046d:c315 Logitech, Inc. Classic New Touch Keyboard
> > > > > Bus 001 Device 004: ID 046d:c05b Logitech, Inc. M-U0004 810-001317 [B110 Optical USB Mouse]
> > > > > ...
> > > > >
> > > > > - I also applied your suggested patch to the quirk function in VFIO (see below)
> > > > >
> > > > > - Here is a (trimmed for readability) command line I successfully used to boot from the Windows 7 install DVD,
> > > > > notice the cd and disk device descriptions and the bus parameter - I struggled a while with that
> > > > > until I came across a comment by Gerd Hoffmann here: https://bugzilla.redhat.com/show_bug.cgi?id=922670 (Thanks, Gerd!)
> > > > >
> > > > >
> > > > > qemu-kvm -M q35 \
> > > > > -nodefconfig -readconfig $SRC/qemu/docs/q35-chipset.cfg \
> > > > > -device vfio-pci,host=2:00.0,x-vga=on,multifunction=on,bus=ich9-pcie-port-1,addr=0.0 \
> > > > > -device vfio-pci,host=2:00.1,bus=ich9-pcie-port-1,addr=0.1 \
> > > > > -L $SRC/seabios/out/ -L $SRC/qemu/pc-bios \
> > > > > -vga none -nographic -cpu host -rtc base=localtime -k no -m 8192 -smp 2 \
> > > > > -drive file=/dev/sr0,index=2,media=cdrom,id=cd \
> > > > > -drive file=ivm03.img,index=0,media=disk,id=ivm03 \
> > > > > -device ide-drive,drive=ivm03,bus=ide.0 \
> > > > > -device ide-cd,drive=cd,bus=ide.1 \
> > > > > -net nic,vlan=0,model=virtio -net tap,vlan=0 \
> > > > > -enable-kvm \
> > > > > -device usb-host,hostbus=1,hostaddr=8 \
> > > > > -device usb-host,hostbus=1,hostaddr=4
> > > > >
> > > > > - Both graphics cards seem to have a ROM, but only the HD6450 lends itself to "scraping".
> > > > >
> > > > Did you try scraping the HD6450 while the HD7700 was the boot VGA and
> > > > vice versa? The boot VGA ROM is handled in a special way and what you
> > > > really get is the shadow copy, which isn't what we want.
> > > >
> > > > > Anyway, supplying it to vfio did not seem to make any difference.
> > > > >
> > > > > find /sys -name rom
> > > > > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/rom
> > > > > /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/rom
> > > > > ...
> > > > >
> > > > > Some observations and remaining unresolved issues:
> > > > >
> > > > > - VFIO patch:
> > > > > Initially (while still running with igfx_off) I observed exactly the same behaviour as deniv@xxxxxxxxxxx
> > > > > reported a while ago: With vfio_pci debug enabled, vfio_pci ended up spinning with repeated calls to
> > > > > vfio_ati_3c3_quirk_read and repeated logs:
> > > > > vfio: vfio_vga_read(0x3c3, 1) = 0x0
> > > > > I patched up accordingly with
> > > > >
> > > > >
> > > > > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> > > > > index da0e5f9..a361d06 100644
> > > > > --- a/hw/misc/vfio.c
> > > > > +++ b/hw/misc/vfio.c
> > > > > @@ -1291,7 +1291,7 @@ static uint64_t vfio_ati_3c3_quirk_read(void *opaque,
> > > > >      uint64_t data = vfio_vga_read(&vdev->vga.region[QEMU_PCI_VGA_IO_HI],
> > > > >                                    addr + quirk->data.base_offset, size);
> > > > >
> > > > > -    if (data == quirk->data.address_match) {
> > > > > +    if (1 || data == quirk->data.address_match) {
> > > > >          data = vfio_pci_read_config(&vdev->pdev, quirk->data.address_val, size);
> > > > >          DPRINTF("%s(0x3c3, 1) = 0x%"PRIx64"\n", __func__, data);
> > > > >      }
> > > > >
> > > > >
> > > > > This of course did not help much until I actually got the iommu
> > > > > enabled for the radeons (similar "repeated patterns" as deniv reported)
> > > > > but what I have observed after I got it working is that if
> > > > > I disable the patch above, things do not go that well: the Fedora VM
> > > > > comes up with VGA and the Fedora boot screen, then goes blank when
> > > > > switching to X.
> > > > >
> > > > Hmm, I think we'd probably have better luck making that unconditional
> > > > until we have reason to do otherwise.
> > > >
> > > > > - The fact that the iommu group now extends across all my available graphics
> > > > > devices now makes it difficult to get the radeon (or catalyst) driver to use
> > > > > the other card since the vfio_pci driver needs to hold it.
> > > > > Not a complete showstopper since the vesa driver comes up with 1024x768..
> > > > > Might it be a good idea to have an override option (exception list or similar?)
> > > > > to allow the vfio_pci to be less restrictive about owning the whole group
> > > > > - allow functionality over security in such a case? This of course is further complicated
> > > > > by the need for graphics drivers to be disabled/enabled already at the kernel prompt..
> > > > >
> > > > We have a quirk in the kernel that enables us to whitelist devices, but
> > > > yes, there is no flexibility in this w/o modifying the code and
> > > > rebuilding. (see drivers/pci/quirks.c:pci_dev_acs_enabled and follow
> > > > the example above w/ pci_dev_dma_source - function can just return 1)
> > > >
> > > > > - There seems to be a bug in the (version F8) UEFI BIOS on the motherboard.
> > > > > The BIOS offers (undocumented) a full range of selections of which PCIe
> > > > > (or PCIe 1x) graphics card to use as primary, but any other selection
> > > > > than the first PCIe 16x slot has no effect and the motherboard reverts
> > > > > to the first slot, so to be able to test both cards, I had to put the card under test
> > > > > into the second (8x) PCIe slot. I am waiting for feedback from Gigabyte on possible
> > > > > fixes for this in newer BIOSes.
> > > > >
> > > > > - The ultimate goal is to try to consolidate some older Windows desktops as "seats"
> > > > > on the new system, using the discrete graphics with HDMI/DisplayPort audio.
> > > > > With the HD7700 moved to the second PCIe slot I tested both Windows and
> > > > > Linux guests to try to get some sound through the HDMI audio device.
> > > > > Windows complains that no usable device is available. On Linux (Fedora 18, KDE desktop),
> > > > > the system settings -> multimedia dialogue never opens up which seems to indicate that
> > > > > PulseAudio has problems communicating with the passed through device (?),
> > > > > any hints/pointers here appreciated. From the vfio log it seems at least
> > > > > config space is accessed ok.
> > > > >
> > > > > - There also seem to be issues with radeon and intel_iommu=on - if I try
> > > > > to enable modesetting and normal X support for the radeon cards, X fails to start.
> > > > >
> > > > > - It would be nice if the integrated graphics could be used as the host primary display -
> > > > > I would be happy if someone has any hints as to if/how the igfx_off option
> > > > > could be extended/modified to only affect iommu operation on selected device(s),
> > > > > if at all possible..
> > > > >
> > > > Let's see what we can discover from your DMAR. Also send along sudo
> > > > lspci -vvv. Thanks,
> > > >
> > > > Alex