Re: [bugzilla-daemon@xxxxxxxxxx: [Bug 219619] New: vfio-pci: screen graphics artifacts after 6.12 kernel upgrade]


On 31.12.24 02:27, Alex Williamson wrote:
On Mon, 30 Dec 2024 21:03:30 +0000
Precific <precification@xxxxxxxxx> wrote:

In my case, commenting out (1) the huge_fault callback assignment from
f9e54c3a2f5b suffices for GPU initialization in the guest, even if (2)
the 'install everything' loop is still removed.

I have uploaded host kernel logs with vfio-pci-core debugging enabled
(one log with stock sources, one large log with vfio-pci-core's
huge_fault handler patched out):
https://bugzilla.kernel.org/show_bug.cgi?id=219619#c1
I'm not sure if the logs of handled faults say much about what
specifically goes wrong here, though.

The dmesg portion attached to my mail is of a Linux guest failing to
initialize the GPU (BAR 0 size 16GB with 12GB of VRAM).

Thanks for the logs with debugging enabled.  Would you be able to
repeat the test with QEMU 9.2?  There's a patch in there that aligns
the mmaps, which should avoid mixing 1G and 2MB pages for huge faults.
With this you should only see order 18 mappings for BAR0.
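For readers mapping the "order" values in the logs to page sizes, here is
a minimal sketch of the arithmetic. It assumes 4 KiB base pages (as on
x86-64); SKETCH_PAGE_SHIFT, order_to_bytes() and order_to_pages() are
hypothetical names used only for illustration and are not part of
vfio-pci-core.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, i.e. a page shift of 12 (x86-64). */
#define SKETCH_PAGE_SHIFT 12

/* Bytes covered by one mapping of the given order (2^order base pages). */
static uint64_t order_to_bytes(unsigned int order)
{
	assert(order < 64 - SKETCH_PAGE_SHIFT);
	return (uint64_t)1 << (order + SKETCH_PAGE_SHIFT);
}

/* Number of base pages covered by one mapping of the given order. */
static uint64_t order_to_pages(unsigned int order)
{
	assert(order < 64);
	return (uint64_t)1 << order;
}
```

Under that assumption, order 18 corresponds to 0x40000 pages (1G), order 9
to 0x200 pages (2M), and order 0 to a single 4K page. A 16GB BAR that is
mapped purely with order 18 entries would then need 16 of them.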

Also, in a different direction, it would be interesting to run tests
disabling 1G huge pages and 2MB huge pages independently.  The
following would disable 1G pages:

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 1ab58da9f38a..dd3b748f9d33 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1684,7 +1684,7 @@ static vm_fault_t vfio_pci_mmap_huge_fault(struct vm_fault *vmf,
  							     PFN_DEV), false);
  		break;
  #endif
-#ifdef CONFIG_ARCH_SUPPORTS_PUD_PFNMAP
+#if 0
  	case PUD_ORDER:
  		ret = vmf_insert_pfn_pud(vmf, __pfn_to_pfn_t(pfn + pgoff,
  							     PFN_DEV), false);

This should disable 2M pages:

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 1ab58da9f38a..d7dd359e19bb 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1678,7 +1678,7 @@ static vm_fault_t vfio_pci_mmap_huge_fault(struct vm_fault *vmf,
  	case 0:
  		ret = vmf_insert_pfn(vma, vmf->address, pfn + pgoff);
  		break;
-#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
+#if 0
  	case PMD_ORDER:
  		ret = vmf_insert_pfn_pmd(vmf, __pfn_to_pfn_t(pfn + pgoff,
  							     PFN_DEV), false);

And applying both together should be functionally equivalent to
pre-v6.12.  Thanks,

Alex


Logs for QEMU 9.1.2 vs. 9.2.0, each with all huge page sizes enabled, 1G only, and 2M only: https://bugzilla.kernel.org/show_bug.cgi?id=219619#c3

You're right, I was still using QEMU 9.1.2. With 9.2.0, the passed-through GPU does indeed work fine with both Linux and Windows guests.

The huge_fault calls are aligned nicely with QEMU 9.2.0. Only the lower 16MB of BAR 0 sees repeated calls at 2M/4K page sizes, but with no misalignment. The QEMU 9.1.2 'stock' log shows misaligned 1G faults (order 18), e.g., huge_faulting 0x40000 pages at page offset 0 and later at page offset 0x4000. I'm not sure if that is a problem, or if the offsets are simply masked off to the correct alignment. QEMU 9.1.2 also works with 1G pages disabled. Perhaps coincidentally, the offsets are properly aligned for order 9 (0x200 'page offset' increments) from what I've seen.
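
To make the alignment observation concrete, here is a small sketch of the
check. pgoff_aligned() and pgoff_align_down() are hypothetical helpers
written for illustration; they operate on page offsets counted in base
pages and do not reflect what the kernel actually does with these offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a page offset (in base pages) is naturally aligned for 'order'. */
static bool pgoff_aligned(uint64_t pgoff, unsigned int order)
{
	assert(order < 64);
	uint64_t pages = (uint64_t)1 << order;

	return (pgoff & (pages - 1)) == 0;
}

/* The page offset rounded down to the natural alignment of 'order'. */
static uint64_t pgoff_align_down(uint64_t pgoff, unsigned int order)
{
	assert(order < 64);
	uint64_t pages = (uint64_t)1 << order;

	return pgoff & ~(pages - 1);
}
```

With these helpers, page offset 0x4000 is not aligned for order 18
(0x40000 pages), but it is aligned for order 9 (0x200 pages). If the
offset were masked down for order 18, 0x4000 would become 0; whether such
masking happens in the fault path is exactly the open question above.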

Thanks,
Precific



