[Bug 219010] New: [REGRESSION][VFIO] kernel 6.9.7 causing qemu crash because of "Collect hot-reset devices to local buffer"

https://bugzilla.kernel.org/show_bug.cgi?id=219010

            Bug ID: 219010
           Summary: [REGRESSION][VFIO] kernel 6.9.7 causing qemu crash
                    because of "Collect hot-reset devices to local buffer"
           Product: Virtualization
           Version: unspecified
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
          Reporter: zaltys@xxxxxxxxx
        Regression: No

One of my virtual machines using PCI device passthrough (vfio) has stopped
working on openSUSE Tumbleweed since kernel 6.9.7. Qemu 9.0.1 complains:

qemu-system-x86_64: vfio: hot reset info failed: No space left on device
qemu-system-x86_64: GLib: ../glib/gmem.c:177: failed to allocate
18446744068411217972 bytes

and then dumps core. The qemu backtrace shows vfio_pci_get_pci_hot_reset_info()
as the last qemu function called.
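
A note on the size in the GLib message: it sits just below 2^64, which is what
a negative signed value looks like once it is converted to an unsigned 64-bit
size. The sketch below only does that arithmetic. That qemu derived the size
from an invalid device count is an assumption here, not something confirmed
from the qemu source, and the helper name as_signed64 is hypothetical.

```python
# Reinterpret the allocation size from the GLib message as a signed 64-bit
# integer. Illustration only: it says nothing about where the value came from.

REPORTED_BYTES = 18446744068411217972  # copied from the GLib error above


def as_signed64(value: int) -> int:
    """Return value reinterpreted as a two's-complement signed 64-bit integer."""
    value &= (1 << 64) - 1          # keep the low 64 bits
    if value >= (1 << 63):          # top bit set means negative
        return value - (1 << 64)
    return value


print(as_signed64(REPORTED_BYTES))  # -5298333644
```

Read this way, the request was for about minus 5.3 billion bytes, so the size
qemu passed to the allocator looks like it was computed from a bogus value.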

Reverting kernel 6.9.7 commit 9313244c26f3792daa86f3a18cc3bd5ad60310e0
(upstream f6944d4a0b87c16bc34ae589169e1ded3d4db08e) - "vfio/pci: Collect
hot-reset devices to local buffer" fixes the problem. As I understand it, that
commit was backported to 6.9.7 from the 6.10 tree.

After a more thorough analysis I pinpointed that the crash is caused by one
specific passed-through device: the sound card of the Asus B650 Creator
motherboard. The VM starts on 6.9.7 if I remove this sound card from it. I
think the important bit is that this card is a VF of a device which does not
report support for FLR:

15:00.0 | iommu group 28 | Phoenix PCIe Dummy Function <-- not passed to VM, no driver, reset method: pm bus
15:00.2 | iommu group 29 | Encryption controller (PSP/CCP) <-- ccp driver
15:00.3 | iommu group 30 | USB controller <-- xhci_hcd driver
15:00.4 | iommu group 31 | USB controller <-- xhci_hcd driver
15:00.6 | iommu group 32 | HD Audio Controller <-- sound card passed to VM
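
A listing like the one above can be gathered from sysfs. The following is a
minimal sketch, assuming the per-device sysfs entries iommu_group (a symlink)
and reset_method (present on recent kernels only when the device supports some
reset method). The function name list_functions and its parameters are
hypothetical, and the slot address 0000:15:00 is taken from the qemu message
quoted in this report.

```python
import os


def list_functions(slot="0000:15:00", sysfs_root="/sys/bus/pci/devices"):
    """Return (address, iommu_group, reset_methods) for each function in a slot."""
    if not os.path.isdir(sysfs_root):
        return []  # no PCI sysfs here (for example, not running on Linux)
    rows = []
    for addr in sorted(os.listdir(sysfs_root)):
        if not addr.startswith(slot + "."):
            continue
        dev = os.path.join(sysfs_root, addr)
        # iommu_group is a symlink whose target directory name is the group id.
        link = os.path.join(dev, "iommu_group")
        group = os.path.basename(os.path.realpath(link)) if os.path.exists(link) else None
        # reset_method holds a space-separated list such as "pm bus".
        try:
            with open(os.path.join(dev, "reset_method")) as f:
                methods = f.read().split()
        except OSError:
            methods = []  # attribute missing or unreadable
        rows.append((addr, group, methods))
    return rows


if __name__ == "__main__":
    for addr, group, methods in list_functions():
        print(addr, "| iommu group", group, "| reset methods:", " ".join(methods) or "none")
```

Since every function here is in its own iommu group, a bus reset of one
function involves groups that the VM does not own, which matches the "depends
on group 28" message qemu prints on older kernels.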

After reverting the above-mentioned commit, qemu complains:

vfio: Cannot reset device 0000:15:00.6, depends on group 28 which is not owned

exactly as it did before 6.9.7, and the VM starts with that sound card passed
through.

This might be an unsupported configuration, but qemu crashing on 6.9.7 also
suggests that the kernel might be breaking userspace by handling (or
mishandling) this case differently, especially within a minor version update.
