Hi,

On 29.12.23 at 06:46, Steven Haigh wrote:
> On 29/12/23 00:18, Lukas Wunner wrote:
>> On Thu, Dec 28, 2023 at 01:03:10PM +1100, Steven Haigh wrote:
>>> At some point in kernel 6.6.x, SCSI hotplug in qemu VMs broke. This was
>>> mostly fixed in the following commit to release 6.6.8:
>>>   commit 5cc8d88a1b94b900fd74abda744c29ff5845430b
>>>   Author: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>>>   Date:   Thu Dec 14 09:08:56 2023 -0600
>>>       Revert "PCI: acpiphp: Reassign resources on bridge if necessary"
>>>
>>> After this commit, the SCSI block device is hotplugged correctly, and
>>> a device node as /dev/sdX appears within the qemu VM.
>>>
>>> New problem:
>>>
>>> When the same SCSI block device is hot-unplugged, the QEMU KVM process
>>> will spin at 100% CPU usage. The guest shows no CPU being used via top,
>>> but the host will continue to spin in the KVM thread until the VM is
>>> rebooted.
>>
>> Find out the PID of the qemu process on the host, then cat
>> /proc/$PID/stack to see where the CPU time is spent.
>
> Thanks for the tip - I'll certainly do that.
>
> Annoyingly, since I posted this report originally, then adding in a new
> report to the kernel.org lists in this, I have been unable to reproduce
> this problem. I have successfully done ~22 scsi hotplug / remove cycles
> and none resulted in reproducing the issue.
>
> Kernel versions are still the same on both proxmox host and the Fedora
> guest - however I see an update on the host of the qemu-kvm packages in
> Proxmox. The proxmox host hasn't even been rebooted in this time.
>
> I wonder if the initial revert included in 6.6.8 fixed the main problem,
> and the later update to qemu-kvm packages on the proxmox host followed
> by the last reboot of the VM with the new KVM package sorted the second
> issue.
>
> Seeing as I can no longer reproduce this reliably - whereas it was 100%
> reproducible prior, maybe I'm now chasing ghosts.
>

That sounds likely.
Version pve-qemu-kvm=8.1.2-5 had a regression where an IO thread in QEMU
could start spinning after a drain (which happens during hotplug on the
QEMU side). It was introduced by an attempted fix for a much rarer
problem [0] and was reverted in pve-qemu-kvm=8.1.2-6 [1]. A proper fix
is still being worked on [2].

[0]: https://git.proxmox.com/?p=pve-qemu.git;a=commit;h=6b7c1815e1c89cb66ff48fbba6da69fe6d254630
[1]: https://git.proxmox.com/?p=pve-qemu.git;a=commit;h=2a49e667bae33f2a5c6ba6b59a0cd26387f73a27
[2]: https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg01900.html

Best Regards,
Fiona

> I'll still continue to monitor - as I normally do this SCSI hotplug ~3
> times per week doing backups to different external HDDs - so if I do
> observe it again, I'll grab the stack and reply to this thread again
> with what I can find.
>
> Until then, I don't want to waste other peoples time also chasing ghosts :)
>
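
In case the spinning shows up again: the /proc/$PID/stack suggestion quoted above can be narrowed to the one thread that is actually burning CPU. Below is a minimal Python sketch, not a tested diagnostic tool. It assumes a Linux host, and the helper names (parse_stat_line, sample_threads, busiest_thread) are made up for illustration. It samples /proc/<pid>/task/<tid>/stat twice, picks the thread whose utime+stime grew the most, and prints that thread's kernel stack. Reading the stack file normally requires root, and if the thread is spinning in userspace (as a QEMU IO thread might), the kernel stack may show very little.

```python
#!/usr/bin/env python3
"""Find the busiest thread of a process and print its kernel stack (Linux only)."""
import os
import sys
import time


def parse_stat_line(line):
    """Return (comm, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    start = line.index("(")
    end = line.rindex(")")
    comm = line[start + 1:end]
    rest = line[end + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return comm, utime + stime


def sample_threads(pid):
    """Map tid -> (comm, cpu_ticks) for every thread of pid."""
    result = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/stat") as f:
                result[int(tid)] = parse_stat_line(f.read())
        except OSError:
            continue  # thread exited between listdir() and open()
    return result


def busiest_thread(before, after):
    """Return (tid, comm, delta_ticks) of the thread that used the most CPU."""
    best = None
    for tid, (comm, ticks) in after.items():
        if tid not in before:
            continue  # thread was created during the sampling interval
        delta = ticks - before[tid][1]
        if best is None or delta > best[2]:
            best = (tid, comm, delta)
    return best


def main(pid, interval=2.0):
    before = sample_threads(pid)
    time.sleep(interval)
    after = sample_threads(pid)
    best = busiest_thread(before, after)
    if best is None:
        print("no threads found")
        return
    tid, comm, delta = best
    print(f"busiest thread: tid={tid} comm={comm} ticks={delta}")
    try:
        # Reading the kernel stack normally requires root.
        with open(f"/proc/{pid}/task/{tid}/stack") as f:
            print(f.read())
    except OSError as exc:
        print(f"could not read stack: {exc}")


if __name__ == "__main__":
    # Only run when given exactly one numeric PID argument.
    if len(sys.argv) == 2 and sys.argv[1].isdigit():
        main(int(sys.argv[1]))
```

Run it as root on the host with the PID of the qemu process as the only argument. The thread name (comm) in the output should make it easier to tell a vCPU thread from an IO thread.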