[+cc Bagas]

On Thu, May 25, 2023 at 01:33:27PM -0700, Patel, Nirmal wrote:
> On 5/25/2023 1:19 PM, Patel, Nirmal wrote:
> >> On Tue, 23 May 2023 12:21:25 -0500
> >> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >>> On Mon, May 22, 2023 at 04:32:03PM +0000, bugzilla-daemon@xxxxxxxxxx wrote:
> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=217472
> >>>> ...
> >>>> Created attachment 304301
> >>>>   --> https://bugzilla.kernel.org/attachment.cgi?id=304301&action=edit
> >>>> Rhel9.1_Guest_dmesg
> >>>>
> >>>> Issue:
> >>>> NVMe drives are still present after performing a hotplug in the
> >>>> guest OS. We have tested with different combinations of OSes,
> >>>> drives, and hypervisors. The issue is present across all the OSes.
> >>>
> >>> Maybe attach the specific commands to reproduce the problem in one
> >>> of these scenarios to the bugzilla? I'm a virtualization noob, so I
> >>> can't visualize all the usual pieces.
> >>>
> >>>> The following patch was added to honor ACPI _OSC values set by the
> >>>> BIOS, and the patch helped bring the issue out in the VM/guest OS.
> >>>>
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/drivers/pci/controller/vmd.c?id=04b12ef163d10e348db664900ae7f611b83c7a0e
> >>>>
> >>>> I also compared the values of the parameters in the patch in the
> >>>> host and guest OS. The parameters with different values in the
> >>>> host and guest OS are:
> >>>>
> >>>>   native_pcie_hotplug
> >>>>   native_shpc_hotplug
> >>>>   native_aer
> >>>>   native_ltr
> >>>>
> >>>> i.e., the value of native_pcie_hotplug in the host OS is 1, and
> >>>> the value of native_pcie_hotplug in the guest OS is 0.
> >>>>
> >>>> I am not sure why "native_pcie_hotplug" is changed to 0 in the
> >>>> guest. Isn't it an _OSC-managed parameter? If that is the case, it
> >>>> should have the same value in the host and guest OS.
> >>>
> >>> From your dmesg:
> >>>
> >>>   DMI: Red Hat KVM/RHEL, BIOS 1.16.0-4.el9 04/01/2014
> >>>   _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> >>>   _OSC: platform does not support [PCIeHotplug LTR DPC]
> >>>   _OSC: OS now controls [SHPCHotplug PME AER PCIeCapability]
> >>>   acpiphp: Slot [0] registered
> >>>   virtio_blk virtio3: [vda] 62914560 512-byte logical blocks (32.2 GB/30.0 GiB)
> >>>
> >>> So the DMI ("KVM/RHEL ...") is the BIOS seen by the guest. Doesn't
> >>> mean anything to me, but the KVM folks would know about it. In any
> >>> event, the guest BIOS is different from the host BIOS, so I'm not
> >>> surprised that _OSC is different.
> >>
> >> Right, the premise of the issue, that guest and host should have
> >> the same _OSC features, is flawed. The guest is a virtual machine
> >> that can present an entirely different feature set from the host.
> >> A software hotplug on the guest can occur without any bearing on
> >> the slot status on the host.
> >>
> >>> That guest BIOS _OSC declined to grant control of PCIe native
> >>> hotplug to the guest OS, so the guest will use acpiphp (not
> >>> pciehp, which would be used if native_pcie_hotplug were set).
> >>>
> >>> The dmesg doesn't mention the nvme driver. Are you using
> >>> something like virtio_blk with qemu pointed at an NVMe drive?
> >>> And you hot-remove the NVMe device, but the guest OS thinks it's
> >>> still present?
> >>>
> >>> Since the guest is using acpiphp, I would think a hot-remove of
> >>> a host NVMe device should be noticed by qemu and turned into an
> >>> ACPI notification that the guest OS would consume. But I don't
> >>> know how those connections work.
> >>
> >> If vfio-pci is involved, a cooperative hot-unplug will attempt to
> >> unbind the host driver, which triggers a device request through
> >> vfio, which is ultimately seen as a hotplug eject operation by
> >> the guest. Surprise hotplugs of assigned devices are not
> >> supported.
> >> There's not enough info in the bz to speculate how
> >> this VM is wired or what actions are taken. Thanks,

> Thanks Bjorn and Alex for the quick response.
> I agree with the analysis that the guest BIOS is not giving control
> of PCIe native hotplug to the guest OS.

Can I back up and try to understand the problem better? I'm sure I'm
asking dumb questions, so please correct me:

  - Can you add more details in the bz about what you're doing and
    what is failing?

  - I have the impression that the hotplug worked before 04b12ef163d1
    ("PCI: vmd: Honor ACPI _OSC on PCIe features") but fails after?

  - Can you attach dmesg logs from before and after 04b12ef163d1?

  - What sort of virtualized guest is this? qemu?

  - How is the NVMe drive passed to the guest? vfio-pci?

  - Apparently the problem is with a hot-remove in the guest? How are
    you doing this? A sysfs "remove" file? qemu "device_del"?

  - I assume this hot-remove happens only in the *guest* and there's
    no hotplug event on the *host*?

> Adding some background about the patch f611b83c7a0e PCI: vmd: Honor
> ACPI _OSC on PCIe features.

Tangent: "f611b83c7a0e" is not a valid SHA1, so I was lost for a
minute :) I guess you're referring to
04b12ef163d10e348db664900ae7f611b83c7a0e, where f611b83c7a0e is the
*end* of the SHA1. You can abbreviate it, but you have to quote the
*beginning*, not the end. E.g., the conventional style would be
04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features").

Bjorn
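The front-vs-end point is easy to demonstrate in any repository. A
minimal sketch with a throwaway repo (the repo name, identity, and
commit message below are illustrative; the exact hash differs on
every run):

```shell
# Sketch: a front abbreviation of a SHA1 resolves, an end abbreviation
# does not. Throwaway repo; names and message are illustrative.
git init -q demo && cd demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "PCI: vmd: Honor ACPI _OSC on PCIe features"
full=$(git rev-parse HEAD)                  # full 40-character SHA1
front=$(printf '%s' "$full" | cut -c1-12)   # first 12 hex digits
end=$(printf '%s' "$full" | cut -c29-40)    # last 12 hex digits
git rev-parse --verify -q "$front" >/dev/null && echo "front: resolves"
git rev-parse --verify -q "$end" >/dev/null || echo "end: unknown revision"
```

Git resolves abbreviations as unique prefixes of an object name, which
is why only the leading digits work.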