Re: FW: [Bug 217472] New: ACPI _OSC features have different values in Host OS and Guest OS

[+cc Bagas]

On Thu, May 25, 2023 at 01:33:27PM -0700, Patel, Nirmal wrote:
> On 5/25/2023 1:19 PM, Patel, Nirmal wrote:
> >> On Tue, 23 May 2023 12:21:25 -0500
> >> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> >>> On Mon, May 22, 2023 at 04:32:03PM +0000, bugzilla-daemon@xxxxxxxxxx wrote:
> >>>> https://bugzilla.kernel.org/show_bug.cgi?id=217472
> >>>> ...  
> >>>> Created attachment 304301
> >>>>   --> https://bugzilla.kernel.org/attachment.cgi?id=304301&action=edit
> >>>> Rhel9.1_Guest_dmesg
> >>>>
> >>>> Issue:
> >>>> NVMe drives are still present after performing hotplug in the
> >>>> guest OS.  We have tested with different combinations of OSes,
> >>>> drives, and hypervisors.  The issue is present across all the OSes.
> >>>
> >>> Maybe attach the specific commands to reproduce the problem in one of 
> >>> these scenarios to the bugzilla?  I'm a virtualization noob, so I 
> >>> can't visualize all the usual pieces.
> >>>
> >>>> The following patch was added to honor ACPI _OSC values set by the
> >>>> BIOS, and it exposed the issue in the VM/guest OS.
> >>>>
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/drivers/pci/controller/vmd.c?id=04b12ef163d10e348db664900ae7f611b83c7a0e
> >>>>
> >>>>
> >>>> I also compared the values of the parameters in the patch in
> >>>> Host and Guest OS.  The parameters with different values in
> >>>> Host and Guest OS are:
> >>>>
> >>>> native_pcie_hotplug
> >>>> native_shpc_hotplug
> >>>> native_aer
> >>>> native_ltr
> >>>>
> >>>> i.e.
> >>>> value of native_pcie_hotplug in Host OS is 1.
> >>>> value of native_pcie_hotplug in Guest OS is 0.
> >>>>
> >>>> I am not sure why "native_pcie_hotplug" is changed to 0 in the guest.
> >>>> Isn't it an _OSC-managed parameter? If that is the case, it should
> >>>> have the same value in Host and Guest OS.
> >>>
> >>> From your dmesg:
> >>>  
> >>>   DMI: Red Hat KVM/RHEL, BIOS 1.16.0-4.el9 04/01/2014
> >>>   _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> >>>   _OSC: platform does not support [PCIeHotplug LTR DPC]
> >>>   _OSC: OS now controls [SHPCHotplug PME AER PCIeCapability]
> >>>   acpiphp: Slot [0] registered
> >>>   virtio_blk virtio3: [vda] 62914560 512-byte logical blocks (32.2 GB/30.0 GiB)
> >>>
> >>> So the DMI ("KVM/RHEL ...") is the BIOS seen by the guest.  Doesn't 
> >>> mean anything to me, but the KVM folks would know about it.  In any 
> >>> event, the guest BIOS is different from the host BIOS, so I'm not 
> >>> surprised that _OSC is different.
> >>
> >> Right, the premise of the issue that guest and host should have
> >> the same OSC features is flawed.  The guest is a virtual machine
> >> that can present an entirely different feature set from the host.
> >> A software hotplug on the guest can occur without any bearing to
> >> the slot status on the host.
> >>
> >>> That guest BIOS _OSC declined to grant control of PCIe native
> >>> hotplug to the guest OS, so the guest will use acpiphp (not
> >>> pciehp, which would be used if native_pcie_hotplug were set).
> >>>
> >>> The dmesg doesn't mention the nvme driver.  Are you using
> >>> something like virtio_blk with qemu pointed at an NVMe drive?
> >>> And you hot-remove the NVMe device, but the guest OS thinks it's
> >>> still present?
> >>>
> >>> Since the guest is using acpiphp, I would think a hot-remove of
> >>> a host NVMe device should be noticed by qemu and turned into an
> >>> ACPI notification that the guest OS would consume.  But I don't
> >>> know how those connections work.
> >>
> >> If vfio-pci is involved, a cooperative hot-unplug will attempt to
> >> unbind the host driver, which triggers a device request through
> >> vfio, which is ultimately seen as a hotplug eject operation by
> >> the guest.  Surprise hotplugs of assigned devices are not
> >> supported.  There's not enough info in the bz to speculate how
> >> this VM is wired or what actions are taken.  Thanks,
> 
> Thanks Bjorn and Alex for the quick response.
> I agree with the analysis about guest BIOS not giving control of
> PCIe native hotplug to guest OS.
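
To make that analysis concrete, here is a rough Python sketch that reads the three _OSC lines from the guest dmesg quoted above and reports which hotplug driver the guest would be expected to use. The helper names (osc_features, hotplug_driver) are made up for illustration, and the decision rule is a simplification of what was said in this thread (pciehp if the platform grants PCIeHotplug, acpiphp otherwise), not a description of the kernel's actual logic.

```python
import re

# Lines copied from the guest dmesg quoted earlier in this thread.
GUEST_DMESG = """\
_OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
_OSC: platform does not support [PCIeHotplug LTR DPC]
_OSC: OS now controls [SHPCHotplug PME AER PCIeCapability]
"""


def osc_features(dmesg, phrase):
    """Return the feature names listed in '_OSC: <phrase> [...]', or an empty set."""
    match = re.search(r"_OSC: " + re.escape(phrase) + r" \[([^\]]*)\]", dmesg)
    return set(match.group(1).split()) if match else set()


def hotplug_driver(dmesg):
    """Hypothetical helper: guess the hotplug driver from the granted features.

    Assumption (simplified from the discussion above): pciehp is used only
    when the platform grants PCIeHotplug; otherwise the guest uses acpiphp.
    """
    granted = osc_features(dmesg, "OS now controls")
    return "pciehp" if "PCIeHotplug" in granted else "acpiphp"


print(sorted(osc_features(GUEST_DMESG, "platform does not support")))
print(hotplug_driver(GUEST_DMESG))  # prints "acpiphp"
```

Running the same check against the host dmesg would show whether the host firmware granted PCIeHotplug, which is a quicker comparison than reading the native_* flags by hand.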

Can I back up and try to understand the problem better?  I'm sure I'm
asking dumb questions, so please correct me:

  - Can you add more details in the bz about what you're doing and
    what is failing?

  - I have the impression that the hotplug worked before 04b12ef163d1
    ("PCI: vmd: Honor ACPI _OSC on PCIe features") but fails after?

  - Can you attach dmesg logs from before and after 04b12ef163d1?

  - What sort of virtualized guest is this?  qemu?

  - How is the NVMe drive passed to the guest?  vfio-pci?

  - Apparently the problem is with a hot-remove in the guest?  How are
    you doing this?  Sysfs "remove" file?  qemu "device_del"?

  - I assume this hot-remove is only from the *guest* and there's no
    hotplug event for the *host*?

> Adding some background about the patch f611b83c7a0e PCI: vmd: Honor
> ACPI _OSC on PCIe features.

Tangent, "f611b83c7a0e" is not a valid SHA1, so I was lost for a minute :)
I guess you're referring to 04b12ef163d10e348db664900ae7f611b83c7a0e,
where f611b83c7a0e is at the *end* of the SHA1.  You can abbreviate
it, but you have to quote the *beginning*, not the end.  E.g., the
conventional style would be 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
on PCIe features").
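
If it helps, here is a small Python sketch of that convention: keep the first 12 hex digits of the full SHA-1 and append the subject in parentheses and quotes. The function name commit_ref is invented for this example; in practice git can produce the same string for you.

```python
def commit_ref(full_sha, subject, length=12):
    """Format a commit reference as '<abbreviated sha> ("<subject>")'.

    The abbreviation is taken from the *beginning* of the SHA-1, never
    from the end. commit_ref is a hypothetical helper, not a git API.
    """
    sha = full_sha.strip().lower()
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError("expected a full 40-character SHA-1")
    return f'{sha[:length]} ("{subject}")'


print(commit_ref("04b12ef163d10e348db664900ae7f611b83c7a0e",
                 "PCI: vmd: Honor ACPI _OSC on PCIe features"))
# prints: 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features")
```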

Bjorn


