Re: [PATCH] PCI: vmd: Enable Hotplug based on BIOS setting on VMD rootports

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 8, 2023 at 12:30 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> [+cc Rafael, just FYI re 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features")]
>
> On Tue, Nov 07, 2023 at 02:50:57PM -0700, Nirmal Patel wrote:
> > On Thu, 2023-11-02 at 16:49 -0700, Nirmal Patel wrote:
> > > On Thu, 2023-11-02 at 15:41 -0500, Bjorn Helgaas wrote:
> > > > On Thu, Nov 02, 2023 at 01:07:03PM -0700, Nirmal Patel wrote:
> > > > > On Wed, 2023-11-01 at 17:20 -0500, Bjorn Helgaas wrote:
> > > > > > On Tue, Oct 31, 2023 at 12:59:34PM -0700, Nirmal Patel wrote:
> > > > > > > On Tue, 2023-10-31 at 10:31 -0500, Bjorn Helgaas wrote:
> > > > > > > > On Mon, Oct 30, 2023 at 04:16:54PM -0400, Nirmal Patel
> > > > > > > > wrote:
> > > > > > > > > VMD Hotplug should be enabled or disabled based on VMD
> > > > > > > > > rootports' Hotplug configuration in BIOS.
> > > > > > > > > is_hotplug_bridge
> > > > > > > > > is set on each VMD rootport based on Hotplug capable bit
> > > > > > > > > in
> > > > > > > > > SltCap in probe.c.  Check is_hotplug_bridge and enable or
> > > > > > > > > disable native_pcie_hotplug based on that value.
> > > > > > > > >
> > > > > > > > > Currently VMD driver copies ACPI settings or platform
> > > > > > > > > configurations for Hotplug, AER, DPC, PM, etc and enables
> > > > > > > > > or
> > > > > > > > > disables these features on VMD bridge which is not
> > > > > > > > > correct
> > > > > > > > > in case of Hotplug.
> > > > > > > >
> > > > > > > > This needs some background about why it's correct to copy
> > > > > > > > the
> > > > > > > > ACPI settings in the case of AER, DPC, PM, etc, but
> > > > > > > > incorrect
> > > > > > > > for hotplug.
> > > > > > > >
> > > > > > > > > Also during the Guest boot up, ACPI settings along with
> > > > > > > > > VMD
> > > > > > > > > UEFI driver are not present in Guest BIOS which results
> > > > > > > > > in
> > > > > > > > > assigning default values to Hotplug, AER, DPC, etc. As a
> > > > > > > > > result Hotplug is disabled on VMD in the Guest OS.
> > > > > > > > >
> > > > > > > > > This patch will make sure that Hotplug is enabled
> > > > > > > > > properly
> > > > > > > > > in Host as well as in VM.
> > > > > > > >
> > > > > > > > Did we come to some consensus about how or whether _OSC for
> > > > > > > > the host bridge above the VMD device should apply to
> > > > > > > > devices
> > > > > > > > in the separate domain below the VMD?
> > > > > > >
> > > > > > > We are not able to come to any consensus. Someone suggested
> > > > > > > to
> > > > > > > copy either all _OSC flags or none. But logic behind that
> > > > > > > assumption is that the VMD is a bridge device which is not
> > > > > > > completely true. VMD is an endpoint device and it owns its
> > > > > > > domain.
> > > > > >
> > > > > > Do you want to facilitate a discussion in the PCI firmware SIG
> > > > > > about this?  It seems like we may want a little text in the
> > > > > > spec
> > > > > > about how to handle this situation so platforms and OSes have
> > > > > > the
> > > > > > same expectations.
> > > > >
> > > > > The patch 04b12ef163d1 broke intel VMD's hotplug capabilities and
> > > > > author did not test in VM environment impact.
> > > > > We can resolve the issue easily by
> > > > >
> > > > > #1 Revert the patch which means restoring VMD's original
> > > > > functionality
> > > > > and author provide better fix.
> > > > >
> > > > > or
> > > > >
> > > > > #2 Allow the current change to re-enable VMD hotplug inside VMD
> > > > > driver.
> > > > >
> > > > > There is a significant impact for our customers hotplug use cases
> > > > > which
> > > > > forces us to apply the fix in out-of-box drivers for different
> > > > > OSs.
> > > >
> > > > I agree 100% that there's a serious problem here and we need to fix
> > > > it, there's no argument there.
> > > >
> > > > I guess you're saying it's obvious that an _OSC above VMD does not
> > > > apply to devices below VMD, and therefore, no PCI Firmware SIG
> > > > discussion or spec clarification is needed?
> > >
> > > Yes. By design VMD is an endpoint device to OS and its domain is
> > > privately owned by VMD only. I believe we should revert back to
> > > original design and not impose _OSC settings on VMD domain which is
> > > also a maintainable solution.
> >
> > I will send out revert patch. The _OSC settings shouldn't apply
> > to private VMD domain.
>
> I assume you mean to revert 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC
> on PCIe features").  That appeared in v5.17, and it fixed (or at least
> prevented) an AER message flood.  We can't simply revert 04b12ef163d1
> unless we first prevent that AER message flood in another way.

The error is "correctable".
Does masking all correctable AER error by default make any sense? And
add a sysfs knob to make it optional.

Kai-Heng

>
> Bjorn
>
> > Even the patch 04b12ef163d1 needs more changes to make sure _OSC
> > settings are passed on from Host BIOS to Guest BIOS which means
> > involvement of ESXi, Windows HyperV, KVM.




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux