Re: [PATCH] Documentation: PCI: add vmd documentation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 25, 2024 at 04:32:21PM -0700, Paul M Stillwell Jr wrote:
> On 4/25/2024 3:32 PM, Bjorn Helgaas wrote:
> > On Thu, Apr 25, 2024 at 02:43:07PM -0700, Paul M Stillwell Jr wrote:
> > > On 4/25/2024 10:24 AM, Bjorn Helgaas wrote:
> > > > On Wed, Apr 24, 2024 at 02:29:16PM -0700, Paul M Stillwell Jr wrote:
> > > > > On 4/23/2024 5:47 PM, Bjorn Helgaas wrote:
> > > ...
> > 
> > > > > > _OSC is the only mechanism for negotiating ownership of these
> > > > > > features, and PCI Firmware r3.3, sec 4.5.1, is pretty clear that _OSC
> > > > > > only applies to the hierarchy originated by the PNP0A03/PNP0A08 host
> > > > > > bridge that contains the _OSC method.  AFAICT, there's no
> > > > > > global/device-specific thing here.
> > > > > > 
> > > > > > The BIOS may have a single user-visible setting, and it may apply that
> > > > > > setting to all host bridge _OSC methods, but that's just part of the
> > > > > > BIOS UI, not part of the firmware/OS interface.
> > > > > 
> > > > > Fair, but we are still left with the question of how to set the _OSC bits
> > > > > for the VMD bridge. This would normally happen using ACPI AFAICT and we
> > > > > don't have that for the devices behind VMD.
> > > > 
> > > > In the absence of a mechanism for negotiating ownership, e.g., an ACPI
> > > > host bridge device for the hierarchy, the OS owns all the PCIe
> > > > features.
> > > 
> > > I'm new to this space so I don't know what it means for the OS to
> > > own the features. In other words, how would the vmd driver figure
> > > out what features are supported?
> > 
> > There are three things that go into this:
> > 
> >    - Does the OS support the feature, e.g., is CONFIG_PCIEAER enabled?
> > 
> >    - Has the platform granted permission to the OS to use the feature,
> >      either explicitly via _OSC or implicitly because there's no
> >      mechanism to negotiate ownership?
> > 
> >    - Does the device advertise the feature, e.g., does it have an AER
> >      Capability?
> > 
> > If all three are true, Linux enables the feature.
> > 
> > I think vmd has implicit ownership of all features because there is no
> > ACPI host bridge device for the VMD domain, and (IMO) that means there
> > is no way to negotiate ownership in that domain.
> > 
> > So the VMD domain starts with all the native_* bits set, meaning Linux
> > owns the features.  If the vmd driver doesn't want some feature to be
> > used, it could clear the native_* bit for it.
> > 
> > I don't think vmd should unilaterally claim ownership of features by
> > *setting* native_* bits because that will lead to conflicts with
> > firmware.
> 
> This is the crux of the problem IMO. I'm happy to set the native_* bits
> using some knowledge about what the firmware wants, but we don't have a
> mechanism to do it AFAICT. I think that's what commit 04b12ef163d1 ("PCI:
> vmd: Honor ACPI _OSC on PCIe features") was trying to do: use a domain that
> ACPI had run on and negotiated features and apply them to the vmd domain.

Yes, this is the problem.  We have no information about what firmware
wants for the VMD domain because there is no ACPI host bridge device.

We have information about what firmware wants for *other* domains.
04b12ef163d1 assumes that also applies to the VMD domain, but I don't
think that's a good idea.

> Using the 3 criteria you described above, could we do this for the hotplug
> feature for VMD:
> 
> 1. Use IS_ENABLED(CONFIG_<whatever hotplug setting we need>) to check to see
> if the hotplug feature is enabled

That's already there.

> 2. We know that for VMD we want hotplug enabled so that is the implicit
> permission

The VMD domain starts with all native_* bits set.  All you have to do
is avoid *clearing* them.

The problem (IMO) is that 04b12ef163d1 clears bits based on the _OSC
for some other domain.

> 3. Look at the root ports below VMD and see if hotplug capability is set

This is already there, too.

> If 1 & 3 are true, then we set the native_* bits for hotplug (we can look
> for surprise hotplug as well in the capability to set the
> native_shpc_hotplug bit corrrectly) to 1. This feels like it would solve the
> problem of "what if there is a hotplug issue on the platform" because the
> user would have disabled hotplug for VMD and the root ports below it would
> have the capability turned off.
> 
> In theory we could do this same thing for all the features, but we don't
> know what the firmware wants for features other than hotplug (because we
> implicitly know that vmd wants hotplug). I feel like 04b12ef163d1 is a good
> compromise for the other features, but I hear your issues with it.
> 
> I'm happy to "do the right thing" for the other features, I just can't
> figure out what that thing is :)

04b12ef163d1 was motivated by a flood of Correctable Errors.

Kai-Heng says the errors occur even when booting with
"pcie_ports=native" and VMD turned off, i.e., when the VMD RCiEP is
disabled and the NVMe devices appear under plain Root Ports in domain
0000.  That suggests that they aren't related to VMD at all.

I think there's a significant chance that those errors are caused by a
software defect, e.g., ASPM configuration.  There are many similar
reports of Correctable Errors where "pcie_aspm=off" is a workaround.

If we can nail down the root cause of those Correctable Errors, we may
be able to fix it and just revert 04b12ef163d1.  That would leave all
the PCIe features owned and enabled by Linux in the VMD domain.  AER
would be enabled and not a problem, hotplug would be enabled as you
need, etc.

There are a zillion reports of these errors and I added comments to
some to see if anybody can help us get to a root cause.

Bjorn




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux