On Fri, Apr 19, 2024 at 03:18:19PM -0700, Paul M Stillwell Jr wrote:
> On 4/19/2024 2:14 PM, Bjorn Helgaas wrote:
> > On Thu, Apr 18, 2024 at 02:51:19PM -0700, Paul M Stillwell Jr wrote:
> > > On 4/18/2024 11:26 AM, Bjorn Helgaas wrote:
> > > > On Wed, Apr 17, 2024 at 01:15:42PM -0700, Paul M Stillwell Jr wrote:
> > > > > Adding documentation for the Intel VMD driver and updating
> > > > > the index file to include it.

> > > >   - Which devices are passed through to a virtual guest and
> > > >     enumerated there?

> > > All devices under VMD are passed to a virtual guest.

> > So the guest will see the VMD Root Ports, but not the VMD RCiEP
> > itself?

> The guest will see the VMD device, and then the vmd driver in the
> guest will enumerate the devices behind it, is my understanding.

> > > >   - Where does the vmd driver run (host or guest or both)?

> > > I believe the answer is both.

> > If the VMD RCiEP isn't passed through to the guest, how can the
> > vmd driver do anything in the guest?

> The VMD device is passed through to the guest. It works just like
> bare metal in that the guest OS detects the VMD device and loads
> the vmd driver, which then enumerates the devices into the guest.

I guess it's obvious that the VMD RCiEP must be passed through to the
guest, because the whole point of
https://lore.kernel.org/linux-pci/20240408183927.135-1-paul.m.stillwell.jr@xxxxxxxxx/
is to do something in the guest.

It does puzzle me that we have two copies of the vmd driver (one in
the host OS and another in the guest OS) that think they own the same
physical device. I'm not a virtualization guru, but that sounds
potentially problematic.

> > IIUC, the current situation is "regardless of what firmware said,
> > in the VMD domain we want AER disabled and hotplug enabled."

> We aren't saying we want AER disabled, we are just saying we want
> hotplug enabled. The observation is that in a hypervisor scenario
> AER is going to be disabled because the _OSC bits are all 0.

04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on PCIe features") is saying
we want AER disabled for the VMD domain, isn't it? It copies the _OSC
results from the host bridge above VMD into the VMD host bridge; see
the sketch at [1] below.

> > It seems like the only clear option is to say "the vmd driver
> > owns all PCIe services in the VMD domain, the platform does not
> > supply _OSC for the VMD domain, the platform can't do anything
> > with PCIe services in the VMD domain, and the vmd driver needs to
> > explicitly enable/disable services as it needs."

> I actually looked at this as well :) I had an idea to set the _OSC
> bits to 0 when the vmd driver created the domain, then look at all
> the root ports underneath it and see if AER and PM were set. If any
> root port underneath VMD set AER or PM, then I would set the _OSC
> bit for the bridge to 1. That way if any root port underneath VMD
> had enabled AER (as an example), then that feature would still
> work. I didn't test this in a hypervisor scenario though, so not
> sure what I would see.

(If I follow the idea, it would look something like the sketch at [2]
below.)

_OSC negotiates ownership of features between platform firmware and
OSPM. The "native_pcie_hotplug" and similar bits mean that "IF a
device advertises the feature, the OS can use it." We clear those
native_* bits if the platform retains ownership via _OSC; [3] below
shows roughly how that happens.

If BIOS doesn't enable the VMD host bridge and doesn't supply _OSC
for the domain below it, why would we assume that BIOS retains
ownership of the features negotiated by _OSC? I think we have to
assume the OS owns them, which is what happened before 04b12ef163d1.

Bjorn
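
[1] Roughly what 04b12ef163d1 added, paraphrased from memory, so
check drivers/pci/controller/vmd.c for the real thing:

  static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
                                         struct pci_host_bridge *vmd_bridge)
  {
          /*
           * Copy the _OSC-negotiated feature ownership from the host
           * bridge above VMD into the synthetic VMD host bridge, so
           * the VMD domain inherits whatever firmware granted (or
           * retained) for the domain VMD lives in.
           */
          vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug;
          vmd_bridge->native_shpc_hotplug = root_bridge->native_shpc_hotplug;
          vmd_bridge->native_aer = root_bridge->native_aer;
          vmd_bridge->native_pme = root_bridge->native_pme;
          vmd_bridge->native_ltr = root_bridge->native_ltr;
          vmd_bridge->native_dpc = root_bridge->native_dpc;
  }

called from vmd_enable_domain() as:

  vmd_copy_host_bridge_flags(pci_find_host_bridge(vmd->dev->bus),
                             to_pci_host_bridge(vmd->bus->bridge));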
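
[2] A hypothetical, untested sketch of the idea described above;
vmd_sync_native_bits() is a made-up name, only AER and PME are shown,
and it would have to run after the VMD root bus has been scanned:

  static void vmd_sync_native_bits(struct vmd_dev *vmd)
  {
          struct pci_host_bridge *vmd_bridge =
                  to_pci_host_bridge(vmd->bus->bridge);
          struct pci_dev *rp;

          /* Default to "firmware owns it" ... */
          vmd_bridge->native_aer = 0;
          vmd_bridge->native_pme = 0;

          /*
           * ... then grant the OS any feature that some Root Port
           * below VMD actually advertises.
           */
          list_for_each_entry(rp, &vmd->bus->devices, bus_list) {
                  if (pci_pcie_type(rp) != PCI_EXP_TYPE_ROOT_PORT)
                          continue;
                  if (rp->aer_cap)
                          vmd_bridge->native_aer = 1;
                  if (pci_find_capability(rp, PCI_CAP_ID_PM))
                          vmd_bridge->native_pme = 1;
          }
  }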
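
[3] Roughly how the native_* bits get cleared today, paraphrasing
acpi_pci_root_create() in drivers/acpi/pci_root.c (again from memory,
so check the tree):

  /*
   * OSPM keeps a feature only if _OSC granted it control; otherwise
   * the native_* bit is cleared and the OS won't use the feature
   * even if a device advertises it.
   */
  if (!(root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL))
          host_bridge->native_pcie_hotplug = 0;
  if (!(root->osc_control_set & OSC_PCI_SHPC_NATIVE_HP_CONTROL))
          host_bridge->native_shpc_hotplug = 0;
  if (!(root->osc_control_set & OSC_PCI_EXPRESS_AER_CONTROL))
          host_bridge->native_aer = 0;
  if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
          host_bridge->native_pme = 0;
  if (!(root->osc_control_set & OSC_PCI_EXPRESS_LTR_CONTROL))
          host_bridge->native_ltr = 0;
  if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
          host_bridge->native_dpc = 0;

In a hypervisor scenario where _OSC grants nothing, all of these end
up 0, which is how the VMD domain ends up with hotplug disabled in a
guest once 04b12ef163d1 copies them.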