Re: [PATCH] PCI: vmd: VMD to control Hotplug on its rootports

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/13/2023 9:58 AM, Bjorn Helgaas wrote:
> [+cc Jonathan, Lorenzo, Krzysztof, Rob (from MAINTAINERS)]
>
> Can you make the subject line say what the patch does?  Repeating
> "VMD" is probably unnecessary.
>
> On Wed, Jul 05, 2023 at 01:20:38PM -0400, Nirmal Patel wrote:
>> The hotplug functionality is broken in various combinations of guest
>> OSes i.e. RHEL, SlES and hypervisors i.e. KVM and ESXI.
> "SLES"

Will fix it.

>
>> VMD enabled on Intel ADL cpus caused interrupt storm for smasung
>> drives due to AER being enabled on VMD controlled root ports.
> "Samsung"
Will fix it.
>
> Enabling AER should not cause an interrupt storm.  There must be
> something else going on in addition.  Are you saying the Samsung
> drives have some AER-related defect, like generating Correctable
> Errors when they shouldn't?  It doesn't sound like "Intel ADL" would
> be relevant here.
>
> It's not clear to me if this is directly related to *hotplug* or if
> it's an AER issue that may happen because of hotplug and possible for
> other reasons.
>
>> The patch 04b12ef163d10e348db664900ae7f611b83c7a0e
> 12 char SHA1 is sufficient.
Will fix it.
>
>> ("PCI: vmd: Honor APCI _OSC on PCIe features.") was added to the VMD
>> driver to correct the issue based on the following assumption:
>> 	“Since VMD is an aperture to regular PCIe root ports, honor ACPI
>> 	_OSC to disable PCIe features accordingly to resolve the issue.”
>> 	Link: https://lore.kernel.org/r/20211203031541.1428904-1-kai.heng.feng@xxxxxxxxxxxxx
>>
>> VMD as a PCIe device is an end point(type 0), not a PCIe aperture
>> (pcie bridge). In fact VMD is a type 0 raid controller(class code).
>> When BIOS boots, all root ports under VMD is inaccessible by BIOS, and
>> as such, they maintain their power on default states. The VMD UEFI DXE
>> driver loads and configure all devices under VMD. This is how AER,
>> power management and hotplug gets enabled in UEFI, since the BIOS pci
>> driver cannot access the root ports.
> s/pcie/PCIe/
> s/pci/PCI/
> s/raid/RAID/
Will fix it.
>
>> The patch worked around the interrupt storm by assigning the native
> I assume "the patch" means 04b12ef163d1.  Sometimes people write "the
> patch does X" in the commit log for the current patch, so it's nice to
> be specific to avoid confusion.
>
>> ACPI states to the  root ports under VMD. It assigns AER, hotplug,
>> PME, etc. These have been restored back to the power on default state
>> in guest OS, which says the root port hot plug enable is default OFF.
>> At most, the work around should have only assigned AER state.
>> An additional patch should be added to exclude hot plug from the
>> original patch.
> Add blank line between paragraphs.
Will fix it.
>
>> This will cause hot plug to start working again in the guest, as the
>> settings implemented by the UEFI VMD DXE driver will remain in effect
>> in Linux.
> This is a lot of description that doesn't seem to say what the actual
> underlying issue is.  It's basically "if we treat hotplug differently
> from other negotiated features, things work better."  And it seems
> like it depends on what the UEFI driver did, and we should try to
> avoid a dependency there.
>
> If the issue is too many Correctable Errors being reported via AER, we
> have that problem regardless of VMD, and we should handle it by
> rate-limiting and/or suppressing them completely for particularly
> offensive devices.  We have some issues like this that are still
> outstanding:
>
> [1] https://lore.kernel.org/linux-pci/DM6PR04MB6473197DBD89FF4643CC391F8BC19@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/linux-pci/20230606035256.2886098-2-grundler@xxxxxxxxxxxx/
Sorry for the long description. I think your understanding of treating
hotplug differently from other features is correct. The hotplug issue
is seen only in guest because guest BIOS doesn't know about such
settings nor have an implementation. Ideally, when hotplug is enabled
in Host BIOS, VMD root ports in guest should obtain same values with
the help of direct assign feature. The patch (04b12ef163d1) overwrites
these values and assigns default values(i.e 0h for hotplug) resulting
in hotplug disabled.

Regarding AER, I think we should leave other settings as it is. During
the initialization, AER is enabled on VMD root-port without knowing if
the platform firmware has a handler. Thanks to the patch(04b12ef163d1),
this mistake was corrected.

>> Signed-off-by: Nirmal Patel <nirmal.patel@xxxxxxxxxxxxxxx>
>> ---
>>  drivers/pci/controller/vmd.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
>> index 769eedeb8802..52c2461b4761 100644
>> --- a/drivers/pci/controller/vmd.c
>> +++ b/drivers/pci/controller/vmd.c
>> @@ -701,8 +701,6 @@ static int vmd_alloc_irqs(struct vmd_dev *vmd)
>>  static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
>>  				       struct pci_host_bridge *vmd_bridge)
>>  {
>> -	vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug;
>> -	vmd_bridge->native_shpc_hotplug = root_bridge->native_shpc_hotplug;
>>  	vmd_bridge->native_aer = root_bridge->native_aer;
>>  	vmd_bridge->native_pme = root_bridge->native_pme;
>>  	vmd_bridge->native_ltr = root_bridge->native_ltr;
>> -- 
>> 2.31.1
>>




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux