Re: [PATCH v2 1/2] PCI: pciehp: Add support for async hotplug with native AER and DPC/EDR

Smita Koralahalli <Smita.KoralahalliChannabasappa@xxxxxxx> · Mon, 22 May 2023 15:23:57 -0700

Hi,

On 5/16/2023 3:10 AM, Lukas Wunner wrote:
On Tue, Apr 18, 2023 at 09:05:25PM +0000, Smita Koralahalli wrote:
According to Section 6.7.6 of PCIe Base Specification [1], async removal
with DPC and EDR may be unexpected and may result in a Surprise Down error.
This error is just a side effect of hot remove. Most of the time, these
errors will be abstract to the OS as current systems rely on Firmware-First
model for AER and DPC, where the error handling (side effects of async
remove) and other necessary HW sequencing actions is taken care by the FW
and is abstract to the OS. However, FW-First model poses issues while
rolling out updates or fixing bugs as the servers need to be brought down
for firmware updates.

Add support for async hot-plug with native AER and DPC/EDR. Here, OS is
responsible for handling async add and remove along with handling of AER
and DPC events which are generated as a side-effect of async remove.

I think you can probably leave out the details about Firmware-First.
Pointing to 6.7.6 and the fact that Surprise Down Errors may occur as
an expected side-effect of surprise removal is sufficient.  They should
be treated as such.

You also want to point out what you're trying to achieve here, i.e. get
rid of irritating log messages and a 1 sec delay while pciehp waits for
DPC recovery.

Okay.

Please note that, I have provided explanation why I'm not setting the
Surprise Down bit in uncorrectable error mask register in AER.
https://lore.kernel.org/all/fba22d6b-c225-4b44-674b-2c62306135ed@xxxxxxx/

Add an explanation to the commit message that masking Surprise Down Errors
was explored as an alternative approach, but scrapped due to the odd
behavior that masking only avoids the interrupt, but still records an
error per PCIe r6.0.1 sec 6.2.3.2.2.  That stale error is going to be
reported the next time some error other than Surprise Down is handled.

Okay.

Also, while testing I noticed PCI_STATUS and PCI_EXP_DEVSTA will be set
on an async remove and will not be cleared while the device is brought
down. I have included clearing them here in order to mask any kind of
appearance that there was an error and as well duplicating our BIOS
functionality. I can remove if its not necessary.

Which bits are set exactly?  Can you constrain the register write to
those bits?

Hmm, I was mostly trying to follow the similar approach done for AER.
pci_aer_raw_clear_status(), where they clear status registers 
unconditionally. Also, thought might be better if we are dealing with 
legacy endpoints or so.

I see these bits set in status registers:
PCI_ERR_UNCOR_STATUS 0x20 (Surprise Down)
PCI_EXP_DPC_RP_PIO_STATUS 0x10000 (Memory Request received URCompletion)
PCI_EXP_DEVSTA 0x604 (fatal error detected)

+static void dpc_handle_surprise_removal(struct pci_dev *pdev)
+{
+	if (pdev->dpc_rp_extensions && dpc_wait_rp_inactive(pdev))
+		return;

Emit an error message here?

Okay

+	/*
+	 * According to Section 6.7.6 of the PCIe Base Spec 6.0, since async
+	 * removal might be unexpected, errors might be reported as a side
+	 * effect of the event and software should handle them as an expected
+	 * part of this event.
+	 */

I'd move that code comment to into dpc_handler() above the
"if (dpc_is_surprise_removal(pdev))" check.

Okay.

+	pci_aer_raw_clear_status(pdev);
+	pci_clear_surpdn_errors(pdev);
+
+	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
+			      PCI_EXP_DPC_STATUS_TRIGGER);
+}

Do you need a "wake_up_all(&dpc_completed_waitqueue);" at the end
of the function to wake up a pciehp handler waiting for DPC recovery?

I don't think so. The pciehp handler is however getting invoked 
simultaneously due to PDSC or DLLSC state change right.. Let me know if 
I'm missing anything here.

+static bool dpc_is_surprise_removal(struct pci_dev *pdev)
+{
+	u16 status;
+
+	pci_read_config_word(pdev, pdev->aer_cap + PCI_ERR_UNCOR_STATUS, &status);
+
+	if (!(status & PCI_ERR_UNC_SURPDN))
+		return false;
+

You need an additional check for pdev->is_hotplug_bridge here.

And you need to read PCI_EXP_SLTCAP and check for PCI_EXP_SLTCAP_HPS.

Return false if either of them isn't set.

Return false, if PCI_EXP_SLTCAP isn't set only correct? 
PCI_EXP_SLTCAP_HPS should be disabled if DPC is enabled.

Implementation notes in 6.7.6 says that:
"The Hot-Plug Surprise (HPS) mechanism, as indicated by the Hot-Plug
Surprise bit in the Slot Capabilities Register being Set, is deprecated
for use with async hot-plug. DPC is the recommended mechanism for 
supporting async hot-plug."

Platform FW will disable the SLTCAP_HPS bit at boot time to enable async 
hotplug on AMD devices.

Probably check if SLTCAP_HPS bit is set and return false?

+	dpc_handle_surprise_removal(pdev);
+
+	return true;
+}

A function named "..._is_..." should not have any side effects.
So move the dpc_handle_surprise_removal() call down into dpc_handler()
before the "return IRQ_HANDLED;" statement.

Okay.

What about ports which support AER but not DPC?  Don't you need to
amend aer.c in that case?  I suppose you don't have hardware without
DPC to test that?

Yeah right. Also, if DPC isn't supported we leave HPS bit set and we 
won't see the DPC event at that time.

Thanks,
Smita

Thanks,

Lukas