Re: [PATCH v2] PCI/AER: Do not reset the port device status if doing firmware first handling.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[+cc Austin, _OSC expert]

On Mon, Jun 22, 2020 at 07:35:23PM +0800, Jonathan Cameron wrote:
> pci_aer_clear_device_status() currently resets the device status
> (PCI_EXP_DEVSTA) on the downstream port above a device, or the port itself
> if the port is the reported AER error source.  This happens even when error
> handling is firmware first.
> 
> Our interpretation is that firmware first handling means that the firmware
> will deal with clearing all relevant error reporting registers
> including this one.

IMO "firmware-first" is meaningless to the kernel.  I see the bit
defined in the ACPI HEST records (ACPI v6.3, sec 18.3.2.4), but there
is no indication of anything the OS needs to *do* with it.  It does
not influence the result of pcie_aer_is_native().  So I don't want to
mention it in the subject or commit log.

But I think what the _OSC negotiation for AER ownership is relevant,
and that's what your patch tests, so I think this is the right thing
to do.

So I applied this as below to pci/error for v5.8, thanks a lot!

Oh, I also propose a preliminary patch (posted and cc'd to you) to
rename pci_aer_clear_device_status():

  https://lore.kernel.org/r/20200717195619.766662-1-helgaas@xxxxxxxxxx

commit d6c8d24e3d5d ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER")
Author: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
Date:   Mon Jun 22 19:35:23 2020 +0800

    PCI/ERR: Clear PCIe Device Status errors only if OS owns AER
    
    pcie_clear_device_status() resets the error bits in the PCIe Device Status
    Register (PCI_EXP_DEVSTA).
    
    Previously we did this unconditionally, but on ACPI systems, the _OSC AER
    bit negotiates control of the AER capability.  Per sec 4.5.1 of the System
    Firmware Intermediary _OSC and DPC Updates ECN [1], this bit also covers
    other error enable/status bits including the following:
    
      Correctable Error Reporting Enable
      Non-Fatal Error Reporting Enable
      Fatal Error Reporting Enable
      Unsupported Request Reporting Enable
    
    These bits are all in the PCIe Device Control register (the ECN omitted
    "Reporting", but I think that's a typo), so by implication the _OSC AER bit
    also applies to the error status bits in the PCIe Device Status register:
    
      Correctable Error Detected
      Non-Fatal Error Detected
      Fatal Error Detected
      Unsupported Request Detected
    
    Clear the PCIe Device Status error bits only when the OS controls the AER
    capability and related error enable/status bits.  If platform firmware
    controls the AER capability, firmware is responsible for clearing these
    bits.
    
    One call path leading here is:
    
      ghes_do_proc
        ghes_handle_aer
          aer_recover_queue
            schedule_work(&aer_recover_work)
      ...
      aer_recover_work_func
        pcie_do_recovery
          pcie_clear_device_status
    
    [1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
        2020, affecting PCI Firmware Specification, Rev. 3.2
        https://members.pcisig.com/wg/PCI-SIG/document/14076
    [bhelgaas: commit log]
    Link: https://lore.kernel.org/r/20200622113523.891666-1-Jonathan.Cameron@xxxxxxxxxx
    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
    Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index d3ea667c8520..34bfea5c52b3 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -245,6 +245,9 @@ void pcie_clear_device_status(struct pci_dev *dev)
 {
 	u16 sta;
 
+	if (!pcie_aer_is_native(dev))
+		return;
+
 	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
 	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
 }



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux