On 6/13/24 7:39 PM, Duan, Zhenzhong wrote: > Hi > >> -----Original Message----- >> From: Kuppuswamy Sathyanarayanan >> <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> >> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might >> be ANFE in aer_err_info >> >> Hi, >> >> On 5/9/24 1:48 AM, Zhenzhong Duan wrote: >>> In some cases the detector of a Non-Fatal Error(NFE) is not the most >>> appropriate agent to determine the type of the error. For example, >>> when software performs a configuration read from a non-existent >>> device or Function, completer will send an ERR_NONFATAL Message. >>> On some platforms, ERR_NONFATAL results in a System Error, which >>> breaks normal software probing. >>> >>> Advisory Non-Fatal Error(ANFE) is a special case that can be used >>> in above scenario. It is predominantly determined by the role of the >>> detecting agent (Requester, Completer, or Receiver) and the specific >>> error. In such cases, an agent with AER signals the NFE (if enabled) >>> by sending an ERR_COR Message as an advisory to software, instead of >>> sending ERR_NONFATAL. >>> >>> When processing an ANFE, ideally both correctable error(CE) status and >>> uncorrectable error(UE) status should be cleared. However, there is no >>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal >>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as >>> NFE will reproduce above mentioned issue, i.e., breaking softwore probing; >>> treating NFE as ANFE will make us ignoring some UEs which need active >>> recover operation. To avoid clearing UEs that are not ANFE by accident, >>> the most conservative route is taken here: If any of the NFE Detected >>> bits is set in Device Status, do not touch UE status, they should be >>> cleared later by the UE handler. Otherwise, a specific set of UEs that >>> may be raised as ANFE according to the PCIe specification will be cleared >>> if their corresponding severity is Non-Fatal. >>> >>> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE >>> in aer_err_info.anfe_status. So that those bits could be printed and >>> processed later. >>> >>> Tested-by: Yudong Wang <yudong.wang@xxxxxxxxx> >>> Co-developed-by: "Wang, Qingshun" <qingshun.wang@xxxxxxxxxxxxxxx> >>> Signed-off-by: "Wang, Qingshun" <qingshun.wang@xxxxxxxxxxxxxxx> >>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@xxxxxxxxx> >>> --- >>> drivers/pci/pci.h | 1 + >>> drivers/pci/pcie/aer.c | 53 >> ++++++++++++++++++++++++++++++++++++++++++ >>> 2 files changed, 54 insertions(+) >>> >>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >>> index 17fed1846847..3f9eb807f9fd 100644 >>> --- a/drivers/pci/pci.h >>> +++ b/drivers/pci/pci.h >>> @@ -412,6 +412,7 @@ struct aer_err_info { >>> >>> unsigned int status; /* COR/UNCOR Error Status */ >>> unsigned int mask; /* COR/UNCOR Error Mask */ >>> + unsigned int anfe_status; /* UNCOR Error Status for ANFE */ >>> struct pcie_tlp_log tlp; /* TLP Header */ >>> }; >>> >>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c >>> index ac6293c24976..f2839b51321a 100644 >>> --- a/drivers/pci/pcie/aer.c >>> +++ b/drivers/pci/pcie/aer.c >>> @@ -107,6 +107,12 @@ struct aer_stats { >>> PCI_ERR_ROOT_MULTI_COR_RCV | >> \ >>> PCI_ERR_ROOT_MULTI_UNCOR_RCV) >>> >>> +#define AER_ERR_ANFE_UNC_MASK >> (PCI_ERR_UNC_POISON_TLP | \ >>> + PCI_ERR_UNC_COMP_TIME | >> \ >>> + PCI_ERR_UNC_COMP_ABORT | >> \ >>> + PCI_ERR_UNC_UNX_COMP | >> \ >>> + PCI_ERR_UNC_UNSUP) >>> + >>> static int pcie_aer_disable; >>> static pci_ers_result_t aer_root_reset(struct pci_dev *dev); >>> >>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned >> int bus, unsigned int devfn, >>> EXPORT_SYMBOL_GPL(aer_recover_queue); >>> #endif >>> >>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info >> *info) >>> +{ >>> + u32 uncor_mask, uncor_status, anfe_status; >>> + u16 device_status; >>> + int aer = dev->aer_cap; >>> + >>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, >> &uncor_status); >>> + pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, >> &uncor_mask); >>> + /* >>> + * According to PCIe Base Specification Revision 6.1, >>> + * Section 6.2.3.2.4, if an UNCOR error is raised as >>> + * Advisory Non-Fatal error, it will match the following >>> + * conditions: >>> + * a. The severity of the error is Non-Fatal. >>> + * b. The error is one of the following: >>> + * 1. Poisoned TLP (Section 6.2.3.2.4.3) >>> + * 2. Completion Timeout (Section 6.2.3.2.4.4) >>> + * 3. Completer Abort (Section 6.2.3.2.4.1) >>> + * 4. Unexpected Completion (Section 6.2.3.2.4.5) >>> + * 5. Unsupported Request (Section 6.2.3.2.4.1) >>> + */ >>> + anfe_status = uncor_status & ~uncor_mask & ~info->severity & >>> + AER_ERR_ANFE_UNC_MASK; >>> + >>> + if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, >> &device_status)) >>> + return; >>> + /* >>> + * Take the most conservative route here. If there are Non-Fatal >> errors >>> + * detected, do not assume any bit in uncor_status is set by ANFE. >>> + */ >>> + if (device_status & PCI_EXP_DEVSTA_NFED) >>> + return; >> You can move this check to the top of the function. You don't need to check >> the rest if NFE error is detected in device status. > The v3 just worked that way. Jonathan pointed a race that NFE triggered after > the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS > can avoid the race. > > See https://lkml.org/lkml/2024/4/22/1011 for discussion details. Got it. I would recommend adding a comment about it in handler. May be some thing like, /* * To avoid race between device status read and error status register read, cache * uncorrectable error status before checking for NFE in device status * register. */ > > Thanks > Zhenzhong > >>> + >>> + /* >>> + * If there is another ANFE between reading uncor_status and >> clearing >>> + * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE >> isn't >>> + * recorded in info->anfe_status. It will be read out as NFE in >>> + * following uncor_status register reading and processed by NFE >>> + * handler. >>> + */ >>> + info->anfe_status = anfe_status; >>> +} >>> + >>> /** >>> * aer_get_device_error_info - read error status from dev and store it to >> info >>> * @dev: pointer to the device expected to have a error record >>> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev >> *dev, struct aer_err_info *info) >>> /* Must reset in this function */ >>> info->status = 0; >>> + info->anfe_status = 0; >>> info->tlp_header_valid = 0; >>> >>> /* The device might not support AER */ >>> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev >> *dev, struct aer_err_info *info) >>> &info->mask); >>> if (!(info->status & ~info->mask)) >>> return 0; >>> + >>> + if (info->status & PCI_ERR_COR_ADV_NFAT) >>> + anfe_get_uc_status(dev, info); >>> } else if (type == PCI_EXP_TYPE_ROOT_PORT || >>> type == PCI_EXP_TYPE_RC_EC || >>> type == PCI_EXP_TYPE_DOWNSTREAM || >> -- >> Sathyanarayanan Kuppuswamy >> Linux Kernel Developer -- Sathyanarayanan Kuppuswamy Linux Kernel Developer