On 2/26/24 8:33 AM, Bjorn Helgaas wrote: > On Mon, Feb 26, 2024 at 07:46:05AM -0800, Kuppuswamy Sathyanarayanan wrote: >> On 2/26/24 7:18 AM, Bjorn Helgaas wrote: >>> On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote: >>>> On 2/22/24 2:15 PM, Bjorn Helgaas wrote: >>>>> From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> >>>>> >>>>> When booting with "pci=noaer", we don't request control of AER, but we >>>>> previously *did* request control of DPC, as in the dmesg log attached at >>>>> the bugzilla below: >>>>> >>>>> Command line: ... pci=noaer >>>>> acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3] >>>>> acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC] >>>>> >>>>> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which >>>>> says: >>>>> >>>>> If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it >>>>> must also set bit 7 of the Support field (indicating support for Error >>>>> Disconnect Recover notifications) and bits 3 and 4 of the Control field >>>>> (requesting control of PCI Express Advanced Error Reporting and the PCI >>>>> Express Capability Structure). >>>> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies >>>> Between _OSC Control Bits". >>>> >>>> Because handling of Downstream Port Containment has a dependency on >>>> Advanced Error Reporting, the operating system is required to >>>> request control over Advanced Error Reporting (bit 3 of the Control >>>> field) while requesting control over Downstream Port Containment >>>> Configuration (bit 7 of the Control field). If the operating system >>>> attempts to claim control of Downstream Port Containment >>>> Configuration without also claiming control over Advanced Error >>>> Reporting, firmware is required to refuse control of the feature >>>> being illegally claimed and mask the corresponding bit. Firmware is >>>> required to maintain ownership of Advanced Error Reporting if it >>>> retains ownership of Downstream Port Containment Configuration. If >>>> the operating system sets bit 7 of the Control field, it must set >>>> bit 7 of the Support field, indicating support for the Error >>>> Disconnect Recover event. >>> So I guess you're suggesting that there are two defects here? >>> >>> 1) Linux requested DPC control without requesting AER control. >>> >>> 2) Platform granted DPC control when it shouldn't have. >>> >>> I do agree with that, but obviously we can only fix 1) in Linux. >> Sorry, maybe my comment was not clear. I was just suggesting to >> change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3, >> sec 4.5.2.4 "Dependencies Between _OSC Control Bits". > The requirement that the OS request AER control whenever it requests > DPC control is mentioned in both sec 4.5.1 and sec 4.5.2.4. IMO sec > 4.5.2.4 should not exist because the per-bit table in sec 4.5.1 is a > better place for implementation guidance. 4.5.2.4 is easy to miss, > mostly redundant, and hard to integrate with the 4.5.1 table. > > What advantage do you see for citing 4.5.2.4 instead of 4.5.1? The > only real difference I see is that it also points out a firmware > problem. I don't think the extra text is worth it since it doesn't > motivate the Linux change. My thinking is, since the fix is related to the dependency between _OSC control bits (AER & DPC), and there is a special section in the spec which discuss it, I thought it is better to quote it. But I get your point. I think either if fine. > > Bjorn -- Sathyanarayanan Kuppuswamy Linux Kernel Developer