On 8/6/2023 2:43 PM, Pali Rohár wrote:
On Sunday 06 August 2023 06:44:50 Lukas Wunner wrote:
The Broadcom Set Top Box PCIe controller signals an Asynchronous SError Interrupt
This is slightly incorrect wording. The PCIe controller cannot send an Async SError; that is an ARMv8-specific thing. In this case the PCIe controller is connected to the ARM core via the AXI bus, and on a PCIe transaction timeout it sends an AXI Slave Error, which the ARMv8 core then reports to the kernel as an Async SError interrupt.
That is indeed a better way to explain the issue. FWIW, on BCM2711 the PCIe core connects via SCB and then AXI towards the ARMv8 CPU, which does not change anything about your paragraph.
The proper fix is to configure the PCIe controller to never send an AXI Slave Error or an AXI Decode Error (to prevent SErrors entirely). For example, Synopsys PCIe controllers have proprietary hidden configuration bits for enabling/disabling this AXI error reporting behavior.
That does not exist with the version of the block present in BCM2711 unfortunately.
Or the second option is to access the affected memory from the ARMv8 core via synchronous operations and map the memory as nGnRnE. Then the ARMv8 core reports the AXI Slave Error as a Synchronous Abort Exception, which you can catch, check that it was caused by an access to the PCIe memory region, and fabricate an all-ones response. But the second option is not available for some licensed ARMv8 Cortex cores (e.g. A53), as they do not implement nE (non Early Write Acknowledgement) memory mapping correctly.
BCM2711 uses Cortex-A72. Do these cores still not implement nE correctly? Do you have a reference backing up that claim? (Not disputing it, just curious.)
The patch below does not fix the issue at all; instead it opens a new race condition if the link state changes _after_ the check and _before_ the config space access.
Fair enough, but in the situation you describe there is not much that can be done anyway so we might as well deal with a narrowed window?
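To make the window under discussion concrete, here is a minimal userspace sketch in C. It is not the brcmstb driver code: struct fake_pcie, fake_raw_cfg_read(), fake_guarded_cfg_read() and fake_drop_link() are all hypothetical names invented for illustration, and a failed raw read merely stands in for the AXI Slave Error that the real hardware would turn into an SError.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a controller; not the real hardware or driver. */
struct fake_pcie {
	bool link_up;
	uint32_t cfg[16];
};

/*
 * Models a raw config space read. Returning false stands in for the
 * AXI Slave Error (reported as an SError on real hardware) that occurs
 * when the access is made while the link is down.
 */
static bool fake_raw_cfg_read(const struct fake_pcie *p, unsigned int idx,
			      uint32_t *val)
{
	if (!p->link_up)
		return false;
	*val = p->cfg[idx % 16];
	return true;
}

/*
 * Check-then-access read. The optional 'between' hook runs after the
 * link check and before the access, i.e. inside the race window.
 * '*faulted' is set when the access itself hit a down link.
 */
static uint32_t fake_guarded_cfg_read(struct fake_pcie *p, unsigned int idx,
				      void (*between)(struct fake_pcie *),
				      bool *faulted)
{
	uint32_t val = 0xffffffffu;

	*faulted = false;

	/* Link already down: fabricate all-ones without touching the bus. */
	if (!p->link_up)
		return val;

	/* Race window: the link may drop here. */
	if (between)
		between(p);

	if (!fake_raw_cfg_read(p, idx, &val)) {
		*faulted = true;	/* would be fatal on real hardware */
		val = 0xffffffffu;
	}
	return val;
}

/* Simulates the link going down inside the window. */
static void fake_drop_link(struct fake_pcie *p)
{
	p->link_up = false;
}
```

Calling fake_guarded_cfg_read() with fake_drop_link as the hook exercises the case Pali describes: the check passes, the link drops, and the access still faults. Without the hook, a down link is caught by the check and the read returns all-ones without faulting, which is the narrowed window.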
Thanks for reviewing.
--
Florian