On Friday 22 October 2021 10:04:36 Florian Fainelli wrote:
> On 10/5/21 7:07 PM, Florian Fainelli wrote:
> >
> >
> > On 10/5/2021 3:25 PM, Jeremy Linton wrote:
> >> Hi,
> >>
> >> On 10/5/21 2:43 PM, Pali Rohár wrote:
> >>> Hello!
> >>>
> >>> On Tuesday 05 October 2021 10:57:18 Jeremy Linton wrote:
> >>>> Hi,
> >>>>
> >>>> On 10/5/21 10:32 AM, Bjorn Helgaas wrote:
> >>>>> On Thu, Aug 26, 2021 at 02:15:55AM -0500, Jeremy Linton wrote:
> >>>>>> Additionally, some basic bus/device filtering exist to avoid sending
> >>>>>> config transactions to invalid devices on the RP's primary or
> >>>>>> secondary bus. A basic link check is also made to assure that
> >>>>>> something is operational on the secondary side before probing the
> >>>>>> remainder of the config space. If either of these constraints are
> >>>>>> violated and a config operation is lost in the ether because an EP
> >>>>>> doesn't respond an unrecoverable SERROR is raised.
> >>>>>
> >>>>> It's not "lost"; I assume the root port raises an error because it
> >>>>> can't send a transaction over a link that is down.
> >>>>
> >>>> The problem is AFAIK because the root port doesn't do that.
> >>>
> >>> Interesting! Does it mean that PCIe Root Complex / Host Bridge (which I
> >>> guess contains also logic for Root Port) does not signal transaction
> >>> failure for config requests? Or it is just your opinion? Because I'm
> >>> dealing with similar issues and I'm trying to find a way how to detect
> >>> if some PCIe IP signal transaction error via AXI SLVERR response OR it
> >>> just does not send any response back. So if you know some way how to
> >>> check which one it is, I would like to know it too.
> >>
> >> This is my _opinion_ based on what I've heard of some other IP
> >> integration issues, and what i've seen poking at this one from the
> >> perspective of a SW guy rather than a HW guy. So, basically worthless.
> >> But, you should consider that most of these cores/interconnects aren't
> >> aware of PCIe completion semantics so its the root ports
> >> responsibility to say, gracefully translate a non-posted write that
> >> doesn't have a completion for the interconnects its attached to,
> >> rather than tripping something generic like a SLVERR.
> >>
> >> Anyway, for this I would poke around the pile of exception registers,
> >> with your specific processors manual handy because a lot of them are
> >> implementation defined.
> >
> > I should be able to get you an answer in the new few days whether
> > configuration space requests also generate an error towards the ARM CPU,
> > since memory space requests most definitively do.
>
> Did not get an answer from the design team, but going through our bug
> tracker, there were evidences of configuration space accesses also
> generating external aborts:
>
> [ 8.988237] Unhandled fault: synchronous external abort (0x96000210) at 0xffffff8009539004
> [ 9.026698] PC is at pci_generic_config_read32+0x30/0xb0

So this is an error caused by reading from config space. Can you check
whether writing to config space can also trigger a crash? If so, I would
like to know whether the write would also cause a synchronous abort, or
rather an asynchronous one.
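
For reference, the syndrome value printed in that log can be decoded by hand, which also shows what to look for when testing writes. Below is a minimal sketch, assuming the value in parentheses is the raw ESR_ELx and using the data abort field layout from the Arm ARM. The macro and helper names are my own, not taken from the kernel, and whether WnR is reported reliably for external aborts on this particular core is something I have not checked.

```c
#include <stdint.h>
#include <stdio.h>

/* ESR_ELx fields relevant to a data abort (layout per the Arm ARM). */
#define ESR_EC(esr)   (((esr) >> 26) & 0x3fu) /* exception class */
#define ESR_WNR(esr)  (((esr) >> 6) & 0x1u)   /* 0 = read, 1 = write */
#define ESR_DFSC(esr) ((esr) & 0x3fu)         /* data fault status code */

#define EC_DABT_CUR_EL 0x25u /* data abort without a change in exception level */
#define EC_SERROR      0x2fu /* SError interrupt, i.e. an asynchronous abort */
#define DFSC_SYNC_EXT  0x10u /* synchronous external abort, not on a table walk */

/* Hypothetical helper: nonzero for a synchronous external data abort. */
static inline int esr_is_sync_external_abort(uint32_t esr)
{
	return ESR_EC(esr) == EC_DABT_CUR_EL && ESR_DFSC(esr) == DFSC_SYNC_EXT;
}

/* Hypothetical helper: nonzero if the exception was reported as an SError. */
static inline int esr_is_serror(uint32_t esr)
{
	return ESR_EC(esr) == EC_SERROR;
}

/* Hypothetical helper: print the decoded fields of one syndrome value. */
static inline void esr_print(uint32_t esr)
{
	printf("esr=0x%08x EC=0x%02x DFSC=0x%02x WnR=%u\n",
	       (unsigned int)esr, (unsigned int)ESR_EC(esr),
	       (unsigned int)ESR_DFSC(esr), (unsigned int)ESR_WNR(esr));
}
```

For 0x96000210 this gives EC=0x25, DFSC=0x10 and WnR=0, which is consistent with a read in pci_generic_config_read32(). If a config write aborts synchronously, I would expect the same EC and DFSC with WnR=1. If it is asynchronous, it should instead show up as an SError (EC=0x2f) with a PC that no longer points at the faulting store.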