Hi Kishon, > -----Original Message----- > From: Kishon Vijay Abraham I <kishon@xxxxxx> > Sent: 2020年10月1日 21:32 > To: Rob Herring <robh@xxxxxxxxxx> > Cc: Gustavo Pimentel <Gustavo.Pimentel@xxxxxxxxxxxx>; Z.q. Hou > <zhiqiang.hou@xxxxxxx>; Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>; > linux-kernel@xxxxxxxxxxxxxxx; PCI <linux-pci@xxxxxxxxxxxxxxx>; Bjorn > Helgaas <bhelgaas@xxxxxxxxxx>; Michael Walle <michael@xxxxxxxx>; Ard > Biesheuvel <ardb@xxxxxxxxxx> > Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus of > dw_child_pcie_ops > > Hi Rob, > > On 30/09/20 8:31 pm, Rob Herring wrote: > > On Wed, Sep 30, 2020 at 8:22 AM Kishon Vijay Abraham I <kishon@xxxxxx> > wrote: > >> > >> Hi, > >> > >> On 29/09/20 10:41 pm, Rob Herring wrote: > >>> On Tue, Sep 29, 2020 at 10:24 AM Gustavo Pimentel > >>> <Gustavo.Pimentel@xxxxxxxxxxxx> wrote: > >>>> > >>>> On Tue, Sep 29, 2020 at 5:5:41, Z.q. Hou <zhiqiang.hou@xxxxxxx> > wrote: > >>>> > >>>>> Hi Lorenzo, > >>>>> > >>>>> Thanks a lot for your comments! > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> > >>>>>> Sent: 2020年9月28日 17:39 > >>>>>> To: Z.q. Hou <zhiqiang.hou@xxxxxxx> > >>>>>> Cc: Rob Herring <robh@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; > >>>>>> PCI <linux-pci@xxxxxxxxxxxxxxx>; Bjorn Helgaas > >>>>>> <bhelgaas@xxxxxxxxxx>; Gustavo Pimentel > >>>>>> <gustavo.pimentel@xxxxxxxxxxxx>; Michael Walle > >>>>>> <michael@xxxxxxxx>; Ard Biesheuvel <ardb@xxxxxxxxxx> > >>>>>> Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus of > >>>>>> dw_child_pcie_ops > >>>>>> > >>>>>> On Thu, Sep 24, 2020 at 04:24:47AM +0000, Z.q. Hou wrote: > >>>>>>> Hi Rob, > >>>>>>> > >>>>>>> Thanks a lot for your comments! > >>>>>>> > >>>>>>>> -----Original Message----- > >>>>>>>> From: Rob Herring <robh@xxxxxxxxxx> > >>>>>>>> Sent: 2020年9月18日 23:28 > >>>>>>>> To: Z.q. Hou <zhiqiang.hou@xxxxxxx> > >>>>>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx; PCI > >>>>>>>> <linux-pci@xxxxxxxxxxxxxxx>; Lorenzo Pieralisi > >>>>>>>> <lorenzo.pieralisi@xxxxxxx>; Bjorn Helgaas > >>>>>>>> <bhelgaas@xxxxxxxxxx>; Gustavo Pimentel > >>>>>>>> <gustavo.pimentel@xxxxxxxxxxxx>; Michael Walle > >>>>>> <michael@xxxxxxxx>; > >>>>>>>> Ard Biesheuvel <ardb@xxxxxxxxxx> > >>>>>>>> Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus > >>>>>>>> of dw_child_pcie_ops > >>>>>>>> > >>>>>>>> On Fri, Sep 18, 2020 at 5:02 AM Z.q. Hou > <zhiqiang.hou@xxxxxxx> > >>>>>> wrote: > >>>>>>>>> > >>>>>>>>> Hi Rob, > >>>>>>>>> > >>>>>>>>> Thanks a lot for your comments! > >>>>>>>>> > >>>>>>>>>> -----Original Message----- > >>>>>>>>>> From: Rob Herring <robh@xxxxxxxxxx> > >>>>>>>>>> Sent: 2020年9月17日 4:29 > >>>>>>>>>> To: Z.q. Hou <zhiqiang.hou@xxxxxxx> > >>>>>>>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx; PCI > >>>>>>>>>> <linux-pci@xxxxxxxxxxxxxxx>; Lorenzo Pieralisi > >>>>>>>>>> <lorenzo.pieralisi@xxxxxxx>; Bjorn Helgaas > >>>>>>>>>> <bhelgaas@xxxxxxxxxx>; Gustavo Pimentel > >>>>>>>>>> <gustavo.pimentel@xxxxxxxxxxxx>; Michael Walle > >>>>>>>> <michael@xxxxxxxx>; > >>>>>>>>>> Ard Biesheuvel <ardb@xxxxxxxxxx> > >>>>>>>>>> Subject: Re: [PATCH] PCI: dwc: Added link up check in map_bus > >>>>>>>>>> of dw_child_pcie_ops > >>>>>>>>>> > >>>>>>>>>> On Tue, Sep 15, 2020 at 11:49 PM Zhiqiang Hou > >>>>>>>> <Zhiqiang.Hou@xxxxxxx> > >>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>> From: Hou Zhiqiang <Zhiqiang.Hou@xxxxxxx> > >>>>>>>>>>> > >>>>>>>>>>> On NXP Layerscape platforms, it results in SError in the > >>>>>>>>>>> enumeration of the PCIe controller, which is not connecting > >>>>>>>>>>> with an Endpoint device. And it doesn't make sense to > >>>>>>>>>>> enumerate the Endpoints when the PCIe link is down. So this > >>>>>>>>>>> patch added the link up check to avoid to fire configuration > >>>>>> transactions on link down bus. > >>>>>>>>>> > >>>>>>>>>> Michael reported the same issue as well. > >>>>>>>>>> > >>>>>>>>>> What happens if the link goes down between the check and the > >>>>>> access? > >>>>>>>>> > >>>>>>>>> This patch cannot cover this case, and will get the SError. > >>>>>>>>> But I think it makes sense to avoid firing transactions on link > down bus. > >>>>>>>> > >>>>>>>> That's impossible to do without a race even in h/w. > >>>>>>> > >>>>>>> Agree. > >>>>>>> > >>>>>>>> > >>>>>>>>>> It's a racy check. I'd like to find an alternative solution. > >>>>>>>>>> It's even worse if Layerscape is used in ECAM mode. I looked > >>>>>>>>>> at the EDK2 setup for layerscape[1] and it looks like root > >>>>>>>>>> ports are just skipped if link > >>>>>>>> is down. > >>>>>>>>>> Maybe a link down just never happens once up, but if so, then > >>>>>>>>>> we only need to check it once and fail probe. > >>>>>>>>> > >>>>>>>>> Many customers connect the FPGA Endpoint, which may > establish > >>>>>>>>> PCIe link after the PCIe enumeration and then rescan the PCIe > >>>>>>>>> bus, so I think it should not exit the probe of root port even > >>>>>>>>> if there is not link up > >>>>>>>> during enumeration. > >>>>>>>> > >>>>>>>> That's a good reason. I want to unify the behavior here as it > >>>>>>>> varies per platform currently and wasn't sure which way to go. > >>>>>>>> > >>>>>>>> > >>>>>>>>>> I've dug into this a bit more and am curious about the > >>>>>>>>>> PCIE_ABSERR register setting which is set to: > >>>>>>>>>> > >>>>>>>>>> #define PCIE_ABSERR_SETTING 0x9401 /* Forward error of > >>>>>>>>>> non-posted request */ > >>>>>>>>>> > >>>>>>>>>> It seems to me this is not what we want at least for config > >>>>>>>>>> accesses, but commit 84d897d6993 where this was added > seems > >>>>>>>>>> to say otherwise. Is it not possible to configure the > >>>>>>>>>> response per access > >>>>>> type? > >>>>>>>>> > >>>>>>>>> Thanks a lot for your investigation! > >>>>>>>>> The story is like this: Some customers worry about these > >>>>>>>>> silent error (DWC PCIe IP won't forward the error of outbound > >>>>>>>>> non-post request by default), so we were pushed to enable the > >>>>>>>>> error forwarding to AXI in the commit > >>>>>>>>> 84d897d6993 as you saw. But it cannot differentiate the config > >>>>>>>>> transactions from the MEM_rd, except the Vendor ID access, > >>>>>>>>> which is controlled by a separate bit and it was set to not > >>>>>>>>> forward error of access > >>>>>>>> of Vendor ID. > >>>>>>>>> So we think it's okay to enable the error forwarding, the > >>>>>>>>> SError should not occur, because after the enumeration it > >>>>>>>>> won't access the > >>>>>>>> non-existent functions. > >>>>>>>> > >>>>>>>> We've rejected upstream support for platforms aborting on > >>>>>>>> config accesses[1]. I think there's clear consensus that > >>>>>>>> aborting is the wrong behavior. > >>>>>>>> > >>>>>>>> Do MEM_wr errors get forwarded? Seems like that would be > enough. > >>>>>>>> Also, wouldn't page faults catch most OOB accesses anyways? > You > >>>>>>>> need things page aligned anyways with an IOMMU and doing > >>>>>>>> userspace access or guest assignment. > >>>>>>> > >>>>>>> Yes, errors of MEM_wr can be forwarded. > >>>>>>> > >>>>>>>> > >>>>>>>> Here's another idea, how about only enabling forwarding errors > >>>>>>>> if the link is up? If really would need to be configured any > >>>>>>>> time the link state changes rather than just at probe. I'm not > >>>>>>>> sure if you have a way to disable it on link down though. > >>>>>>> > >>>>>>> Dug deeper into this issue and found the setting of not > >>>>>>> forwarding error of non-existent Vender ID access counts on the > link partner: 1. > >>>>>>> When there is a link partner (namely link up), it will return > >>>>>>> 0xffff when read non-existent function Vendor ID and won't > >>>>>>> forward error to AXI. 2. When no link partner (link down), it > >>>>>>> will forward the error of reading non-existent function Vendor > >>>>>>> ID to AXI and result in SError. > >>>>>>> > >>>>>>> I think this is a DWC PCIe IP specific issue but not get > >>>>>>> feedback from design team. I'm thinking to disable this error > >>>>>>> forwarding just like other platforms, since when these errors > >>>>>>> (UR, CA and CT) are detected, AER driver can also report the error > and try to recover. > >>>>>> > >>>>>> I take this as you shall send a patch to fix this issue shortly, is this > correct ? > >>>>> > >>>>> The issue becomes complex: > >>>>> I reviewed the DWC PCIe databook of verion 4.40a which is used on > Layerscape platforms, and it said that " Your RC application should not > generate CFG requests until it has confirmed that the link is up by sampling > the smlh_link_up and rmlh_link_up outputs". > >>>>> So, the link up checking should not be remove before each outbound > CFG access. > >>>>> Gustavo, can you share more details on the link up checking? Does it > only exist in the 4.40a? > >>>> > >>>> Hi Zhiqiang, > >>>> > >>>> According to the information that I got from the IP team you are > >>>> correct, the same requirement still exists on the newer IP versions. > >>> > >>> How is that possible in a race free way? > >>> > >>> Testing on meson and layerscape (with the forwarding of errors > >>> disabled) shows a link check is not needed. But then dra7xx seems to > >>> need one (or has some f/w setup). > >> > >> Yeah, I don't see any registers in the DRA7x PCIe wrapper for > >> disabling error forwarding. > > > > It's a DWC port logic register AFAICT, but perhaps not present in all > versions. > > Okay. I see there's a register PCIECTRL_PL_AXIS_SLV_ERR_RESP which has a > reset value of 0. > > It has four bit-fields, RESET_TIMEOUT_ERR_MAP, NO_VID_ERR_MAP, > DBI_ERR_MAP and SLAVE_ERR_MAP. I'm not seeing any difference in > behavior if I set all these bits. Maybe it requires platform support too. I'll > check this with our design team. In DWC v4.40a databook, there is a bit AMBA_ERROR_RESPONSE_GLOBAL which controls if enable the error forwarding. The *MAP bits only determine which error (SLVERR or DECERR) will be forwarded to AXI/AHB bus. Thanks, Zhiqiang > > Meanwhile would it be okay to add linkup check atleast for DRA7X so that > we could have it booting in linux-next? > > Thanks > Kishon