On Thu, 12 Dec 2019 21:07:24 +0000
Andrew Murray <andrew.murray@xxxxxxx> wrote:

Hi,

> On Tue, Dec 10, 2019 at 08:41:15AM -0600, Bjorn Helgaas wrote:
> > On Mon, Dec 09, 2019 at 04:06:38PM +0000, Andre Przywara wrote:

[ ... ]

> > Even ECAM compliance is not really minor -- if this controller were
> > fully compliant with the spec, you would need ZERO Linux changes to
> > support it.  Every quirk like this means additional maintenance
> > burden, and it's not just a one-time thing.  It means old kernels that
> > *should* "just work" on your system will not work unless somebody
> > backports the quirk.
>
> With regards to URs resulting in unwanted aborts or similar - this seems
> to be a very common theme amongst ARM PCI controller drivers. For example
> both ARM32 imx6 and ARM32 keystone have fault handlers to handle an abort
> and fabricate a 0xffffffff read value.
>
> The ARM32 rcar driver, whilst it doesn't appear to produce an abort, does
> read the PCI_STATUS register after making a config read to determine if
> any aborts have happened - in which case it reports
> PCIBIOS_DEVICE_NOT_FOUND.
>
> And as recently reported [1], the rockchip driver also appears to produce
> aborts.
>
> I suspect that this ARM64 controller driver won't be the last either. Thus
> any solution here may form the basis of copy-cat solutions for subsequent
> controllers.

Well, I think Bjorn is aware of them, but was actually hoping that those
broken controllers would go away at some point ;-)
And just to make this clear: I would categorise this issue as an
integration bug, which just can't be fixed in hardware or firmware
easily. It was never meant to be this way.
So I am not sure we should promote generic solutions here.

> From my understanding of the issues, the ARM64 serrors are imprecise and
> as a result there isn't a sensible way of using them to determine that a
> read is a UR. So where there are no other solutions to suppress the
> generation of an abort by the controller, the only solutions that seem to
> exist are 1) pre-scan the devices in firmware and only talk to those devices
> in Linux - a safe option but limiting - perhaps with side effects for CRS
> and 2) the approach rcar takes in using the PCI_STATUS register - though
> you'd end up having to mask the serror (PSTATE.A) for a limited period of
> time - a risky option (you'll miss real serrors) - but with no side effects.
>
> (I don't know if option 2 is feasible in this case by the way).

Interesting, we might evaluate this, but mostly out of curiosity or for
debugging. I don't think it's really a better option. If there is a safe
way of making this work in the majority of cases, that should be the way
to go. Setting PSTATE.A sounds quite wacky to me.

Thanks,
Andre.

> [1] https://lore.kernel.org/linux-pci/2a381384-9d47-a7e2-679c-780950cd862d@xxxxxxxxxxxxxx/2-0001-WFT-PCI-rockchip-play-game-with-unsupported-request-.patch
>
> Thanks,
>
> Andrew Murray
>
> >
> > > This allows the Arm Neoverse N1SDP board to boot Linux without crashing
> > > and to access *any* devices (there are no platform devices except UART).
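
For readers following the thread, the PCI_STATUS approach attributed to
the rcar driver above can be modelled as a small user-space sketch. This
is not the rcar driver's code: struct fake_host, hw_cfg_read() and
checked_cfg_read() are hypothetical names, and the assumption that an
unanswered config read latches the Received Master Abort bit is a
simplification made for illustration. The constants use the values from
the Linux PCI headers. Masking SErrors via PSTATE.A is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Linux PCI headers. */
#define PCI_STATUS_REC_MASTER_ABORT 0x2000
#define PCIBIOS_SUCCESSFUL          0x00
#define PCIBIOS_DEVICE_NOT_FOUND    0x86

#define FAKE_NUM_DEVS 32

/* Hypothetical model of a host bridge, for illustration only. */
struct fake_host {
	uint16_t status;              /* the bridge's own PCI_STATUS */
	bool present[FAKE_NUM_DEVS];  /* which device numbers respond */
	uint32_t cfg[FAKE_NUM_DEVS];  /* one config dword per device */
};

/*
 * Simulated hardware access. ASSUMPTION: a read to an absent device
 * latches Received Master Abort and returns meaningless data.
 */
static uint32_t hw_cfg_read(struct fake_host *h, unsigned int dev)
{
	if (dev >= FAKE_NUM_DEVS || !h->present[dev]) {
		h->status |= PCI_STATUS_REC_MASTER_ABORT;
		return 0xdeadbeef;
	}
	return h->cfg[dev];
}

/*
 * Config read that inspects the status bit afterwards: on an abort it
 * fabricates 0xffffffff and reports PCIBIOS_DEVICE_NOT_FOUND.
 */
static int checked_cfg_read(struct fake_host *h, unsigned int dev,
			    uint32_t *val)
{
	uint32_t data;

	assert(h && val);

	/* Clear any stale abort indication before the access. */
	h->status &= (uint16_t)~PCI_STATUS_REC_MASTER_ABORT;

	data = hw_cfg_read(h, dev);

	if (h->status & PCI_STATUS_REC_MASTER_ABORT) {
		*val = 0xffffffffu;
		return PCIBIOS_DEVICE_NOT_FOUND;
	}

	*val = data;
	return PCIBIOS_SUCCESSFUL;
}
```

In this model a caller scanning the bus would treat
PCIBIOS_DEVICE_NOT_FOUND as "nothing here" and move on. On real hardware
the abort would also have to be kept from reaching the CPU, which is the
risky PSTATE.A masking discussed above.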