On Fri, 30 Jul 2021 at 16:44, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2021-07-30 15:34, Anders Roxell wrote:
> > On 2021-07-30 13:17, Robin Murphy wrote:
> >> On 2021-07-30 12:35, Anders Roxell wrote:
> >>> From: Robin Murphy <robin.murphy@xxxxxxx>
> >>>
> >>>> Now that PCI inbound window restrictions are handled generically between
> >>>> the of_pci resource parsing and the IOMMU layer, and described in the
> >>>> Juno DT, we can finally enable the PCIe SMMU without the risk of DMA
> >>>> mappings inadvertently allocating unusable addresses.
> >>>>
> >>>> Similarly, the relevant support for IOMMU mappings for peripheral
> >>>> transfers has been hooked up in the pl330 driver for ages, so we can
> >>>> happily enable the DMA SMMU without that breaking anything either.
> >>>>
> >>>> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> >>>
> >>> When we build a kernel with 64k page size and run the ltp syscalls we
> >>> sporadically see a kernel crash while doing a mkfs on a connected SATA
> >>> drive. This is happening every third test run on any juno-r2 device in
> >>> the lab with the same kernel image (stable-rc 5.13.y, mainline and next)
> >>> with gcc-11.
> >>
> >> Hmm, I guess 64K pages might make a difference in that we'll chew through
> >> IOVA space a lot faster with small mappings...
> >>
> >> I'll have to try to reproduce this locally, since the interesting thing
> >> would be knowing what DMA address it was trying to use that went wrong, but
> >> IOMMU tracepoints and/or dma-debug are going to generate an crazy amount of
> >> data to sift through and try to correlate - having done it before it's not
> >> something I'd readily ask someone else to do for me :)
> >>
> >> On a hunch, though, does it make any difference if you remove the first
> >> entry from the PCIe "dma-ranges" (the 0x2c1c0000 one)?
> >
> > I did this change, and run the job 7 times and could not reproduce the
> > issue.
>
> Thanks! And hold that thought; if it works then I suspect it probably is
> the best fix, but I'll double-check and write it up properly next week.

I just want to send a friendly reminder about this issue, since I haven't
seen a patch for this. We still see the issue on v5.13.y and above.
Or have I missed anything?

Cheers,
Anders

>
> Cheers,
> Robin.
>
> > diff --git a/arch/arm64/boot/dts/arm/juno-base.dtsi b/arch/arm64/boot/dts/arm/juno-base.dtsi
> > index 8e7a66943b01..d3148730e951 100644
> > --- a/arch/arm64/boot/dts/arm/juno-base.dtsi
> > +++ b/arch/arm64/boot/dts/arm/juno-base.dtsi
> > @@ -545,8 +545,7 @@ pcie_ctlr: pcie@40000000 {
> >                          <0x02000000 0x00 0x50000000 0x00 0x50000000 0x0 0x08000000>,
> >                          <0x42000000 0x40 0x00000000 0x40 0x00000000 0x1 0x00000000>;
> >                 /* Standard AXI Translation entries as programmed by EDK2 */
> > -               dma-ranges = <0x02000000 0x0 0x2c1c0000 0x0 0x2c1c0000 0x0 0x00040000>,
> > -                            <0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
> > +               dma-ranges = <0x02000000 0x0 0x80000000 0x0 0x80000000 0x0 0x80000000>,
> >                              <0x43000000 0x8 0x00000000 0x8 0x00000000 0x2 0x00000000>;
> >                 #interrupt-cells = <1>;
> >                 interrupt-map-mask = <0 0 0 7>;
> >
> >
> > Cheers,
> > Anders
> >
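
For anyone following along, here is a rough sketch of what the tested diff changes in terms of inbound windows. It is only an illustration, not kernel code: it assumes each "dma-ranges" entry is 7 cells (3 for the PCI address, 2 for the parent address, 2 for the size), which matches the tuples quoted above, and the names decode_dma_ranges, DmaRange, BEFORE and AFTER are made up for this example.

```python
from typing import List, NamedTuple


class DmaRange(NamedTuple):
    flags: int     # phys.hi cell of the PCI address (address space code etc.)
    pci_addr: int  # 64-bit PCI bus address
    cpu_addr: int  # 64-bit parent (CPU) address
    size: int      # 64-bit window size in bytes


def decode_dma_ranges(cells: List[int]) -> List[DmaRange]:
    """Split a flat cell list into 7-cell entries (assumed layout, see above)."""
    if len(cells) % 7 != 0:
        raise ValueError("expected a multiple of 7 cells")
    ranges = []
    for i in range(0, len(cells), 7):
        hi, pci_mid, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo = cells[i:i + 7]
        ranges.append(DmaRange(
            flags=hi,
            pci_addr=(pci_mid << 32) | pci_lo,
            cpu_addr=(cpu_hi << 32) | cpu_lo,
            size=(size_hi << 32) | size_lo,
        ))
    return ranges


# Values copied from the diff quoted above.
BEFORE = [
    0x02000000, 0x0, 0x2c1c0000, 0x0, 0x2c1c0000, 0x0, 0x00040000,
    0x02000000, 0x0, 0x80000000, 0x0, 0x80000000, 0x0, 0x80000000,
    0x43000000, 0x8, 0x00000000, 0x8, 0x00000000, 0x2, 0x00000000,
]
AFTER = BEFORE[7:]  # the tested change drops only the first entry


def describe(r: DmaRange) -> str:
    """Render one window as an inclusive PCI address range."""
    end = r.pci_addr + r.size - 1
    return f"{r.pci_addr:#x}-{end:#x} ({r.size:#x} bytes)"


if __name__ == "__main__":
    for label, cells in (("before", BEFORE), ("after", AFTER)):
        print(label)
        for r in decode_dma_ranges(cells):
            print("  ", describe(r))
```

Read this way, the entry being removed is a 256 KiB window at 0x2c1c0000, and the two remaining windows start at 0x80000000 and 0x800000000. Whether dropping that window is the right long-term fix is still the open question in this thread.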