Adding some people in the loop, maybe they could help in understanding why lack of "dma-coherent" property for a HW-coherent device could lead to unexpected / strange side effects. On 3/1/2021 5:22 PM, Sascha Hauer wrote: > Hi All, > > I am on a Layerscape LS1046a using Linux-5.11. The CAAM driver sometimes > crashes during the run-time self tests with: > >> kernel BUG at drivers/crypto/caam/jr.c:247! >> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP >> Modules linked in: >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-20210225-3-00039-g434215968816-dirty #12 >> Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (C300) board (DT) >> pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) >> pc : caam_jr_dequeue+0x98/0x57c >> lr : caam_jr_dequeue+0x98/0x57c >> sp : ffff800010003d50 >> x29: ffff800010003d50 x28: ffff8000118d4000 >> x27: ffff8000118d4328 x26: 00000000000001f0 >> x25: ffff0008022be480 x24: ffff0008022c6410 >> x23: 00000000000001f1 x22: ffff8000118d4329 >> x21: 0000000000004d80 x20: 00000000000001f1 >> x19: 0000000000000001 x18: 0000000000000020 >> x17: 0000000000000000 x16: 0000000000000015 >> x15: ffff800011690230 x14: 2e2e2e2e2e2e2e2e >> x13: 2e2e2e2e2e2e2020 x12: 3030303030303030 >> x11: ffff800011700a38 x10: 00000000fffff000 >> x9 : ffff8000100ada30 x8 : ffff8000116a8a38 >> x7 : 0000000000000001 x6 : 0000000000000000 >> x5 : 0000000000000000 x4 : 0000000000000000 >> x3 : 00000000ffffffff x2 : 0000000000000000 >> x1 : 0000000000000000 x0 : 0000000000001800 >> Call trace: >> caam_jr_dequeue+0x98/0x57c >> tasklet_action_common.constprop.0+0x164/0x18c >> tasklet_action+0x44/0x54 >> __do_softirq+0x160/0x454 >> __irq_exit_rcu+0x164/0x16c >> irq_exit+0x1c/0x30 >> __handle_domain_irq+0xc0/0x13c >> gic_handle_irq+0x5c/0xf0 >> el1_irq+0xb4/0x180 >> arch_cpu_idle+0x18/0x30 >> default_idle_call+0x3c/0x1c0 >> do_idle+0x23c/0x274 >> cpu_startup_entry+0x34/0x70 >> rest_init+0xdc/0xec >> arch_call_rest_init+0x1c/0x28 >> start_kernel+0x4ac/0x4e4 >> Code: 91392021 912c2000 d377d8c6 97f24d96 (d4210000) > > The driver iterates over the descriptors in the output ring and matches them > with the ones it has previously queued. If it doesn't find a matching > descriptor it complains with the BUG_ON() seen above. What I see sometimes is > that the address in the output ring is 0x0, the job status in this case is > 0x40000006 (meaning DECO Invalid KEY command). It seems that the CAAM doesn't > write the descriptor address to the output ring at least in some error cases. > When we don't have the descriptor address of the failed descriptor we have no > way to find it in the list of queued descriptors, thus we also can't find the > callback for that descriptor. This looks very unfortunate, anyone else seen > this or has an idea what to do about it? > > I haven't investigated yet which job actually fails and why. Of course that would > be my ultimate goal to find that out. > This looks very similar to an earlier report from Greg. He confirmed that adding "dma-coherent" property to the "crypto" DT node fixes the issue: https://lore.kernel.org/linux-crypto/74f664f5-5433-d322-4789-3c78bdb814d8@xxxxxxxxxx Patch rebased on v5.11 is at the bottom. Does it work for you too? What I don't understand (and the reason I've postponed upstreaming it) is _why_ exactly this patch is working. I would have expected that a HW-coherent device to work fine even without the "dma-coherent" DT property in the corresponding node. I've found what seems related discussions involving eSDHC, but still I am trying to figure out what's happening. I'd really appreciate a clarification on what could go wrong (e.g. interactions with SW-based cache management etc.): https://lore.kernel.org/linux-mmc/20190916171509.GG25745@xxxxxxxxxxxxxxxxxxxxx https://lore.kernel.org/lkml/20191010083503.250941866@xxxxxxxxxxxxxxxxxxx https://lore.kernel.org/linux-mmc/AM7PR04MB688507B5B4D84EB266738891F8320@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Thanks, Horia -- >8 -- Subject: [PATCH] arm64: dts: ls1046a: mark crypto engine dma coherent Crypto engine (CAAM) on LS1046A platform has support for HW coherency, mark accordingly the DT node. Signed-off-by: Horia Geantă <horia.geanta@xxxxxxx> --- arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi index 025e1f587662..6d4db3e021e8 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi @@ -325,6 +325,7 @@ ranges = <0x0 0x00 0x1700000 0x100000>; reg = <0x00 0x1700000 0x0 0x100000>; interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>; + dma-coherent; sec_jr0: jr@10000 { compatible = "fsl,sec-v5.4-job-ring",