Hi Horia,
On 10/3/20 10:00 pm, Horia Geantă wrote:
On 3/10/2020 8:43 AM, Greg Ungerer wrote:
Hi Andrey,
I am tracking down a caam driver problem, where it is dumping on startup
on a Layerscape 1046 based hardware platform. The dump typically looks
something like this:
------------[ cut here ]------------
kernel BUG at drivers/crypto/caam/jr.c:218!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-ac0 #1
Hardware name: Digi AnywhereUSB-8 (DT)
pstate: 40000005 (nZcv daif -PAN -UAO)
pc : caam_jr_dequeue+0x3f8/0x420
lr : tasklet_action_common.isra.17+0x144/0x180
sp : ffffffc010003df0
x29: ffffffc010003df0 x28: 0000000000000001
x27: 0000000000000000 x26: 0000000000000000
x25: ffffff8020aeba80 x24: 0000000000000000
x23: 0000000000000000 x22: ffffffc010ab4e51
x21: 0000000000000001 x20: ffffffc010ab4000
x19: ffffff8020a2ec10 x18: 0000000000000004
x17: 0000000000000001 x16: 6800f1f100000000
x15: ffffffc010de5000 x14: 0000000000000000
x13: ffffffc010de5000 x12: ffffffc010de5000
x11: 0000000000000000 x10: ffffff8073018080
x9 : 0000000000000028 x8 : 0000000000000000
x7 : 0000000000000000 x6 : ffffffc010a11140
x5 : ffffffc06b070000 x4 : 0000000000000008
x3 : ffffff8073018080 x2 : 0000000000000000
x1 : 0000000000000001 x0 : 0000000000000000
Call trace:
caam_jr_dequeue+0x3f8/0x420
tasklet_action_common.isra.17+0x144/0x180
tasklet_action+0x24/0x30
_stext+0x114/0x228
irq_exit+0x64/0x70
__handle_domain_irq+0x64/0xb8
gic_handle_irq+0x50/0xa0
el1_irq+0xb8/0x140
arch_cpu_idle+0x10/0x18
do_idle+0xf0/0x118
cpu_startup_entry+0x24/0x60
rest_init+0xb0/0xbc
arch_call_rest_init+0xc/0x14
start_kernel+0x3d0/0x3fc
Code: d3607c21 2a020002 aa010041 17ffff4d (d4210000)
---[ end trace ce2c4c37d2c89a99 ]---
Git bisecting this led me to commit a1cf573ee95d ("crypto: caam -
select DMA address size at runtime") as the culprit.
I came across a commit by Iuliana, 7278fa25aa0e ("crypto: caam -
do not reset pointer size from MCFGR register"). However that
doesn't fix this dumping problem for me (it does seem to occur
less often though). [NOTE: dump above generated with this
change applied].
I initially hit this dump on linux-5.4, and it also occurs on
linux-5.5 for me.
Any thoughts?
Could you try the following patch?
It worked on my side.
Unfortunately I don't think it fixes the root cause,
the device should work fine (though slower) without the property.
DMA API violations (e.g. cacheline sharing) are a good candidate.
Yep, that definitely fixes it for me. Thanks!
Regards
Greg
--- >8 ---
Subject: [PATCH] arm64: dts: ls1046a: mark crypto engine dma coherent
Crypto engine (CAAM) on LS1046A platform has support for HW coherency,
mark accordingly the DT node.
Signed-off-by: Horia Geantă <horia.geanta@xxxxxxx>
---
arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
index d4c1da3d4bde..9e8147ef1748 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
@@ -244,6 +244,7 @@
ranges = <0x0 0x00 0x1700000 0x100000>;
reg = <0x00 0x1700000 0x0 0x100000>;
interrupts = <GIC_SPI 75 IRQ_TYPE_LEVEL_HIGH>;
+ dma-coherent;
sec_jr0: jr@10000 {
compatible = "fsl,sec-v5.4-job-ring",
--
2.17.1
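
The cacheline sharing mentioned above as a likely DMA API violation can be illustrated with a small userspace sketch. This is not code from the caam driver: the struct names, the helper functions, and the 64-byte line size are all assumptions made for the example. It only shows how a device-written buffer packed next to CPU-written fields ends up in the same cache line, and how giving the buffer its own line avoids that.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; real hardware may use a different value. */
#define CACHE_LINE 64

/* Hypothetical layout: a DMA buffer packed next to CPU-owned fields. */
struct packed_layout {
    int cpu_counter;                /* written by the CPU */
    unsigned char dma_buf[16];      /* written by the device */
    int cpu_flags;                  /* written by the CPU */
};

/* Hypothetical layout: the DMA buffer starts on its own cache line,
 * and the field after it starts on the next one. */
struct padded_layout {
    int cpu_counter;
    alignas(CACHE_LINE) unsigned char dma_buf[16];
    alignas(CACHE_LINE) int cpu_flags;
};

static_assert(offsetof(struct padded_layout, dma_buf) % CACHE_LINE == 0,
              "dma_buf must start on a cache line boundary");

/* Return 1 if byte ranges [a_off, a_off + a_len) and [b_off, b_off + b_len)
 * touch at least one common cache line, 0 otherwise. Lengths must be > 0. */
static int shares_line(size_t a_off, size_t a_len, size_t b_off, size_t b_len)
{
    size_t a_first = a_off / CACHE_LINE;
    size_t a_last = (a_off + a_len - 1) / CACHE_LINE;
    size_t b_first = b_off / CACHE_LINE;
    size_t b_last = (b_off + b_len - 1) / CACHE_LINE;

    return a_first <= b_last && b_first <= a_last;
}

/* Does dma_buf share a line with either CPU-owned neighbour (packed)? */
static int packed_layout_shares(void)
{
    return shares_line(offsetof(struct packed_layout, dma_buf), 16,
                       offsetof(struct packed_layout, cpu_counter), sizeof(int)) ||
           shares_line(offsetof(struct packed_layout, dma_buf), 16,
                       offsetof(struct packed_layout, cpu_flags), sizeof(int));
}

/* Same check for the padded layout. */
static int padded_layout_shares(void)
{
    return shares_line(offsetof(struct padded_layout, dma_buf), 16,
                       offsetof(struct padded_layout, cpu_counter), sizeof(int)) ||
           shares_line(offsetof(struct padded_layout, dma_buf), 16,
                       offsetof(struct padded_layout, cpu_flags), sizeof(int));
}
```

On a non-coherent device, a layout like packed_layout is risky because cache maintenance on the line holding dma_buf also affects cpu_counter and cpu_flags. Kernel code would normally use its own alignment facilities rather than alignas; the sketch only demonstrates the layout check.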