Re: Kernel panic apparently when writing on RPMsg TTY while or after coprocessor is stopped

Hi David,

On Mon, Feb 05, 2024 at 09:35:41AM -0600, David Hess wrote:
> [ resending as plaintext so it makes it to the list ]
> 
> Experienced this kernel panic when stopping a coprocessor while the RPMsg tty was still open and being written to:
> 
> [25016.237134] Unable to handle kernel paging request at virtual address ffff800015b3a002
> [25016.245244] Mem abort info:
> [25016.248053]   ESR = 0x0000000096000007
> [25016.251824]   EC = 0x25: DABT (current EL), IL = 32 bits
> [25016.257140]   SET = 0, FnV = 0
> [25016.260216]   EA = 0, S1PTW = 0
> [25016.263363]   FSC = 0x07: level 3 translation fault
> [25016.268242] Data abort info:
> [25016.271147]   ISV = 0, ISS = 0x00000007
> [25016.274991]   CM = 0, WnR = 0
> [25016.277960] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000049c2d000
> [25016.284680] [ffff800015b3a002] pgd=10000000bffff003, p4d=10000000bffff003, pud=10000000bfffe003, pmd=1000000075692003, pte=0000000000000000
> [25016.297273] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> [25016.302859] Modules linked in: rpmsg_ctrl rpmsg_char imx_rpmsg_tty xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_addrtype iptable_filter ip_tables x_tables br_netfilter bridge stp llc mwifiex_sdio mwifiex bnep overlay cfg80211 mcp251xfd can_dev cm
> [25016.356332] CPU: 1 PID: 95780 Comm: python Tainted: G           O      5.15.129-6.4.0+git.67c3153d20ff #1-TorizonCore
> [25016.366955] Hardware name: Toradex Verdin iMX8M Mini WB on Yavia Board (DT)
> [25016.373924] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [25016.380891] pc : virtqueue_get_buf_ctx_split+0x28/0x180
> [25016.386132] lr : virtqueue_get_buf+0x30/0x40
> [25016.390411] sp : ffff800015db3a80
> [25016.393727] x29: ffff800015db3a80 x28: ffff80000a7022a0 x27: 0000000000000007
> [25016.400870] x26: ffff0000077dec00 x25: ffff00000e76c0c0 x24: ffff00000709bf00
> [25016.408015] x23: 0000000000000007 x22: 0000000000000100 x21: ffff0000014e1f40
> [25016.415162] x20: ffff0000014e1f00 x19: ffff000006c3cd00 x18: 0000000000000000
> [25016.422306] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffa5db3fb0
> [25016.429452] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> [25016.436596] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff800015db3eb0
> [25016.443742] x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff0000075c6e40
> [25016.450888] x5 : 0000000000000001 x4 : ffff800015db3ae0 x3 : ffff0000014e1f40
> [25016.458033] x2 : 0000000000000000 x1 : 00000000000002cf x0 : ffff800015b3a000
> [25016.465182] Call trace:
> [25016.467631]  virtqueue_get_buf_ctx_split+0x28/0x180
> [25016.472515]  virtqueue_get_buf+0x30/0x40
> [25016.476441]  rpmsg_send_offchannel_raw+0x44c/0x4f0
> [25016.481240]  virtio_rpmsg_send+0x28/0x34
> [25016.485167]  rpmsg_send+0x20/0x40
> [25016.488488]  rpmsgtty_write+0x54/0xb0 [imx_rpmsg_tty]

I can't find either rpmsgtty_write() or the imx_rpmsg_tty module in the kernel
tree - is this code public?

> [25016.493551]  n_tty_write+0x2c0/0x48c
> [25016.497134]  file_tty_write.constprop.0+0x130/0x294
> [25016.502016]  tty_write+0x14/0x20
> [25016.505248]  new_sync_write+0xec/0x18c
> [25016.509004]  vfs_write+0x24c/0x2b0
> [25016.512409]  ksys_write+0x6c/0x100
> [25016.515817]  __arm64_sys_write+0x1c/0x30
> [25016.519744]  invoke_syscall+0x48/0x114
> [25016.523499]  el0_svc_common.constprop.0+0xd4/0xfc
> [25016.528209]  do_el0_svc+0x28/0xa0
> [25016.531526]  el0_svc+0x28/0x80
> [25016.534589]  el0t_64_sync_handler+0xa4/0x130
> [25016.538863]  el0t_64_sync+0x1a0/0x1a4
> [25016.542533] Code: 35000700 f9403660 aa0103e4 79409261 (79400400) 
> [25016.548634] ---[ end trace bc845368ab15e73f ]---
> [25016.553257] Kernel panic - not syncing: Oops: Fatal exception
> [25016.559009] SMP: stopping secondary CPUs
> [25016.563249] Kernel Offset: disabled
> [25016.566739] CPU features: 0x0,00002001,20000846
> [25016.571276] Memory Limit: none
> [25016.574336] Rebooting in 5 seconds..
> 
> I think the simple and obvious answer is “don’t do that” - we should be able to safely ensure the RPMsg TTY is closed before attempting to stop the coprocessor. However, it would be nice if the driver handled this situation safely regardless.
> 
> This was experienced under the TorizonCore 6.4 distribution on a Toradex Verdin iMX8M Mini WB on Yavia Board with this kernel:
> 
> 5.15.129-6.4.0+git.67c3153d20ff #1-TorizonCore SMP PREEMPT Wed Sep 27 12:30:36 UTC 2023
> 
> Happy to provide more information as needed. In terms of recreating, I think it’s as simple as opening the RPMsg TTY (with receptive firmware running on the coprocessor), writing to it at a high frequency and then stopping the coprocessor until it happens. We’ve seen this panic a few times and eventually managed to capture this panic log.
> 
> Dave
> 
> --
> David K. Hess
> Founder, Data Bakery | Data-Bakery.com
> dhess@xxxxxxxxxxxxxxx | LinkedIn
> +1 214-684-5448
> 
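
The reproduction steps described above could be sketched roughly as follows.
This is only a minimal sketch under assumptions: the device node
/dev/ttyRPMSG30 and the remoteproc0 instance are hypothetical and depend on
the BSP and board setup, and the helper names are made up for illustration.
The only kernel interface relied on is the remoteproc sysfs "state" file,
which accepts "stop". Running the main section on an affected kernel may
panic the board.

```python
import os
import threading
import time

# Hypothetical paths: adjust for the actual BSP and board.
TTY_PATH = "/dev/ttyRPMSG30"
REMOTEPROC_STATE = "/sys/class/remoteproc/remoteproc0/state"


def hammer_writes(path, payload=b"ping\n", count=1000, delay=0.0):
    """Write payload to path count times and return the total bytes written."""
    total = 0
    fd = os.open(path, os.O_WRONLY)
    try:
        for _ in range(count):
            total += os.write(fd, payload)
            if delay:
                time.sleep(delay)
    finally:
        os.close(fd)
    return total


def set_remoteproc_state(state_path, state):
    """Write a state string (for example "stop") to a remoteproc state file."""
    with open(state_path, "w") as f:
        f.write(state)


if __name__ == "__main__":
    # Keep writing to the RPMsg TTY in the background...
    writer = threading.Thread(
        target=hammer_writes, args=(TTY_PATH,), kwargs={"count": 1_000_000}
    )
    writer.start()
    time.sleep(1.0)
    # ...then stop the coprocessor while the writes are still in flight.
    set_remoteproc_state(REMOTEPROC_STATE, "stop")
    writer.join()
```

Since the panic was only seen a few times, the stop may need to be repeated
(restarting the coprocessor in between) before the race is hit.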



