Hello,

On 07/02/2022 at 09:04, Tony Lindgren wrote:
> Hi,
>
> * Romain Naour <romain.naour@xxxxxxxx> [220204 13:39]:
>> It seems that the driver fail to read the UART_LCR register from
>> omap8250_set_mctrl():
>>
>> "lcr = serial_in(up, UART_LCR);"
>>
>> PC is at mem_serial_in+0x2c/0x30
>> LR is at omap8250_set_mctrl+0x48/0xb0
>>
>> The problem only occurs with a -rt kernel, I tried with several kernel version:
>> 5.10-rt, 5.15-rt and 5.17-rt.
>>
>> I'm not able to reproduce the issue with a standard kernel.
>
> Interesting, what's the exception you get with the -rt kernel? Is it an
> unhandled external abort or something else?

It is an "asynchronous external abort":

Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = bfdf645d
[00000000] *pgd=862cf003, *pmd=27df1b003
Internal error: : 1211 [#1] PREEMPT_RT SMP ARM
Modules linked in: cmac algif_hash cbc aes_arm_bs crypto_simd cryptd algif_skcipher af_alg usbhid prueth xhci_plat_hcd irq_pruss_intc xhci_hcd usbcore pru_rproc icss_iep pvrsrvkm(O) omap_wdt phy_omap_usb2 ahci_platform libahci_platform omap_aes_driver pruss libahci libata dwc3 roles udc_core usb_common wl18xx wlcore mac80211 ti_vpe ti_sc ti_csc ti_vpdma dwc3_omap wlcore_sdio hci_uart btbcm bluetooth omap_des ecdh_generic libdes ecc omap_crypto omap_sham crypto_engine omap_remoteproc sch_fq_codel
CPU: 0 PID: 377 Comm: gpsmon Tainted: G W O 5.10.87-rt59+ #97
Hardware name: Generic DRA74X (Flattened Device Tree)
PC is at omap8250_set_mctrl+0x38/0xa0
LR is at omap8250_set_mctrl+0x38/0xa0
pc : [<c065f388>] lr : [<c065f388>] psr: 60000013
sp : c6327ca0 ip : c6327c74 fp : c4754500
r10: c6327f10 r9 : 00000000 r8 : c22698c8
r7 : ffffe000 r6 : c205ac40 r5 : 00000006 r4 : c12eccd8
r3 : fa06e00c r2 : 00000002 r1 : 00000003 r0 : 00000000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 30c5387d Table: 859817c0 DAC: fffffffd
Process gpsmon (pid: 377, stack limit = 0xaa83ac51)
Stack: (0xc6327ca0 to 0xc6328000)

Some time after this trace, the kernel panics due to an "exception in interrupt":

[<c024d894>] (__task_rq_lock) from [<c0253148>] (rt_mutex_setprio+0x54/0x4b8)
[<c0253148>] (rt_mutex_setprio) from [<c0276374>] (task_blocks_on_rt_mutex+0x2a4/0x374)
[<c0276374>] (task_blocks_on_rt_mutex) from [<c0ac7888>] (rt_spin_lock_slowlock_locked+0xb8/0x2c4)
[<c0ac7888>] (rt_spin_lock_slowlock_locked) from [<c0ac7ae8>] (rt_spin_lock_slowlock+0x54/0x84)
[<c0ac7ae8>] (rt_spin_lock_slowlock) from [<c0ac9524>] (rt_spin_lock+0x50/0x5c)
[<c0ac9524>] (rt_spin_lock) from [<c0661034>] (omap8250_irq+0x48/0x350)
[<c0661034>] (omap8250_irq) from [<c027e490>] (irq_forced_thread_fn+0x28/0x98)
[<c027e490>] (irq_forced_thread_fn) from [<c027e830>] (irq_thread+0x12c/0x214)
[<c027e830>] (irq_thread) from [<c024704c>] (kthread+0x18c/0x1dc)
[<c024704c>] (kthread) from [<c0200140>] (ret_from_fork+0x14/0x34)

I guess it's a consequence of the previous issue in omap8250_set_mctrl().

>
>> While looking at the git history, I noticed this commit [3] about "flakey idling
>> of uarts and stop using swsup_sidle_act".
>>
>> So I removed the SYSC_QUIRK for uart IP revision 0x50411e03 and it fixed my issue.
>
> Hmm.
>
>> Is the SYSC_QUIRK for omap4 still needed ? Is it safe to remove it ?
>> It seems this issue was introduced while dropping the legacy platform data
>> (between 4.19 and 5.4 kernels).
>
> AFAIK it's still needed, but maybe we can disable it for am57xx though.

Since I removed the quirk, I have seen other issues while using the serial
interface.

I once got a backtrace related to omap_8250_rx_dma_flush with
CONFIG_SERIAL_8250_DMA enabled:

WARNING: CPU: 0 PID: 449 at drivers/tty/serial/8250/8250_omap.c:916 omap_8250_rx_dma_flush+0xec/0xf4
Modules linked in: cmac algif_hash aes_arm aes_generic algif_skcipher af_alg usbhid xhci_plat_hcd xhci_hcd usbcore irq_pruss_intc prueth pru_rproc icss_iep omap_wdt pvrsrvkm(O) phy_omap_usb2 ahci_platform libahci_platform omap_aes_driver pruss libahci libata dwc3 roles udc_core usb_common wl18xx wlcore mac80211 sha256_generic libsha256 sha256_arm cfg80211 ti_vpe ti_sc ti_csc ti_vpdma dwc3_omap wlcore_sdio hci_uart btbcm bluetooth omap_hdq omap_des ecdh_generic omap_crypto ecc wire libdes libaes omap_sham crypto_engine sch_fq_codel
CPU: 0 PID: 449 Comm: irq/122-4806e00 Tainted: G O 5.10.87-rt59+ #91
Hardware name: Generic DRA74X (Flattened Device Tree)
[<c020e19c>] (unwind_backtrace) from [<c0209ef0>] (show_stack+0x10/0x14)
[<c0209ef0>] (show_stack) from [<c0b064b8>] (dump_stack+0x98/0xac)
[<c0b064b8>] (dump_stack) from [<c0b02410>] (__warn+0xcc/0xe4)
[<c0b02410>] (__warn) from [<c0b0248c>] (warn_slowpath_fmt+0x64/0xc8)
[<c0b0248c>] (warn_slowpath_fmt) from [<c06ae5c4>] (omap_8250_rx_dma_flush+0xec/0xf4)
[<c06ae5c4>] (omap_8250_rx_dma_flush) from [<c06b0610>] (omap8250_irq+0x34c/0x350)
[<c06b0610>] (omap8250_irq) from [<c02836a0>] (irq_forced_thread_fn+0x28/0x98)
[<c02836a0>] (irq_forced_thread_fn) from [<c0283a40>] (irq_thread+0x12c/0x214)
[<c0283a40>] (irq_thread) from [<c0248d94>] (kthread+0x18c/0x1dc)
[<c0248d94>] (kthread) from [<c0200140>] (ret_from_fork+0x14/0x34)
Exception stack(0xc38b1fb0 to 0xc38b1ff8)
1fa0:                                     00000000 00000000 00000000 00000000
1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1fe0: 00000000 00000000 00000000 00000000 00000013 00000000

To ease investigation, I disabled CONFIG_SERIAL_8250_DMA for now.

I noticed other side effects when opening the serial interface:

omap8250 4806e000.serial: Errata i202: timedout 0
cpsw-switch 48484000.switch: cpts: obtain a time stamp timeout
sched: RT throttling activated
thermal thermal_zone5: failed to read out thermal zone (-121)

It takes several seconds to open the serial interface, so something hangs
somewhere in the kernel.

Maybe there is something wrong with the smart-standby or smart-idle feature in
the UART IP? I'm not sure.

Are you able to reproduce it? Maybe on an IDK574 or a Beaglebone-AI board?

Best regards,
Romain

>
> Regards,
>
> Tony
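
P.S. In case it helps to put a number on the "several seconds" delay when opening the port, here is a minimal sketch of a userspace timing helper. It is only an illustration: the function name time_open is hypothetical, and the device node has to be supplied by whoever runs it, since the report above only names the UART by its address (4806e000.serial), not by its /dev entry.

```python
import os
import time


def time_open(path):
    """Return the number of seconds spent in open() for a device node.

    Hypothetical helper, not part of any existing tool.
    """
    start = time.monotonic()
    # O_NOCTTY: do not let the port become our controlling terminal.
    # O_NONBLOCK: do not wait for carrier detect, so the measured time
    # is the open() call itself rather than a wait on the modem lines.
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
    elapsed = time.monotonic() - start
    os.close(fd)
    return elapsed
```

Calling time_open() a few times in a row on the affected tty node, with and without the SYSC_QUIRK change, should show whether the delay is consistent and whether it lines up with the "Errata i202" and "RT throttling" messages in the log.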