Huichun Feng <foxhoundsk.tw@xxxxxxxxx> wrote on Tue, Mar 4, 2025 at 2:32 PM:
>
> Hi,
>
> I'm on a Xilinx UltraScale+ MPSoC SoC, and the kernel is v6.6.
>
> It's a heterogeneous SoC including:
>
> - Cortex-A53*4 (APU, Application Processing Unit)
> - Cortex-R5*2 (RPU, Real-time Processing Unit)
> - MicroBlaze based platform management unit (PMU)
> - MicroBlaze based configuration security unit (CSU)
>
> The SoC features a facility called Xilinx Peripheral Protection Unit
> (XPPU), which prevents unintended access of GPIO (and the likes) from
> particular processing units.
>
> In my case, I found that Linux running on APU (A53 cores) attempts to
> probe the GPIO used by the RPU (R5 cores), which requires the RPU to
> do GPIO init again after the probe. Given this, I employ XPPU to
> prevent Linux from accessing the GPIO [0], which seemingly works since
> Linux then panic'd after the provisioning of XPPU. Following is the
> panic message:
>
> [ 3.627182] SError Interrupt on CPU0, code 0x00000000bf000002 -- SError
> [ 3.627190] CPU: 0 PID: 32 Comm: kworker/u9:0 Not tainted 6.6.40-xilinx-g2b7f6f70a62a #1
> [ 3.627197] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
> [ 3.627201] Workqueue: events_unbound deferred_probe_work_func
> [ 3.627216] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 3.627223] pc : zynq_gpio_probe+0x1fc/0x3b4
> [ 3.627232] lr : zynq_gpio_probe+0x190/0x3b4
> [ 3.627238] sp : ffff8000819d3b60
> [ 3.627240] x29: ffff8000819d3b60 x28: ffff0008001dd0c0 x27: ffff000800011498
> [ 3.627250] x26: ffff0008001dd100 x25: ffff000800011400 x24: ffff00080372b0c0
> [ 3.627258] x23: ffff0008000fcc00 x22: 0000000000000001 x21: ffff0008000fcc10
> [ 3.627266] x20: ffff000802ea6880 x19: ffff000802ea6940 x18: ffffffffffffffff
> [ 3.627275] x17: ffff000800134c00 x16: ffff000800d3e000 x15: ffff8000819d3510
> [ 3.627284] x14: ffff00080002791c x13: ffff80008172b520 x12: 0000000000000019
> [ 3.627292] x11: ffff80008112cba0 x10: ffff84008349feaf x9 : 0000000000000028
> [ 3.627301] x8 : ffff00080372b120 x7 : 0000000000000000 x6 : 00000000552478d3
> [ 3.627309] x5 : 00000000ffffffff x4 : ffff800081d54000 x3 : ffff8000806a6d1c
> [ 3.627317] x2 : 0000000000000000 x1 : ffff800081d54354 x0 : ffff0008000fcc10
> [ 3.627326] Kernel panic - not syncing: Asynchronous SError Interrupt
> [ 3.627330] CPU: 0 PID: 32 Comm: kworker/u9:0 Not tainted 6.6.40-xilinx-g2b7f6f70a62a #1
> [ 3.627335] Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
> [ 3.627338] Workqueue: events_unbound deferred_probe_work_func
> [ 3.627346] Call trace:
> [ 3.627349]  dump_backtrace+0x90/0xe8
> [ 3.627360]  show_stack+0x18/0x24
> [ 3.627369]  dump_stack_lvl+0x48/0x60
> [ 3.627379]  dump_stack+0x18/0x24
> [ 3.627387]  panic+0x314/0x370
> [ 3.627394]  nmi_panic+0x8c/0x90
> [ 3.627401]  arm64_serror_panic+0x6c/0x78
> [ 3.627407]  do_serror+0x28/0x68
> [ 3.627413]  el1h_64_error_handler+0x30/0x48
> [ 3.627423]  el1h_64_error+0x64/0x68
> [ 3.627429]  zynq_gpio_probe+0x1fc/0x3b4
> [ 3.627435]  platform_probe+0x68/0xc4
> [ 3.627443]  really_probe+0x148/0x2b0
> [ 3.627449]  __driver_probe_device+0x78/0x12c
> [ 3.627456]  driver_probe_device+0xd8/0x15c
> [ 3.627462]  __device_attach_driver+0xb8/0x134
> [ 3.627468]  bus_for_each_drv+0x88/0xe8
> [ 3.627473]  __device_attach+0xa0/0x190
> [ 3.627480]  device_initial_probe+0x14/0x20
> [ 3.627486]  bus_probe_device+0xac/0xb0
> [ 3.627492]  deferred_probe_work_func+0x88/0xc0
> [ 3.627498]  process_one_work+0x138/0x28c
> [ 3.627506]  worker_thread+0x2a4/0x4bc
> [ 3.627512]  kthread+0xe0/0xe4
> [ 3.627519]  ret_from_fork+0x10/0x20
> [ 3.627527] SMP: stopping secondary CPUs
> [ 3.627533] Kernel Offset: disabled
> [ 3.627535] CPU features: 0x0,00000008,00020000,0000420b
> [ 3.627540] Memory Limit: none
> [ 3.885271] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
>
> At the moment of the panic, which is just after the employment of the
> XPPU, I thought that I should disable the GPIO in devicetree.
> However, after the GPIO got disabled [1], the panic still present. Is it
> because that, in this case, I can't simply disabling the GPIO through
> adding 'status="disable";' ?

It turns out that even removing the heartbeat GPIO node doesn't prevent
the GPIO of interest from being probed, that is, its registers still get
accessed. Instead, the whole GPIO module has to be disabled like this:

 &gpio {
-        status = "okay";
+        status = "disabled";
         pinctrl-names = "default";
         pinctrl-0 = <&pinctrl_gpio_default>;
 };

in which "gpio" is defined in a dtsi as:

gpio: gpio@ff0a0000 {
        compatible = "xlnx,zynqmp-gpio-1.0";
        status = "disabled";
        #gpio-cells = <0x2>;
        gpio-controller;
        interrupt-parent = <&gic>;
        interrupts = <GIC_SPI 16 IRQ_TYPE_LEVEL_HIGH>;
        interrupt-controller;
        #interrupt-cells = <2>;
        reg = <0x0 0xff0a0000 0x0 0x1000>;
        power-domains = <&zynqmp_firmware PD_GPIO>;
};

Although turning off the whole GPIO module stops the kernel panic, other
peripherals now emit warnings like:

[ 24.559979] ------------[ cut here ]------------
[ 24.570039] i2c1_ref already disabled
[ 24.573722] WARNING: CPU: 0 PID: 62 at /drivers/clk/clk.c:1181 clk_core_disable+0xa4/0xac
[ 24.581904] Modules linked in: cfg80211 uio_pdrv_genirq openvswitch nsh nf_nat
[ 24.589137] CPU: 0 PID: 62 Comm: kworker/u9:2 Tainted: G W 6.6.40-xilinx-g2b7f6f70a62a-dirty #1
[ 24.599223] Hardware name: ZynqMP ZCU102 RevA (DT)
[ 24.604005] Workqueue: events_unbound deferred_probe_work_func
[ 24.609838] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 24.616799] pc : clk_core_disable+0xa4/0xac
[ 24.620983] lr : clk_core_disable+0xa4/0xac
[ 24.625158] sp : ffff800081cbbb10
[ 24.628465] x29: ffff800081cbbb10 x28: ffff0008018ef840 x27: 0000000000000000
[ 24.635602] x26: ffff0008018ef880 x25: ffff8000816f0000 x24: 00000000000000ff
[ 24.642738] x23: ffff000807431d08 x22: ffff0008000f9410 x21: 0000000000000000
[ 24.649873] x20: ffff000801e23000 x19: ffff000801e23000 x18: 0000000000000006
[ 24.657009] x17: 0000000000000016 x16: ffff800081cbbb30 x15: 0720072007200720
[ 24.664145] x14: 0720072007200720 x13: ffff8000816f31b0 x12: 000000000000088b
[ 24.671280] x11: 00000000000002d9 x10: ffff80008171f1b0 x9 : ffff8000816f31b0
[ 24.678416] x8 : 00000000fffff7ff x7 : ffff80008171f1b0 x6 : 0000000000000001
[ 24.685552] x5 : ffff00087f75aa88 x4 : 0000000000000000 x3 : 0000000000000027
[ 24.692687] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000800afb700
[ 24.699823] Call trace:
[ 24.702261]  clk_core_disable+0xa4/0xac
[ 24.706090]  clk_disable+0x30/0x4c
[ 24.709492]  cdns_i2c_probe+0x3f4/0x534
[ 24.713321]  platform_probe+0x68/0xc4
[ 24.716975]  really_probe+0x148/0x2b0
[ 24.720630]  __driver_probe_device+0x78/0x12c
[ 24.724979]  driver_probe_device+0xd8/0x15c
[ 24.729155]  __device_attach_driver+0xb8/0x134
[ 24.733591]  bus_for_each_drv+0x88/0xe8
[ 24.737419]  __device_attach+0xa0/0x190
[ 24.741247]  device_initial_probe+0x14/0x20
[ 24.745423]  bus_probe_device+0xac/0xb0
[ 24.749251]  deferred_probe_work_func+0x88/0xc0
[ 24.753774]  process_one_work+0x138/0x28c
[ 24.757784]  worker_thread+0x2a4/0x4bc
[ 24.761525]  kthread+0xe0/0xe4
[ 24.764572]  ret_from_fork+0x10/0x20
[ 24.768141] ---[ end trace 0000000000000000 ]---
[ 24.773934] platform ina226-u76: deferred probe pending
[ 24.784636] platform ina226-u77: deferred probe pending
[ 24.789876] platform ina226-u78: deferred probe pending

These peripherals are currently unused, and I believe that removing them
from the devicetree would stop the warnings. Nevertheless, I'm trying to
figure out whether the configuration granularity can be as fine as
per-GPIO-pin registers, as this would bring more versatility to the
system. If anyone knows of a way to isolate the GPIO at such a
fine-grained level, please let me know.

Thanks!
Fox

>
> If there's any RTFM thing I should do beforehand, please provide me
> some keywords or in-tree document names.
>
> [0] I believe this is simply an initial step since I would also need
> to teach Linux not to use/probe this particular GPIO.
> [1] I can assure that the GPIO does get disabled since it was a
> heartbeat LED for Linux, which no longer beats/flashes after the
> devicetree disable thing.
>
> Thanks!
> Fox
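
For reference, the two devicetree-level knobs that work at a finer
granularity than the whole controller could be sketched as below. This is
only an untested illustration under stated assumptions: the "leds" node
layout, the "heartbeat-led" name and pin number 23 are assumed for the
example and are not taken from this thread, and whether the
"xlnx,zynqmp-gpio-1.0" binding accepts the generic "gpio-reserved-ranges"
property has not been verified here.

```dts
#include <dt-bindings/gpio/gpio.h>

/* Hypothetical board-level fragment; node names and pin 23 are assumptions. */
/ {
        leds {
                compatible = "gpio-leds";

                heartbeat-led {
                        label = "heartbeat";
                        /* Assumed pin number, for illustration only. */
                        gpios = <&gpio 23 GPIO_ACTIVE_HIGH>;
                        linux,default-trigger = "heartbeat";
                        /*
                         * Disables only this consumer: the LED stops
                         * flashing, but the GPIO controller itself is
                         * still probed.
                         */
                        status = "disabled";
                };
        };
};

&gpio {
        status = "okay";
        /*
         * Generic gpiolib property, <first-line count> pairs: asks Linux
         * not to hand out line 23 (assumed) to any consumer.
         */
        gpio-reserved-ranges = <23 1>;
};
```

Neither knob is expected to stop the controller's probe routine from
touching controller-wide registers, so with the XPPU blocking the GPIO
address range the SError above would presumably still occur. Only
disabling the whole &gpio node avoided the panic in the test described in
this mail.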