Hi Krzysztof,

(CC corrected)

This patch is now commit 1ff54f4cbaed9ec6 ("PCI: dwc: Add debugfs based
Silicon Debug support for DWC") in pci/next (next-20250304).

On Mon, 3 Mar 2025 at 20:47, Krzysztof Wilczyński <kw@xxxxxxxxx> wrote:
> [...]
> > > +int dwc_pcie_debugfs_init(struct dw_pcie *pci)
> > > +{
> > > +	char dirname[DWC_DEBUGFS_BUF_MAX];
> > > +	struct device *dev = pci->dev;
> > > +	struct debugfs_info *debugfs;
> > > +	struct dentry *dir;
> > > +	int ret;
> > > +
> > > +	/* Create main directory for each platform driver */
> > > +	snprintf(dirname, DWC_DEBUGFS_BUF_MAX, "dwc_pcie_%s", dev_name(dev));
> > > +	dir = debugfs_create_dir(dirname, NULL);
> > > +	debugfs = devm_kzalloc(dev, sizeof(*debugfs), GFP_KERNEL);
> > > +	if (!debugfs)
> > > +		return -ENOMEM;
> > > +
> > > +	debugfs->debug_dir = dir;
> > > +	pci->debugfs = debugfs;
> > > +	ret = dwc_pcie_rasdes_debugfs_init(pci, dir);
> > > +	if (ret)
> > > +		dev_dbg(dev, "RASDES debugfs init failed\n");
> >
> > What will happen if ret != 0? still return 0?

And that is exactly what happens on Gray Hawk Single with R-Car V4M:
dw_pcie_find_rasdes_capability() returns NULL, causing
dwc_pcie_rasdes_debugfs_init() to return -ENODEV.

> Given that callers of dwc_pcie_debugfs_init() check for errors,

Debugfs issues should never be propagated upstream!

> this probably should correctly bubble up any failure coming from
> dwc_pcie_rasdes_debugfs_init().
>
> I made updates to the code directly on the current branch, have a look:

So while applying, you changed this like:

            ret = dwc_pcie_rasdes_debugfs_init(pci, dir);
    -       if (ret)
    -               dev_dbg(dev, "RASDES debugfs init failed\n");
    +       if (ret) {
    +               dev_err(dev, "failed to initialize RAS DES debugfs\n");
    +               return ret;
    +       }

            return 0;

Hence this is now a fatal error, causing the probe to fail.
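
To illustrate the point about not propagating debugfs errors, here is a
minimal userspace sketch of the pattern. It is not the driver code and not a
proposed patch: struct mock_pcie and all mock_* functions are hypothetical
stand-ins for struct dw_pcie, dwc_pcie_rasdes_debugfs_init(),
dwc_pcie_debugfs_init() and the host init path. The debugfs helper reports
the failure and the caller carries on regardless:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for struct dw_pcie; only what the sketch needs. */
struct mock_pcie {
	int has_rasdes;		/* 0 models a missing RAS DES capability */
	int debugfs_ready;	/* set once the debugfs entries "exist" */
};

/* Models dwc_pcie_rasdes_debugfs_init(): -ENODEV without the capability. */
static int mock_rasdes_debugfs_init(struct mock_pcie *pci)
{
	if (!pci->has_rasdes)
		return -ENODEV;

	pci->debugfs_ready = 1;
	return 0;
}

/* Debugfs is optional: report the failure, but do not return it. */
static void mock_debugfs_init(struct mock_pcie *pci)
{
	int ret = mock_rasdes_debugfs_init(pci);

	if (ret)
		fprintf(stderr, "RAS DES debugfs init failed: %d\n", ret);
}

/* Models the host init path: it succeeds with or without debugfs. */
static int mock_host_init(struct mock_pcie *pci)
{
	mock_debugfs_init(pci);
	return 0;
}
```

With this shape, a platform without the capability just ends up without the
debugfs files, and nothing on the probe path needs to be unwound.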

Unfortunately something fails during cleanup:

    pcie-rcar-gen4 e65d0000.pcie: failed to initialize RAS DES debugfs
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 36 at kernel/irq/irqdomain.c:393 irq_domain_remove+0xa8/0xb0
    CPU: 3 UID: 0 PID: 36 Comm: kworker/u16:1 Not tainted 6.14.0-rc1-arm64-renesas-00134-g12c8c1363538 #2884
    Hardware name: Renesas Gray Hawk Single board based on r8a779h0 (DT)
    Workqueue: async async_run_entry_fn
    pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : irq_domain_remove+0xa8/0xb0
    lr : irq_domain_remove+0x2c/0xb0
    sp : ffff8000819b3b10
    x29: ffff8000819b3b10 x28: 0000000000000000 x27: 0000000000000000
    x26: ffff00044011d800 x25: ffff80008053294c x24: ffff000441740400
    x23: ffff0004413a30f0 x22: ffff0004413a3130 x21: ffff0004413a3130
    x20: ffff8000815c0ec8 x19: ffff0004415f8240 x18: 00000000ffffffff
    x17: 6775626564205345 x16: 0000000000000020 x15: ffff8000819b37b0
    x14: 0000000000000004 x13: ffff8000813e9dd8 x12: 0000000000000000
    x11: ffff0004404b6448 x10: ffff800080e85400 x9 : 1fffe00088334301
    x8 : 0000000000000001 x7 : ffff0004419a1800 x6 : ffff0004419a1808
    x5 : ffff000441349030 x4 : fffffffffffffdc1 x3 : 0000000000000000
    x2 : ffff0004403e0000 x1 : 0000000000000000 x0 : ffff00044134f630
    Call trace:
     irq_domain_remove+0xa8/0xb0 (P)
     dw_pcie_host_init+0x394/0x710
     rcar_gen4_pcie_probe+0x104/0x2f8
     platform_probe+0x64/0xbc
     really_probe+0xb8/0x294
     __driver_probe_device+0x74/0x124
     driver_probe_device+0x3c/0x158
     __device_attach_driver+0xd4/0x154
     bus_for_each_drv+0x84/0xe0
     __device_attach_async_helper+0xac/0xd0
     async_run_entry_fn+0x30/0xd8
     process_one_work+0x144/0x280
     worker_thread+0x2c4/0x3cc
     kthread+0x128/0x1e0
     ret_from_fork+0x10/0x20
    ---[ end trace 0000000000000000 ]---

Worse, the PCI bus is still registered, so running "lspci" causes an
Oops:

    Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004
    Mem abort info:
      ESR = 0x0000000096000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      FSC = 0x04: level 0 translation fault
    Data abort info:
      ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
      CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    user pgtable: 4k pages, 48-bit VAs, pgdp=0000000483b53000
    [0000000000000004] pgd=0000000000000000, p4d=0000000000000000
    Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    CPU: 3 UID: 0 PID: 707 Comm: lspci Tainted: G        W          6.14.0-rc1-arm64-renesas-00134-g12c8c1363538 #2884
    Tainted: [W]=WARN
    Hardware name: Renesas Gray Hawk Single board based on r8a779h0 (DT)
    pstate: 204000c5 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    pc : pci_generic_config_read+0x34/0xac
    lr : pci_generic_config_read+0x20/0xac
    sp : ffff8000825cbbf0
    x29: ffff8000825cbbf0 x28: ffff0004491c4b84 x27: 0000000000000004
    x26: 000000000000000f x25: ffff0004491c4b80 x24: 0000000000000040
    x23: 0000000000000040 x22: ffff8000825cbc64 x21: ffff8000816fb4f8
    x20: ffff8000825cbc14 x19: 0000000000000004 x18: 0000000000000000
    x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
    x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
    x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
    x8 : 0000000000000000 x7 : ffff000442c653c0 x6 : ffff8000805163d0
    x5 : ffff8000804f3334 x4 : ffff8000825cbc14 x3 : ffff800080531990
    x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000004
    Call trace:
     pci_generic_config_read+0x34/0xac (P)
     pci_user_read_config_dword+0x78/0x118
     pci_read_config+0xe4/0x29c
     sysfs_kf_bin_read+0x8c/0x9c
     kernfs_fop_read_iter+0x9c/0x19c
     vfs_read+0x24c/0x330
     __arm64_sys_pread64+0xac/0xc8
     invoke_syscall+0x44/0x100
     el0_svc_common.constprop.0+0x3c/0xd4
     do_el0_svc+0x18/0x20
     el0_svc+0x24/0xa8
     el0t_64_sync_handler+0x104/0x130
     el0t_64_sync+0x154/0x158
    Code: 7100067f 540002a0 71000a7f 54000160 (b9400000)
    ---[ end trace 0000000000000000 ]---
    note: lspci[707] exited with irqs disabled
    note: lspci[707] exited with preempt_count 1

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds