On Sun, 2023-12-03 at 23:35 -0800, Marc MERLIN wrote:
> So, I thought that maybe my custom built kernel had options that somehow
> made P17 unhappy, and went to a stock debian kernel.
> It's not really looking better with that kernel unfortunately :-/
>
> Still seems unhappy with networking, first wireless and then ethtool.
> Adding wireless lists to Cc just in case

Well clearly something is not unlocking the RTNL, but digging through
the below I only found places that want to acquire the RTNL and wait
forever on it (including wireless), but none that actually got stuck
while having it acquired already.

Actually ... no that's wrong. I can:

> > [ 363.945427] INFO: task powertop:6279 blocked for more than 120 seconds.
> > [ 363.945446] Tainted: G U 6.6.3-amd64-preempt-sysrq-20220227 #4
> > [ 363.945452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 363.945456] task:powertop state:D stack:0 pid:6279 ppid:6267 flags:0x00004002
> > [ 363.945468] Call Trace:
> > [ 363.945473] <TASK>
> > [ 363.945481] __schedule+0xba0/0xc05
> > [ 363.945497] schedule+0x95/0xce
> > [ 363.945504] schedule_preempt_disabled+0x15/0x22
> > [ 363.945511] __mutex_lock.constprop.0+0x18b/0x291
> > [ 363.945520] ? __pfx_pci_pm_runtime_resume+0x40/0x40
> > [ 363.945531] igc_resume+0x18b/0x1ca [igc 1a96e277f8878a2a3c9599226acd0eeb7de577b7]

this is trying to acquire the RTNL, by looking at the code

> > [ 363.945566] __rpm_callback+0x7a/0xe7
> > [ 363.945578] rpm_callback+0x35/0x64
> > [ 363.945587] ? __pfx_pci_pm_runtime_resume+0x40/0x40
> > [ 363.945592] rpm_resume+0x342/0x44a
> > [ 363.945600] ? __kmem_cache_alloc_node+0x123/0x154
> > [ 363.945614] __pm_runtime_resume+0x5a/0x7a
> > [ 363.945624] dev_ethtool+0x15a/0x24e7

but this already holds it

So looks like bug in the 'igc' driver wrt. runtime PM locking.

johannes
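
For anyone following along, the pattern in the trace above (dev_ethtool already holds the RTNL, runtime resume then lands in igc_resume, which tries to take it again) can be reproduced in miniature in userspace. This is only a rough sketch and not the igc code: every fake_* name is made up for illustration, and an error-checking pthread mutex stands in for the RTNL so that the second acquisition returns EDEADLK instead of sleeping forever the way a kernel mutex does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the RTNL. An error-checking mutex reports
 * EDEADLK when its owner locks it again; a kernel mutex would block. */
static pthread_mutex_t fake_rtnl;

static void fake_rtnl_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&fake_rtnl, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);
}

/* Hypothetical resume callback: takes the lock unconditionally,
 * without knowing whether its caller already holds it. */
static int fake_resume(void)
{
	int rc = pthread_mutex_lock(&fake_rtnl);

	if (rc != 0)
		return rc;	/* EDEADLK if this thread is already the owner */

	/* ... the device would be brought back up here ... */

	pthread_mutex_unlock(&fake_rtnl);
	return 0;
}

/* Hypothetical ioctl path: holds the lock, then triggers a resume. */
static int fake_ethtool_path(void)
{
	int rc;

	pthread_mutex_lock(&fake_rtnl);
	rc = fake_resume();	/* re-enters the lock on the same thread */
	pthread_mutex_unlock(&fake_rtnl);
	return rc;
}
```

After fake_rtnl_init(), calling fake_resume() on its own succeeds, while fake_ethtool_path() gets EDEADLK back from the nested lock attempt. In the kernel there is no such error return, so the task just sits in state D as in the powertop trace.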