> Subject: Re: Network do not works with linux >= 6.1.2. Issue bisected to
> "425c9bd06b7a70796d880828d15c11321bdfb76d" (RDMA/irdma: Report the
> correct link speed)
>
> On Fri, Jan 06, 2023 at 08:55:29AM +0100, Jaroslav Pulchart wrote:
> > [ 257.967099] task:NetworkManager state:D stack:0 pid:3387 ppid:1 flags:0x00004002
> > [ 257.975446] Call Trace:
> > [ 257.977901] <TASK>
> > [ 257.980004] __schedule+0x1eb/0x630
> > [ 257.983498] schedule+0x5a/0xd0
> > [ 257.986641] schedule_timeout+0x11d/0x160
> > [ 257.990654] __wait_for_common+0x90/0x1e0
> > [ 257.994666] ? usleep_range_state+0x90/0x90
> > [ 257.998854] __flush_workqueue+0x13a/0x3f0
> > [ 258.002955] ? __kernfs_remove.part.0+0x11e/0x1e0
> > [ 258.007661] ib_cache_cleanup_one+0x1c/0xe0 [ib_core]
> > [ 258.012721] __ib_unregister_device+0x62/0xa0 [ib_core]
> > [ 258.017959] ib_unregister_device+0x22/0x30 [ib_core]
> > [ 258.023024] irdma_remove+0x1a/0x60 [irdma]
> > [ 258.027223] auxiliary_bus_remove+0x18/0x30
> > [ 258.031414] device_release_driver_internal+0x1aa/0x230
> > [ 258.036643] bus_remove_device+0xd8/0x150
> > [ 258.040654] device_del+0x18b/0x3f0
> > [ 258.044149] ice_unplug_aux_dev+0x42/0x60 [ice]
>
> We talked about this already - wasn't it on this series?

This is yet another path (when ice ports are added to a bond) I believe
where the RDMA aux device is removed holding the RTNL lock. It's being
exposed now with this recent irdma patch - 425c9bd06b7a, causing a
deadlock.

ice_lag_event_handler [rtnl_lock]
 ->ice_lag_changeupper_event
  ->ice_unplug_aux_dev
   ->irdma_remove
    ->ib_unregister_device
     ->ib_cache_cleanup_one
      ->flush_workqueue(ib)
       ->irdma_query_port
        ->ib_get_eth_speed [rtnl_lock]

Previous discussion was on ethtool channel config change,
https://lore.kernel.org/linux-rdma/Y5ES3kmYSINlAQhz@x130/, which David E.
is taking care of.

We are working on a patch for this issue.

>
> Don't hold locks when removing aux devices.
>
> > [ 258.048707] ice_lag_changeupper_event+0x287/0x2a0 [ice]
> > [ 258.054038] ice_lag_event_handler+0x51/0x130 [ice]
> > [ 258.058930] raw_notifier_call_chain+0x41/0x60
> > [ 258.063381] __netdev_upper_dev_link+0x1a0/0x370
> > [ 258.068008] netdev_master_upper_dev_link+0x3d/0x60
> > [ 258.072886] bond_enslave+0xd16/0x16f0 [bonding]
> > [ 258.077517] ? nla_put+0x28/0x40
> > [ 258.080756] do_setlink+0x26c/0xc10
> > [ 258.084249] ? avc_alloc_node+0x27/0x180
> > [ 258.088173] ? __nla_validate_parse+0x141/0x190
> > [ 258.092708] __rtnl_newlink+0x53a/0x620
> > [ 258.096549] rtnl_newlink+0x44/0x70
>
> Especially not the rtnl.
>
> Jason
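
For anyone following along, the ordering problem in the trace can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the ice/irdma code and not the patch being worked on: a plain pthread mutex stands in for the RTNL lock, pthread_join() stands in for flush_workqueue(), and the names query_port_work() and remove_aux_dev() are hypothetical. The worker uses trylock so that the demo reports the conflict instead of hanging the way the real blocking lock does.

```c
/* Userspace model only. "rtnl" is an ordinary mutex standing in for the
 * kernel RTNL lock; all function names below are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Models the queued work (irdma_query_port -> ib_get_eth_speed), which
 * needs rtnl. trylock reports the conflict instead of blocking forever. */
static void *query_port_work(void *arg)
{
    bool *got_lock = arg;

    *got_lock = (pthread_mutex_trylock(&rtnl) == 0);
    if (*got_lock)
        pthread_mutex_unlock(&rtnl);
    return NULL;
}

/* Models the removal path. Returns true if the work item could take rtnl,
 * i.e. if the real code would have made progress instead of deadlocking. */
static bool remove_aux_dev(bool hold_rtnl_across_flush)
{
    pthread_t worker;
    bool got_lock = false;
    int ret;

    pthread_mutex_lock(&rtnl);          /* the notifier runs under rtnl */
    if (!hold_rtnl_across_flush)
        pthread_mutex_unlock(&rtnl);    /* drop the lock before flushing */

    ret = pthread_create(&worker, NULL, query_port_work, &got_lock);
    assert(ret == 0);
    pthread_join(worker, NULL);         /* "flush": wait for the work item */

    if (hold_rtnl_across_flush)
        pthread_mutex_unlock(&rtnl);
    return got_lock;
}
```

In this model, remove_aux_dev(true) corresponds to the path in the trace: the work item cannot take the lock while the remover holds it and waits for the work to finish. remove_aux_dev(false) corresponds to not holding the lock across the removal, where the work item gets the lock and the flush completes.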