On 02/27/2018 01:42 PM, Ben Greear wrote:
> On 02/27/2018 12:49 PM, Ben Greear wrote:
>> I notice I can reliably lock up the kernel if I rmmod ath10k while it is under
>> heavy tx/rx traffic. First, this causes the firmware to crash, and then right
>> after (or possibly during?) the related kernel threads deadlock.
>> This is with my hacked driver and hacked firmware. In particular, the
>> ath10k_debug_nop_dwork is something I added, though it is pretty trivial,
>> it does take the ar->conf_mutex. It appears blocked trying to get it.
>> It appears something is holding the ar->conf_mutex, but it is not clear to
>> me from the lockdep output what process actually holds it.
>> Anyone see a clue they could share?
>
> Changing how I start/stop the nop_dwork stuff seems to have made the
> problem go away, so I guess maybe that was the issue.

OK, so the problem still remains. The 'rmmod' process appears to be the
one that is really not making progress. Unfortunately, decoding
ath10k_pci_hif_stop+0x6f leads to an inline from bitops.h, which doesn't
tell me where it is actually stuck. Off to do more debugging...

[ 4037.220992] rmmod D 0 20267 3050 0x00000080
[ 4037.220995] Call Trace:
[ 4037.220997] __schedule+0x407/0xb70
[ 4037.220999] ? _raw_spin_unlock_irqrestore+0x4e/0x70
[ 4037.221003] schedule+0x38/0x90
[ 4037.221005] schedule_timeout+0x224/0x580
[ 4037.221007] ? retint_kernel+0x2d/0x2d
[ 4037.221010] ? call_timer_fn+0x370/0x370
[ 4037.221015] msleep+0x34/0x40
[ 4037.221017] ? msleep+0x34/0x40
[ 4037.221021] ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]
[ 4037.221032] ath10k_core_stop+0x4d/0x90 [ath10k_core]
[ 4037.221038] ath10k_halt+0x14b/0x1f0 [ath10k_core]
[ 4037.221044] ath10k_stop+0x36/0x80 [ath10k_core]
[ 4037.221059] drv_stop+0x58/0x2d0 [mac80211]
[ 4037.221075] ieee80211_stop_device+0x3e/0x50 [mac80211]
[ 4037.221088] ieee80211_do_stop+0x501/0x880 [mac80211]
[ 4037.221092] ? dev_deactivate_many+0x2b2/0x2f0
[ 4037.221105] ieee80211_stop+0x15/0x20 [mac80211]
[ 4037.221107] __dev_close_many+0x93/0xe0
[ 4037.221110] dev_close_many+0x7d/0x120
[ 4037.221114] dev_close.part.85+0x36/0x50
[ 4037.221116] dev_close+0x15/0x20
[ 4037.221155] cfg80211_shutdown_all_interfaces+0x44/0xc0 [cfg80211]
[ 4037.221168] ieee80211_remove_interfaces+0x42/0x1c0 [mac80211]
[ 4037.221180] ieee80211_unregister_hw+0x45/0x130 [mac80211]
[ 4037.221187] ath10k_mac_unregister+0x14/0x60 [ath10k_core]
[ 4037.221193] ath10k_core_unregister+0x3a/0xa0 [ath10k_core]
[ 4037.221197] ath10k_pci_remove+0x2d/0x70 [ath10k_pci]
[ 4037.221200] pci_device_remove+0x34/0xb0
[ 4037.221203] device_release_driver_internal+0x158/0x210
[ 4037.221206] driver_detach+0x3b/0x80
[ 4037.221208] bus_remove_driver+0x53/0xd0
[ 4037.221210] driver_unregister+0x27/0x40
[ 4037.221213] pci_unregister_driver+0x24/0x90
[ 4037.221216] ath10k_pci_exit+0x10/0x6ee [ath10k_pci]
[ 4037.221218] SyS_delete_module+0x1e1/0x2a0
[ 4037.221222] do_syscall_64+0x64/0x140
[ 4037.221225] entry_SYSCALL64_slow_path+0x25/0x25
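
For anyone who wants to pull the interesting frame out of a trace like the one above, here is a minimal sketch in Python. It only parses the text; it does not resolve anything to source lines. The names parse_frame and first_module_frame are made up for this example, and the regular expression assumes the "[ time] symbol+0xoff/0xsize [module]" layout shown in this trace. The resulting symbol+offset/size string could then be handed to a symbolizer such as the kernel tree's scripts/faddr2line, assuming it is present in this tree and the module was built with debug info.

```python
import re

# Matches one frame of a kernel "Call Trace:" dump, for example:
#   [ 4037.221021] ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]
# A leading "?" marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"(?P<guess>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def parse_frame(line):
    """Return a dict describing one trace frame, or None if the line is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),          # None for built-in kernel code
        "reliable": m.group("guess") is None,  # False for "? symbol+..." frames
    }


def first_module_frame(lines, module_prefix):
    """Return the first reliable frame whose module name starts with module_prefix."""
    for line in lines:
        frame = parse_frame(line)
        if (frame and frame["reliable"] and frame["module"]
                and frame["module"].startswith(module_prefix)):
            return frame
    return None


if __name__ == "__main__":
    trace = [
        "[ 4037.221015] msleep+0x34/0x40",
        "[ 4037.221017] ? msleep+0x34/0x40",
        "[ 4037.221021] ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]",
        "[ 4037.221032] ath10k_core_stop+0x4d/0x90 [ath10k_core]",
    ]
    f = first_module_frame(trace, "ath10k")
    # Print the frame in the symbol+offset/size form used in the trace.
    print("%s+%#x/%#x [%s]" % (f["symbol"], f["offset"], f["size"], f["module"]))
```

Skipping the "?" frames matters here because the second msleep entry is only a guess by the unwinder; the first frame that belongs to the driver is ath10k_pci_hif_stop.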
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com