On Wed, 2023-06-21 at 12:42 -0700, Jakub Kicinski wrote:
> Hi Johannes,
>
> Doesn't seem like netlink_set_err() wants to be called from just any
> context. Should we convert nl_table_lock to always be _bh ?

So as I was writing this Eric responded too :)

It does seem _bh wouldn't be sufficient then, the lockdep report below
also mentions IRQ disabling.

I'm not entirely sure this is needed, and I'm also not sure we really
need to fix it immediately. It's a very old bug, and one that's going to
be very difficult to actually hit a deadlock on in practice. The "CPU1"
part of the report is basically almost never happening.

This is why syzbot couldn't reproduce it. This code will always execute:

	spin_lock_irqsave(&local->queue_stop_reason_lock, flags);
	for (i = 0; i < IEEE80211_MAX_QUEUES; i++) {
		skb_queue_walk_safe(&local->pending[i], skb, tmp) {
			struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
			if (info->control.vif == &sdata->vif) {
				__skb_unlink(skb, &local->pending[i]);
				ieee80211_free_txskb(&local->hw, skb);
			}
		}
	}
	spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags);

However, pretty much *all* of the time there will be no SKBs on this
pending list that have a report to send out to userspace in
ieee80211_free_txskb -> ieee80211_report_used_skb:

	...
	} else if (info->ack_frame_id) {
		ieee80211_report_ack_skb(local, skb, acked, dropped,
					 ack_hwtstamp);

and really, ack_frame_id is rarely set, and in addition it's pretty
unlikely that a frame would still be on the queue when the interface is
set down. That's also not a very common operation (it's not like you try
to do that many times per second), so ...

Another approach might be to just disentangle that loop I pasted above
out from the queue_stop_reason_lock: we can move the SKBs to a separate
list before freeing them.

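To make the "separate list" idea concrete, here is a minimal userspace
model of it. This is only a sketch and not a proposed patch: struct frame,
struct pending_queue, release_frame() and flush_owner() are hypothetical
stand-ins for the skb, local->pending[i] plus queue_stop_reason_lock,
ieee80211_free_txskb() and the loop above, and a pthread mutex stands in
for the IRQ-disabling spinlock. In the kernel, the private list would
presumably be an on-stack struct sk_buff_head, filled with
__skb_queue_tail() and drained with __skb_dequeue().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb sitting on a pending queue. */
struct frame {
	struct frame *next;
	int owner;		/* models info->control.vif */
};

/* Hypothetical stand-in for local->pending[i] and its lock. */
struct pending_queue {
	pthread_mutex_t lock;	/* models queue_stop_reason_lock */
	struct frame *head;
};

/* Models ieee80211_free_txskb(): may do work that must not run under the lock. */
static void release_frame(struct frame *f)
{
	free(f);
}

static void queue_add(struct pending_queue *q, int owner)
{
	struct frame *f = malloc(sizeof(*f));

	assert(f != NULL);
	f->owner = owner;
	pthread_mutex_lock(&q->lock);
	f->next = q->head;
	q->head = f;
	pthread_mutex_unlock(&q->lock);
}

static int queue_len(struct pending_queue *q)
{
	int n = 0;

	pthread_mutex_lock(&q->lock);
	for (struct frame *f = q->head; f; f = f->next)
		n++;
	pthread_mutex_unlock(&q->lock);
	return n;
}

/* Drop all frames belonging to 'owner'; returns how many were released. */
static int flush_owner(struct pending_queue *q, int owner)
{
	struct frame *to_free = NULL;	/* private list, needs no locking */
	struct frame **pp;
	int n = 0;

	/* Phase 1: only unlink matching frames while holding the lock. */
	pthread_mutex_lock(&q->lock);
	pp = &q->head;
	while (*pp) {
		struct frame *f = *pp;

		if (f->owner == owner) {
			*pp = f->next;
			f->next = to_free;
			to_free = f;
		} else {
			pp = &f->next;
		}
	}
	pthread_mutex_unlock(&q->lock);

	/* Phase 2: release them after the lock has been dropped. */
	while (to_free) {
		struct frame *f = to_free;

		to_free = f->next;
		release_frame(f);
		n++;
	}
	return n;
}
```

The point of the two phases is that whatever the release path ends up
doing (in the real code, possibly a netlink report to userspace) no
longer runs with the lock held, so it cannot contribute to the lock
ordering that lockdep complained about.
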
That said, it does stand to reason that if nlmsg_multicast() or
netlink_broadcast() can be called with a gfp_t argument then it should
be possible to call it with IRQs disabled or under a lock such as this
example, so perhaps Eric's patch really is the right thing to do here,
to avoid this potential pitfall in the future.

johannes