On Fri, Jan 29, 2021 at 5:36 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> I'm still struggling to wrap my head around this.
>
> Did you test your code with lockdep enabled? Which Qdisc are you using?
> You're queuing the frames back to the interface they came from - won't
> that cause locking issues?

Hmm... Thanks for bringing this to my attention. I did indeed find an
issue when the "noqueue" qdisc is used.

When using a qdisc other than "noqueue", when sending an skb:

"__dev_queue_xmit" will call "__dev_xmit_skb";
"__dev_xmit_skb" will call "qdisc_run_begin" to mark the beginning of a
qdisc run. If the qdisc is already running, "qdisc_run_begin" will
fail, and "__dev_xmit_skb" will just enqueue this skb without starting
a qdisc run. There is no problem.

When using "noqueue" as the qdisc, when sending an skb:

"__dev_queue_xmit" will try to send this skb directly. Before it does
that, it will first check "txq->xmit_lock_owner" and will find that the
current cpu already owns the xmit lock. It will then print the warning
message "Dead loop on virtual device ..." and drop the skb.

A solution would be to queue the outgoing L2 frames in this driver
first, and then use a tasklet to send them to the qdisc TX queue.

Thanks! I'll make changes to fix this.
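
To make the "noqueue" case concrete, here is a small userspace model of it.
This is only a sketch and not kernel code: "model_dev", "model_xmit",
"model_tasklet" and "reply_mode" are hypothetical names invented for the
illustration, and the only thing borrowed from the kernel is the idea of
comparing an "xmit_lock_owner" field against the current cpu. It assumes the
driver generates one extra L2 frame while it is transmitting, and compares
sending that frame directly with deferring it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_OWNER    (-1)
#define PENDING_MAX 16

/* How the modeled driver emits the L2 frame it generates while transmitting. */
enum reply_mode { REPLY_NONE, REPLY_DIRECT, REPLY_DEFERRED };

/* Hypothetical stand-in for a device with a single TX queue. */
struct model_dev {
    int    xmit_lock_owner;      /* cpu holding the xmit lock, or NO_OWNER */
    int    sent;                 /* frames that reached the "hardware" */
    int    dropped;              /* frames rejected by the recursion check */
    int    pending[PENDING_MAX]; /* frames waiting for the deferred sender */
    size_t npending;
};

/*
 * Models a direct ("noqueue"-style) transmit: if this cpu already owns the
 * xmit lock, the frame is dropped instead of recursing into the driver.
 */
static bool model_xmit(struct model_dev *dev, int cpu, int frame,
                       enum reply_mode mode)
{
    if (dev->xmit_lock_owner == cpu) {
        dev->dropped++;          /* the "Dead loop on virtual device" case */
        return false;
    }

    dev->xmit_lock_owner = cpu;  /* take the xmit lock */
    dev->sent++;                 /* the frame itself goes out */

    if (mode == REPLY_DIRECT)
        /* Re-enters the transmit path while the lock is still held. */
        model_xmit(dev, cpu, frame + 1, REPLY_NONE);
    else if (mode == REPLY_DEFERRED && dev->npending < PENDING_MAX)
        /* Only queue the generated frame; send it after the lock is gone. */
        dev->pending[dev->npending++] = frame + 1;

    dev->xmit_lock_owner = NO_OWNER; /* release the xmit lock */
    return true;
}

/* Models the deferred sender: runs later, outside the xmit lock. */
static int model_tasklet(struct model_dev *dev, int cpu)
{
    int ok = 0;

    for (size_t i = 0; i < dev->npending; i++)
        if (model_xmit(dev, cpu, dev->pending[i], REPLY_NONE))
            ok++;
    dev->npending = 0;
    return ok;
}
```

In this model, calling "model_xmit" with REPLY_DIRECT sends the first frame
and drops the generated one, because the owner check sees the same cpu.
With REPLY_DEFERRED the generated frame is only queued, and a later
"model_tasklet" call sends it once the lock has been released.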