This bug became exposed by the new behaviour of netif_rx(). When BHs are
enabled, hardirqs are enabled too (for a moment) which causes the Rx
interrupt to be handled before a request is enqueued. If there are enough
such unhandled packets, the EP gets locked up.

> > Replacing netif_rx() with __netif_rx() apparently fixes this part, as it
> > does not lead to any change of hardirq state. There is still one problem
> > though: rx_complete() is usually called from the interrupt context, except
> > when the network interface is brought up.
>
> __netif_rx() has an assert which should complain if you use
> __netif_rx(). Further in this case you pass the skb to backlog but never
> kick it for processing. Which means it is delayed until a random
> interrupt notices and processes it.

Now I see that it was a bad idea. I just found this using git bisect.

> > I think one solution would be to make musb_g_giveback() use
> > spin_unlock_irqrestore() and spin_lock_irqsave(), but I would need to pass
> > the flags to it somehow. Also, I am not sure how that would influence other
> > drivers using musb.
>
> I would also suggest to do this since the other solution is not safe/
> correct. There is the ->busy assignment which should cover for the most
> cases. If you drop the lock without enabling interrupts then the
> interrupt can't do anything to the EP and other enqueue/ dequeue
> invocation is not possible if run on UP. On the other hand am335x was
> used on PREEMPT_RT and it runs a UP machine into SMP so that should be
> covered :)
>
> While looking at it, dequeue/ enqueue during complete callback looks
> safe due to the busy flag.

I think it is not needed now. After I modified the interrupt handling
code to clear the RXPKTRDY flag if there is no request queued and the FIFO
is full, neither __local_bh_enable_ip() nor lockdep complains (tested on
SMP and UP, with and without PREEMPT, on AM3358 and A64).

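Purely as an illustration of the idea (this is not the patch and not MUSB
driver code), the behaviour can be modelled in plain user-space C roughly
as below. All names (ep_model, hw_receive, rx_irq, drop_when_full) and the
two-slot FIFO depth are made up for the sketch, and it assumes that
clearing RXPKTRDY without unloading the FIFO discards one packet.

```c
#include <assert.h>
#include <stdbool.h>

#define FIFO_SLOTS 2 /* assumption: double-buffered Rx FIFO */

/* Hypothetical toy model of one Rx endpoint; not the real driver state. */
struct ep_model {
	int queued_requests; /* requests enqueued by the gadget driver */
	int fifo_packets;    /* packets currently sitting in the FIFO */
	int delivered;       /* packets handed to a request */
	int dropped;         /* packets discarded by clearing RXPKTRDY */
	bool drop_when_full; /* true: clear RXPKTRDY when full and idle */
};

/* RXPKTRDY is modelled simply as "the FIFO holds at least one packet". */
bool rxpktrdy(const struct ep_model *ep)
{
	return ep->fifo_packets > 0;
}

/* The host sends a packet; returns false if the FIFO is full (EP stalls). */
bool hw_receive(struct ep_model *ep)
{
	assert(ep->fifo_packets >= 0 && ep->fifo_packets <= FIFO_SLOTS);
	if (ep->fifo_packets == FIFO_SLOTS)
		return false;
	ep->fifo_packets++;
	return true;
}

/* Model of the Rx interrupt handler. */
void rx_irq(struct ep_model *ep)
{
	if (!rxpktrdy(ep))
		return;

	if (ep->queued_requests > 0) {
		/* Normal case: unload one packet into the queued request. */
		ep->queued_requests--;
		ep->fifo_packets--;
		ep->delivered++;
	} else if (ep->drop_when_full && ep->fifo_packets == FIFO_SLOTS) {
		/* No request and FIFO full: clear RXPKTRDY, losing a packet. */
		ep->fifo_packets--;
		ep->dropped++;
	}
	/* Otherwise the packet stays in the FIFO until a request shows up. */
}
```

With drop_when_full unset, two interrupts that arrive before any request
is queued fill the FIFO and hw_receive() starts failing, which is the toy
version of the lock-up. With it set, one packet is lost but the endpoint
keeps accepting data.
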
It would probably be nicer to ensure that no MUSB interrupts are handled
when a MUSB request callback is invoked from musb_g_giveback() (e.g. by
disabling MUSB interrupts before releasing the lock and enabling them
after acquiring it), but that could cause some side effects if the
callback relied on MUSB interrupts being enabled. And since there are no
warnings and everything works... I guess it is time to submit another
patch then and to forget about this one.

Thank you for your time!

-- 
Hubert Wiśniewski <hubert.wisniewski.25632@xxxxxxxxx>