On Mon, 2024-10-28 at 10:48 -0700, Ben Greear wrote:
>
> We see indication that the iwlwifi txpath can busy-spin,
> causing soft-lockup (and, only indication at this point, possibly
> issue is elsewhere somehow).

In the TX path I'm not aware of any issues, but we did have this
recently:
https://bugzilla.kernel.org/show_bug.cgi?id=219375

> But, I also wanted to check on expected behaviour. At the bottom is a double
> loop. The inner will break out if the queues are full and for some other reasons,
> but the outside loop is spinning on a different atomic counter. The question is:
> If the inner loop breaks out, at least for queue full reasons, should it then
> immediately break out of the outer while loop as well?

It shouldn't matter, but off the top of my head I'd say it's valid to
break out entirely, since the "queue no longer full" indication will
restart it.

In fact, it seems that would make more sense than the "sofar" thing you
added. There's not much value in retrying if the queue is full anyway?

> And, from what I can tell, it would be possible for other transmitters to hit this
> path,

Not really? It should only get here from two places: userspace
(serialized, so you're not going to get to this point with two threads
from there), and the "queue no longer full" logic I mentioned above.

Oh, maybe technically a third, at the beginning after allocating a new
queue.

> Based on the description of the 3 tx_request states, I am also not sure that
> this would not hang the tx path in case where inner loop bails out due to
> tx queue full, leaving packets queued. If no other packets are ever transmitted,
> is there anything that would re-kick the xmit path?

If the queue becomes "not full", then yes, that kicks it again.

I guess I could sort of see a scenario where

 - queues got full
 - queues got not full
 - we kick this logic via "queue not full"
 - while this is running, userspace TX keeps bumping tx_request from
   1 to 2, this loop decrements it again, etc.
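
To make the handshake above concrete, here is a rough userspace model of
the three tx_request states and the double loop. It is only a sketch, not
the driver code: struct txq_model, tx_kick(), the pending/hw_space
counters and the bail_when_full flag are all made up for illustration,
and it uses C11 atomics rather than the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-queue TX state; not the driver's struct. */
struct txq_model {
	atomic_int tx_request;	/* 0 idle, 1 transmitting, 2 transmitting + re-check wanted */
	int pending;		/* frames waiting in the software queue */
	int hw_space;		/* free slots in the hardware queue */
	int passes;		/* outer-loop iterations, for illustration only */
};

/* Add 1 to *v unless it is already 2; return the previous value. */
static int fetch_add_unless_2(atomic_int *v)
{
	int old = atomic_load(v);

	/* On CAS failure 'old' is refreshed, so the != 2 check is redone. */
	while (old != 2 && !atomic_compare_exchange_weak(v, &old, old + 1))
		;
	return old;
}

/* Returns true if the caller became the transmitter. */
static bool tx_kick(struct txq_model *q, bool bail_when_full)
{
	int prev = fetch_add_unless_2(&q->tx_request);

	assert(prev >= 0 && prev <= 2);
	if (prev != 0)
		return false;	/* someone else is transmitting and will re-check */

	for (;;) {
		q->passes++;

		/* Inner loop: stops when the HW queue is full or nothing is left. */
		while (q->hw_space > 0 && q->pending > 0) {
			q->pending--;
			q->hw_space--;
		}

		if (bail_when_full && q->hw_space == 0) {
			/*
			 * Assumed variant: leave entirely and rely on the
			 * "queue no longer full" event to restart TX.
			 */
			atomic_store(&q->tx_request, 0);
			break;
		}

		/* Outer loop: go around again only if another request came in. */
		if (atomic_fetch_sub(&q->tx_request, 1) - 1 == 0)
			break;
	}
	return true;
}
```

With a single caller the outer loop runs exactly once. The repeated
iterations in the scenario above need a second context that moves
tx_request from 1 to 2 each time the transmitter is about to decrement
it, which this model does not try to reproduce.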

What thread is the soft lockup in that you see?

johannes