On Tue, Nov 17, 2020 at 9:59 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 17 2020 at 16:49, wi nk wrote:
> > On Sun, Nov 15, 2020 at 8:55 PM wi nk <wink@xxxxxxxxxxx> wrote:
> > So up until this point, everything is working without issues.
> > Everything seems to spiral out of control a couple of seconds later
> > when my system attempts to actually bring up the adapter. In most of
> > the crash states I will see this:
> >
> > [ 31.286725] wlp85s0: send auth to ec:08:6b:27:01:ea (try 1/3)
> > [ 31.390187] wlp85s0: send auth to ec:08:6b:27:01:ea (try 2/3)
> > [ 31.391928] wlp85s0: authenticated
> > [ 31.394196] wlp85s0: associate with ec:08:6b:27:01:ea (try 1/3)
> > [ 31.396513] wlp85s0: RX AssocResp from ec:08:6b:27:01:ea
> > (capab=0x411 status=0 aid=6)
> > [ 31.407730] wlp85s0: associated
> > [ 31.434354] IPv6: ADDRCONF(NETDEV_CHANGE): wlp85s0: link becomes ready
> >
> > And then either somewhere in that pile of messages, or a second or two
> > after this my machine will start to stutter as I mentioned before, and
> > then it either hangs, or I see this message (I'm truncating the
> > timestamp):
> >
> > [ 35.xxxx ] sched: RT throttling activated
>
> As this driver uses threaded interrupts, this looks like an interrupt
> storm and the interrupt thread consumes the CPU fully. The RT throttler
> limits the RT runtime of it which allows other tasks make some
> progress. That's what you observe as stutter.
>
> You can apply the hack below so the irq thread(s) run in the SCHED_OTHER
> class which prevents them from monopolizing the CPU. That might make the
> problem simpler to debug.
>
> Thanks,
>
>         tglx
> ---
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index c460e0496006..8473ecacac7a 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1320,7 +1320,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
>  	if (IS_ERR(t))
>  		return PTR_ERR(t);
>
> -	sched_set_fifo(t);
> +	//sched_set_fifo(t);
>
>  	/*
>  	 * We keep the reference to the task struct even if

I was able to apply this patch and play with it a little.
Unfortunately, the behavior is still mostly the same. The patch seems
to extend the 'stuttering' phase a little, but the end result is still
an unresponsive machine. I haven't had much time to experiment yet, but
the extra time before the hang may make it possible to finally issue
sysrq-c and get a vmcore dump. I also tried to replicate a Google
Android patch I found that basically calls BUG() when RT throttling
activates
(https://groups.google.com/a/chromium.org/g/chromium-os-reviews/c/NDyPucYrvRY),
but that path hasn't triggered since I booted with it. I'll hopefully
have another chance this evening.
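
One way to check the interrupt-storm theory from userspace during the
stutter window is to sample /proc/interrupts twice and look at which
IRQ counters grow fastest. The script below is only a rough sketch of
that idea and is not something posted in this thread: the function
names, the one-second interval, and the top-5 cutoff are arbitrary
choices for illustration, and it assumes the usual /proc/interrupts
layout (a header row of CPU columns, then one "label: counts...
description" row per interrupt).

```python
#!/usr/bin/env python3
"""Sample /proc/interrupts twice and print the fastest-growing IRQs.

Illustrative sketch only; names and defaults are assumptions.
"""
import sys
import time


def parse_interrupts(text):
    """Return {irq_label: (total_count, description)} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        per_cpu = []
        # Rows such as ERR/MIS carry fewer counters than there are CPUs.
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            per_cpu.append(int(field))
        desc = " ".join(fields[len(per_cpu):])
        counts[label.strip()] = (sum(per_cpu), desc)
    return counts


def top_deltas(before, after, limit=5):
    """Return up to `limit` (irq, delta, description) tuples, largest delta first."""
    deltas = []
    for irq, (total, desc) in after.items():
        delta = total - before.get(irq, (0, ""))[0]
        if delta > 0:
            deltas.append((irq, delta, desc))
    deltas.sort(key=lambda item: item[1], reverse=True)
    return deltas[:limit]


def main(interval=1.0):
    """Take two samples `interval` seconds apart and print per-second rates."""
    try:
        with open("/proc/interrupts") as f:
            before = parse_interrupts(f.read())
        time.sleep(interval)
        with open("/proc/interrupts") as f:
            after = parse_interrupts(f.read())
    except OSError as exc:
        print(f"cannot read /proc/interrupts: {exc}", file=sys.stderr)
        return
    for irq, delta, desc in top_deltas(before, after):
        print(f"{irq:>6} {delta / interval:>12.0f}/s  {desc}")


if __name__ == "__main__":
    main()
```

If one line dominates by orders of magnitude right after the adapter
associates, that would point at the storming interrupt. Which line
belongs to the wifi device has to be read off the description column
on the affected machine.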