On Mon, Feb 6, 2017 at 9:30 AM, Gabriel C <nix.or.die@xxxxxxxxx> wrote:
>
> Somewhat late, however I didn't test 4.9.6 but jumped from 4.9.5 to 4.9.7
> and found out my box won't boot anymore.
>
> It hangs early and freezes with a lot of RCU warnings.
> Since I cannot set up a netconsole right now I cannot post the errors,
> really sorry.
>
> ( but I could make a picture if needed )
>
> I bisected it down to:
>
>> Ruslan Ruslichenko (1):
>>       x86/ioapic: Restore IO-APIC irq_chip retrigger callback

Ok, it's 020eb3daaba2 ("x86/ioapic: Restore IO-APIC irq_chip retrigger
callback") in mainline.

> Reverting this one fixes the problem for me..

Since that came in rather late, I suspect we'll have to revert for now.
The thing it fixes has been around for almost two years, so it can't be
as serious a problem as the fix itself ended up being.

Thomas?

That said, it also strikes me that the implicated
irq_chip_retrigger_hierarchy() function looks really very suspicious
indeed.

Most of the other users don't seem to traverse the parent all the way
until they find something. They just do the operation in the parent,
and if the parent needs it, it might then do it in _its_ parent and so
on. And the compiler is able to turn the parent call into a tail call
so it doesn't cause a stack use explosion even if the parenthood
chains end up being pretty deep.

So I'm wondering if that for-loop triggers a stack overflow on your
setup somehow, just because that irq_retrigger() call is now truly
recursive, and hasn't been turned into tail-calls.

But for now, I'd be inclined to just revert it unless somebody has a
"Duh!" moment and can tell me what's wrong with that commit with an
obvious fix.

Comments?

                Linus
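
For readers unfamiliar with the two calling styles contrasted in the message above, here is a minimal userspace sketch. It is not the kernel code: struct level, retrigger_walk(), retrigger_parent() and hw_retrigger() are hypothetical names made up for illustration, and the struct only loosely stands in for one level of an interrupt hierarchy. The first helper walks up the chain in a for-loop until it finds a level with a callback; the second only ever hands the request to its direct parent, with the call in tail position so that an optimizing compiler may (but is not required to) turn it into a jump.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for one level of an interrupt hierarchy.
 * This is NOT the kernel's struct irq_data; it exists only to
 * show the two calling styles.
 */
struct level {
	struct level *parent;                 /* NULL at the top of the chain */
	int (*retrigger)(struct level *self); /* NULL if this level has no callback */
	int hits;                             /* how many times hw_retrigger() ran here */
};

/*
 * Style A: walk up the chain until some ancestor has a callback,
 * then call it. If that callback is itself this function, each
 * level adds another loop plus another call.
 */
int retrigger_walk(struct level *l)
{
	for (l = l->parent; l; l = l->parent)
		if (l->retrigger)
			return l->retrigger(l);
	return 0;
}

/*
 * Style B: do the operation in the direct parent only. If the
 * parent needs its own parent, that is the parent's business.
 * The call is the last thing the function does, so it is a
 * candidate for tail-call optimization.
 */
int retrigger_parent(struct level *l)
{
	struct level *p = l->parent;

	if (!p || !p->retrigger)
		return 0;
	return p->retrigger(p);
}

/* A callback that "really" does the work at the top of the chain. */
int hw_retrigger(struct level *self)
{
	self->hits++;
	return 1;
}
```

With a three-level chain (leaf, middle, top), style B reaches the top only if the middle level forwards the request itself, while style A skips over a middle level that has no callback. Whether either style actually explains the boot hang reported above is not something this sketch can show.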