On Sat, May 15 2021 at 15:09, Peter Zijlstra wrote:
> On Sat, May 15, 2021 at 01:23:02AM +0200, Thomas Gleixner wrote:
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -691,7 +691,9 @@ void flush_smp_call_function_from_idle(v
>>  	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->idle, CFD_SEQ_NOCPU,
>>  		      smp_processor_id(), CFD_SEQ_IDLE);
>>  	local_irq_save(flags);
>> +	lockdep_set_softirq_raise_safe();
>>  	flush_smp_call_function_queue(true);
>> +	lockdep_clear_softirq_raise_safe();
>>  	if (local_softirq_pending())
>>  		do_softirq();
>
> I think it might make more sense to raise hardirq_count() in/for
> flush_smp_call_function_queue() callers that aren't already from hardirq
> context. That's this site and smpcfd_dying_cpu().
>
> Then we can do away with this new special case.

Right.

Though I just checked smpcfd_dying_cpu(). That one does not run
softirqs after flushing the function queue, and it can't do that because
that's in the CPU dying phase with interrupts disabled, where the CPU is
already half torn down.

Especially as softirq processing enables interrupts, which might cause
even more havoc.

Anyway, how is it safe to run arbitrary functions there after the CPU
removed itself from the online mask? That's daft, to put it mildly.

Thanks,

        tglx