On 09/28/2016 11:43 PM, Peter Zijlstra wrote:
> On Wed, Sep 28, 2016 at 06:20:29PM -0700, Vineet Gupta wrote:
>> On 09/28/2016 03:26 PM, Andy Lutomirski wrote:
>
>
>	user	irq	nmi
>
>	 |
>	 |
>	 `-----> .
>		 |
>		 |
>		 `-----> .
>			 |
>			 |
>		 . <-----'
>	 . <-----'
>	 |
>	 |
>
> So what Andy is saying is that NMI context never sets TIF_NEED_RESCHED,

Can we be absolutely sure about that? A perf intr, vmalloc-based mmap can go
through various hoops. Is it not possible that it hits a reschedule, setting
TIF_NEED_RESCHED?

> this means that return from NMI never needs to check for preemption
> etc..

I don't think this follows from the previous one. In my example, the timer
interrupt sets TIF_NEED_RESCHED, and in irq_exit -> __do_softirq() it hits
the perf intr.

> Now your return from IRQ obviously should, the normal way. If the IRQ
> return gets interrupted by the NMI nothing special should occur. The
> return from NMI should simply resume the return from IRQ.
>
> So I'm a little confused by your timer interrupt example, it _should_ do
> the preemption, the nested interrupt (NMI) will return to the regular
> interrupt which should resume its normal return preemption or not.

So let's first see how a single-priority intr works on ARC (maybe on other
arches as well).

1. task t1 enters the kernel via syscall (Trap Exception on ARC); the handler
   drops down to pure kernel mode and proceeds into the syscall handler.

2. while in the handler, some intr is taken, which causes a reschedule to
   task t2.

3. t2's control flow returns (say it was in a syscall when originally
   scheduled out). It needs to return to user mode, but the cpu needs to
   return from the active interrupt. So we return to user mode, "riding" the
   intr return path. This means the intr in step #2 returns to a different PC
   and execution mode (user vs. kernel etc).

Now the same scheme doesn't work out of the box when you have intr and nmi.
We have to actively ensure that nmi doesn't lead to a __schedule() sans user code.
And this is done by bumping preempt_count by NMI_OFFSET on entry to the nmi handler.
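
To make the idea concrete, here is a small userspace toy model in C of the scheme described above. It is only a sketch: the toy_* names, the struct, and the shift values are made up for illustration and are not the kernel's or ARC's actual code. It shows that while the count is bumped by NMI_OFFSET, the return-path check refuses to reschedule, and the pending reschedule is picked up only once the interrupted IRQ return resumes.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit layout only; the real kernel values may differ. */
#define HARDIRQ_SHIFT  16
#define NMI_SHIFT      20
#define HARDIRQ_OFFSET (1u << HARDIRQ_SHIFT)
#define NMI_OFFSET     (1u << NMI_SHIFT)

/* Hypothetical per-cpu state for the toy model. */
struct toy_cpu {
	unsigned int preempt_count;
	bool need_resched;      /* stands in for TIF_NEED_RESCHED */
	unsigned int schedules; /* how many times we "called __schedule()" */
};

static void toy_irq_enter(struct toy_cpu *c) { c->preempt_count += HARDIRQ_OFFSET; }
static void toy_irq_exit(struct toy_cpu *c)  { c->preempt_count -= HARDIRQ_OFFSET; }

static void toy_nmi_enter(struct toy_cpu *c) { c->preempt_count += NMI_OFFSET; }

static void toy_nmi_exit(struct toy_cpu *c)
{
	assert(c->preempt_count & NMI_OFFSET); /* must be inside an NMI */
	c->preempt_count -= NMI_OFFSET;
}

/* Return-path check: reschedule only if flagged and the count is zero. */
static bool toy_return_check(struct toy_cpu *c)
{
	if (c->need_resched && c->preempt_count == 0) {
		c->need_resched = false;
		c->schedules++;
		return true;
	}
	return false;
}

/* Timer irq requests a reschedule; an NMI lands on the irq return path. */
static unsigned int toy_nested_scenario(struct toy_cpu *c, bool *sched_in_nmi)
{
	toy_irq_enter(c);
	c->need_resched = true;               /* timer irq wants a reschedule */
	toy_irq_exit(c);                      /* irq return path begins */

	toy_nmi_enter(c);                     /* NMI interrupts the irq return */
	*sched_in_nmi = toy_return_check(c);  /* blocked by NMI_OFFSET */
	toy_nmi_exit(c);

	toy_return_check(c);                  /* irq return resumes, preempts */
	return c->schedules;
}
```

In this model the NMI return never schedules, and the one reschedule happens on the resumed IRQ return, which matches the "return from NMI should simply resume the return from IRQ" picture in the quoted diagram.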