> > > > On Wed, 10 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > > On Wed, 10 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > > > > > > On Wed, 10 Feb 2021, Song Bao Hua (Barry Song) wrote: > > > > > > > > > > > > > > There is no warning from m68k builds. That's because > > > > > > > > arch_irqs_disabled() returns true when the IPL is non-zero. > > > > > > > > > > > > > > So for m68k, the case is arch_irqs_disabled() is true, but > > > > > > > interrupts can still come? > > > > > > > > > > > > > > Then it seems it is very confusing. If prioritized interrupts > > > > > > > can still come while arch_irqs_disabled() is true, > > > > > > > > > > > > Yes, on m68k CPUs, an IRQ having a priority level higher than the > > > > > > present priority mask will get serviced. > > > > > > > > > > > > Non-Maskable Interrupt (NMI) is not subject to this rule and gets > > > > > > serviced regardless. > > > > > > > > > > > > > how could spin_lock_irqsave() block the prioritized interrupts? > > > > > > > > > > > > It raises the the mask level to 7. Again, please see > > > > > > arch/m68k/include/asm/irqflags.h > > > > > > > > > > Hi Finn, > > > > > Thanks for your explanation again. > > > > > > > > > > TBH, that is why m68k is so confusing. irqs_disabled() on m68k > > > > > should just reflect the status of all interrupts have been disabled > > > > > except NMI. > > > > > > > > > > irqs_disabled() should be consistent with the calling of APIs such > > > > > as local_irq_disable, local_irq_save, spin_lock_irqsave etc. > > > > > > > > > > > > > When irqs_disabled() returns true, we cannot infer that > > > > arch_local_irq_disable() was called. But I have not yet found driver > > > > code or core kernel code attempting that inference. > > > > > > > > > > > > > > > > > Isn't arch_irqs_disabled() a status reflection of irq disable > > > > > > > API? > > > > > > > > > > > > > > > > > > > Why not? > > > > > > > > > > If so, arch_irqs_disabled() should mean all interrupts have been > > > > > masked except NMI as NMI is unmaskable. > > > > > > > > > > > > > Can you support that claim with a reference to core kernel code or > > > > documentation? (If some arch code agrees with you, that's neither here > > > > nor there.) > > > > > > I think those links I share you have supported this. Just you don't > > > believe :-) > > > > > > > Your links show that the distinction between fast and slow handlers was > > removed. Your links don't support your claim that "arch_irqs_disabled() > > should mean all interrupts have been masked". Where is the code that makes > > that inference? Where is the documentation that supports your claim? > > (1) > https://lwn.net/Articles/380931/ > Looking at all these worries, one might well wonder if a system which *disabled > interrupts for all handlers* would function well at all. So it is interesting > to note one thing: any system which has the lockdep locking checker enabled > has been running all handlers that way for some years now. Many developers > and testers run lockdep-enabled kernels, and they are available for some of > the more adventurous distributions (Rawhide, for example) as well. So we > have quite a bit of test coverage for this mode of operation already. > > (2) > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/ > ?id=b738a50a > > "We run all handlers *with interrupts disabled* and expect them not to > enable them. Warn when we catch one who does." > > (3) > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/ > ?id=e58aa3d2d0cc > genirq: Run irq handlers *with interrupts disabled* > > Running interrupt handlers with interrupts enabled can cause stack > overflows. That has been observed with multiqueue NICs delivering all > their interrupts to a single core. We might band aid that somehow by > checking the interrupt stacks, but the real safe fix is to *run the irq > handlers with interrupts disabled*. > > > All these documents say we are running irq handler with interrupts > disabled. but it seems you think high-prio interrupts don't belong > to "interrupts" in those documents :-) > > that is why we can't get agreement. I think "interrupts" mean > all except NMI in these documents, but you insist high-prio IRQ > is an exception. > > > > > > > > > > > > > > > > > > > Are all interrupts (including NMI) masked whenever > > > > > > arch_irqs_disabled() returns true on your platforms? > > > > > > > > > > On my platform, once irqs_disabled() is true, all interrupts are > > > > > masked except NMI. NMI just ignore spin_lock_irqsave or > > > > > local_irq_disable. > > > > > > > > > > On ARM64, we also have high-priority interrupts, but they are > > > > > running as PESUDO_NMI: > > > > > https://lwn.net/Articles/755906/ > > > > > > > > > > > > > A glance at the ARM GIC specification suggests that your hardware > > > > works much like 68000 hardware. > > > > > > > > When enabled, a CPU interface takes the highest priority pending > > > > interrupt for its connected processor and determines whether the > > > > interrupt has sufficient priority for it to signal the interrupt > > > > request to the processor. [...] > > > > > > > > When the processor acknowledges the interrupt at the CPU interface, > > > > the Distributor changes the status of the interrupt from pending to > > > > either active, or active and pending. At this point the CPU > > > > interface can signal another interrupt to the processor, to preempt > > > > interrupts that are active on the processor. If there is no pending > > > > interrupt with sufficient priority for signaling to the processor, > > > > the interface deasserts the interrupt request signal to the > > > > processor. > > > > > > > > https://developer.arm.com/documentation/ihi0048/b/ > > > > > > > > Have you considered that Linux/arm might benefit if it could fully > > > > exploit hardware features already available, such as the interrupt > > > > priority masking feature in the GIC in existing arm systems? > > > > > > I guess no:-) there are only two levels: IRQ and NMI. Injecting a > > > high-prio IRQ level between them makes no sense. > > > > > > To me, arm64's design is quite clear and has no any confusion. > > > > > > > Are you saying that the ARM64 hardware design is confusing because it > > implements a priority mask, and that's why you had to simplify it with a > > pseudo-nmi scheme in software? > > No, I was not saying this. I think both m68k and arm64 have good hardware > design. Just Linux's implementation is running irq-handlers with interrupts > disabled. So ARM64's pseudo-nmi is adapted to Linux better. > > > > > > > > > > > > On m68k, it seems you mean: > > > > > irq_disabled() is true, but high-priority interrupts can still come; > > > > > local_irq_disable() can disable high-priority interrupts, and at that > > > > > time, irq_disabled() is also true. > > > > > > > > > > TBH, this is wrong and confusing on m68k. > > > > > > > > > > > > > Like you, I was surprised when I learned about it. But that doesn't mean > > > > it's wrong. The fact that it works should tell you something. > > > > > > > > > > The fact is that m68k lets arch_irq_disabled() return true to pretend > > > all IRQs are disabled while high-priority IRQ is still open, thus "pass" > > > all sanitizing check in genirq and kernel core. > > > > > > > The fact is that m68k has arch_irq_disabled() return false when all IRQs > > are enabled. So there is no bug. > > But it has arch_irq_disabled() return true while some interrupts(not NMI) > are still open. > > > > > > > Things could always be made simpler. But discarding features isn't > > > > necessarily an improvement. > > > > > > This feature could be used by calling local_irq_enable_in_hardirq() in > > > those IRQ handlers who hope high-priority interrupts to preempt it for a > > > while. > > > > > > > So, if one handler is sensitive to interrupt latency, all other handlers > > should be modified? I don't think that's workable. > > I think we just enable preempt_rt or force threaded_irq, and then improve > the priority of the irq thread who is sensitive to latency. No need to > touch all threads. > > I also understand your point, we let one high-prio interrupt preempt > low priority interrupt, then we don't need to change the whole system. > But I think Linux prefers the method of threaded_irq or preempt_rt > for this kind of problems. > > > > > In anycase, what you're describing is a completely different nested > > interrupt scheme that would defeat the priority level mechanism that the > > hardware provides us with. > > Yes. Indeed. > > > > > > It shouldn't hide somewhere and make confusion. > > > > > > > The problem is hiding so well that no-one has found it! I say it doesn't > > exist. > > Long long ago(before 2.6.38), we had a kernel supporting IRQF_DISABLED and > nested interrupts were widely supported, but system also ran well in most > cases. That means nested interrupts don't really matter in most cases. > That is why m68k is also running well even though it is still nesting. > > > > > > On the other hand, those who care about realtime should use threaded IRQ > > > and let IRQ threads preempt each other. > > > > > > > Yes. And those threads also have priority levels. > > Finn, I am not a m68k guy, would you help check if this could activate a > warning on m68k. maybe we can discuss this question in genirq maillist from > this warning if you are happy. Thanks very much. > > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h > index 7c9d6a2d7e90..b8ca27555c76 100644 > --- a/include/linux/hardirq.h > +++ b/include/linux/hardirq.h > @@ -32,6 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void) > */ > #define __irq_enter() \ > do { \ > + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested > interrupts\n"); \ > preempt_count_add(HARDIRQ_OFFSET); \ > lockdep_hardirq_enter(); \ > account_hardirq_enter(current); \ > @@ -44,6 +45,7 @@ static __always_inline void rcu_irq_enter_check_tick(void) > */ > #define __irq_enter_raw() \ > do { \ > + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested > interrupts\n"); \ > preempt_count_add(HARDIRQ_OFFSET); \ > lockdep_hardirq_enter(); \ > } while (0) > I am not a m68k guy, here I can show you some code in cortex-m, which supports full interrupt priority preemption in hardware, but its arch code just disables it to satisfy Linux: arch/arm/kernel/entry-header.S .macro v7m_exception_entry ... @ Linux expects to have irqs off. Do it here before taking stack space cpsid i arch/arm/kernel/entry-v7m.S __irq_entry: v7m_exception_entry @ @ Invoke the IRQ handler @ ... @ routine called with r0 = irq number, r1 = struct pt_regs * bl nvic_handle_irq pop {lr} Another example in arch/arc/kernel/entry-arcv2.S: ENTRY(handle_interrupt) INTERRUPT_PROLOGUE # irq control APIs local_irq_save/restore/disable/enable fiddle with # global interrupt enable bits in STATUS32 (.IE for 1 prio, .E[] for 2 prio) # However a taken interrupt doesn't clear these bits. Thus irqs_disabled() # query in hard ISR path would return false (since .IE is set) which would # trips genirq interrupt handling asserts. # # So do a "soft" disable of interrutps here. # # Note this disable is only for consistent book-keeping as further interrupts # will be disabled anyways even w/o this. Hardware tracks active interrupts # seperately in AUX_IRQ_ACT.active and will not take new interrupts # unless this one returns (or higher prio becomes pending in 2-prio scheme) IRQ_DISABLE ; icause is banked: one per priority level ; so a higher prio interrupt taken here won't clobber prev prio icause lr r0, [ICAUSE] mov blink, ret_from_exception b.d arch_do_IRQ mov r1, sp Actually in m68k, I also saw its IRQ entry disabled interrupts by ' move #0x2700,%sr /* disable intrs */' arch/m68k/include/asm/entry.h: .macro SAVE_ALL_SYS move #0x2700,%sr /* disable intrs */ btst #5,%sp@(2) /* from user? */ bnes 6f /* no, skip */ movel %sp,sw_usp /* save user sp */ ... .macro SAVE_ALL_INT SAVE_ALL_SYS moveq #-1,%d0 /* not system call entry */ movel %d0,%sp@(PT_OFF_ORIG_D0) .endm arch/m68k/kernel/entry.S: /* This is the main interrupt handler for autovector interrupts */ ENTRY(auto_inthandler) SAVE_ALL_INT GET_CURRENT(%d0) | put exception # in d0 bfextu %sp@(PT_OFF_FORMATVEC){#4,#10},%d0 subw #VEC_SPUR,%d0 movel %sp,%sp@- movel %d0,%sp@- | put vector # on stack auto_irqhandler_fixup = . + 2 jsr do_IRQ | process the IRQ addql #8,%sp | pop parameters off stack jra ret_from_exception So my question is that " move #0x2700,%sr" is actually disabling all interrupts? And is m68k actually running irq handlers with interrupts disabled? Best Regards Barry