On Mon, 27 Mar 2017 07:15:18 -0700 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Mon, Mar 27, 2017 at 02:39:47PM +0200, Jesper Dangaard Brouer wrote:
> >
> > +static __always_inline int in_irq_or_nmi(void)
> > +{
> > +	return in_irq() || in_nmi();
> > +// XXX: hoping compiler will optimize this (todo verify) into:
> > +// #define in_irq_or_nmi()	(preempt_count() & (HARDIRQ_MASK | NMI_MASK))
> > +
> > +	/* compiler was smart enough to only read __preempt_count once
> > +	 * but added two branches
> > +asm code:
> > + │       mov    __preempt_count,%eax
> > + │       test   $0xf0000,%eax    // HARDIRQ_MASK: 0x000f0000
> > + │    ┌──jne    2a
> > + │    │  test   $0x100000,%eax   // NMI_MASK: 0x00100000
> > + │    │↓ je     3f
> > + │ 2a:└─→mov    %rbx,%rdi
> > +
> > + */
> > +}
>
> To be fair, you told the compiler to do that with your use of fancy-pants ||
> instead of optimisable |.  Try this instead:

Thank you! -- good point! :-)

> static __always_inline int in_irq_or_nmi(void)
> {
> 	return in_irq() | in_nmi();
> }
>
> 0000000000001770 <test_fn>:
>     1770:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax        # 1777 <test_fn+0x7>
>     			1773: R_X86_64_PC32	__preempt_count-0x4
> #define in_nmi()	(preempt_count() & NMI_MASK)
> #define in_task()	(!(preempt_count() & \
> 				   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
> static __always_inline int in_irq_or_nmi(void)
> {
> 	return in_irq() | in_nmi();
>     1777:	25 00 00 1f 00       	and    $0x1f0000,%eax
> }
>     177c:	c3                   	retq
>     177d:	0f 1f 00             	nopl   (%rax)

And I also verified it worked:

  0.63 │       mov    __preempt_count,%eax
       │     free_hot_cold_page():
  1.25 │       test   $0x1f0000,%eax
       │     ↓ jne    1e4

And this simplification also made the compiler change this into an
unlikely branch, which is a micro-optimization (that I will leave up
to the compiler).

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
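
P.S. For anyone who wants to reproduce the codegen difference outside the
kernel tree, below is a minimal userspace sketch. The preempt_count() stub,
the main() harness and the two function names are only illustrative
scaffolding of mine, not kernel code; the mask values are the ones quoted
above.

/*
 * Userspace sketch only -- fake_preempt_count/preempt_count() and main()
 * are made-up scaffolding; mask values match the ones quoted in the mail.
 */
#include <stdio.h>

#define HARDIRQ_MASK	0x000f0000U
#define NMI_MASK	0x00100000U

static unsigned int fake_preempt_count;	/* stand-in for __preempt_count */

static inline unsigned int preempt_count(void)
{
	return fake_preempt_count;
}

#define in_irq()	(preempt_count() & HARDIRQ_MASK)
#define in_nmi()	(preempt_count() & NMI_MASK)

/* Logical ||: short-circuit evaluation, compiler emits two test+branch pairs */
static inline int in_irq_or_nmi_logical(void)
{
	return in_irq() || in_nmi();
}

/* Bitwise |: both masks can be folded into a single "and $0x1f0000" test */
static inline int in_irq_or_nmi_bitwise(void)
{
	return in_irq() | in_nmi();
}

int main(void)
{
	fake_preempt_count = 0x00010000U;	/* pretend we are in hardirq context */

	/* || yields 0/1, | yields the raw mask bits; both are truthy the same way */
	printf("logical: %d  bitwise: %d\n",
	       !!in_irq_or_nmi_logical(), !!in_irq_or_nmi_bitwise());
	return 0;
}

Compiling this with gcc -O2 -S and comparing the two functions should show
the || variant testing HARDIRQ_MASK and NMI_MASK with separate branches,
while the | variant collapses both into one and against 0x1f0000, as in the
objdump output above.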