Re: Page allocator order-0 optimizations merged

On Mon, 27 Mar 2017 07:15:18 -0700
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Mon, Mar 27, 2017 at 02:39:47PM +0200, Jesper Dangaard Brouer wrote:
> >  
> > +static __always_inline int in_irq_or_nmi(void)
> > +{
> > +	return in_irq() || in_nmi();
> > +// XXX: hoping compiler will optimize this (todo verify) into:
> > +// #define in_irq_or_nmi()	(preempt_count() & (HARDIRQ_MASK | NMI_MASK))
> > +
> > +	/* compiler was smart enough to only read __preempt_count once
> > +	 * but added two branches
> > +asm code:
> > + │       mov    __preempt_count,%eax
> > + │       test   $0xf0000,%eax    // HARDIRQ_MASK: 0x000f0000
> > + │    ┌──jne    2a
> > + │    │  test   $0x100000,%eax   // NMI_MASK:     0x00100000
> > + │    │↓ je     3f
> > + │ 2a:└─→mov    %rbx,%rdi
> > +
> > +	 */
> > +}  
> 
> To be fair, you told the compiler to do that with your use of fancy-pants ||
> instead of optimisable |.  Try this instead:

Thank you! -- good point! :-)

> static __always_inline int in_irq_or_nmi(void)
> {
> 	return in_irq() | in_nmi();
> }
> 
> 0000000000001770 <test_fn>:
>     1770:       65 8b 05 00 00 00 00    mov    %gs:0x0(%rip),%eax        # 1777 <test_fn+0x7>
>                         1773: R_X86_64_PC32     __preempt_count-0x4
> #define in_nmi()                (preempt_count() & NMI_MASK)
> #define in_task()               (!(preempt_count() & \
>                                    (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
> static __always_inline int in_irq_or_nmi(void)
> {
>         return in_irq() | in_nmi();
>     1777:       25 00 00 1f 00          and    $0x1f0000,%eax
> }
>     177c:       c3                      retq   
>     177d:       0f 1f 00                nopl   (%rax)

And I also verified it worked:

  0.63 │       mov    __preempt_count,%eax
       │     free_hot_cold_page():
  1.25 │       test   $0x1f0000,%eax
       │     ↓ jne    1e4

This simplification also made the compiler turn the check into an
unlikely branch, which is a micro-optimization (one that I will leave
up to the compiler).
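
For readers who want to try the two variants outside the kernel, here is
a minimal userspace sketch. It assumes the mask values shown in the
disassembly comments quoted above (HARDIRQ_MASK 0x000f0000, NMI_MASK
0x00100000). The variable fake_preempt_count and the two function names
are hypothetical stand-ins, not kernel API. Whether a given compiler
folds the bitwise form into a single test is worth checking in its
output rather than assuming.

```c
#include <stdint.h>

/* Mask values as quoted in the disassembly comments in this thread. */
#define HARDIRQ_MASK 0x000f0000u
#define NMI_MASK     0x00100000u

/* Hypothetical stand-in for the kernel's per-CPU __preempt_count. */
static uint32_t fake_preempt_count;

static inline uint32_t preempt_count(void)
{
	return fake_preempt_count;
}

#define in_irq()	(preempt_count() & HARDIRQ_MASK)
#define in_nmi()	(preempt_count() & NMI_MASK)

/* Logical OR: short-circuit semantics, result normalised to 0 or 1. */
static inline int in_irq_or_nmi_logical(void)
{
	return in_irq() || in_nmi();
}

/*
 * Bitwise OR: no short-circuit, so (x & A) | (x & B) can be folded
 * into x & (A | B), i.e. one test against 0x1f0000. The result is
 * the masked bits themselves, not 0 or 1.
 */
static inline int in_irq_or_nmi_bitwise(void)
{
	return in_irq() | in_nmi();
}
```

Both variants agree on zero versus non-zero, which is all a caller
testing the result in an if () needs. Only the logical form guarantees a
0/1 return value.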

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
