Re: Using bt,bts

On Wed, Sep 26, 2012 at 04:20:52PM -0700, Ian Lance Taylor wrote:
> On Wed, Sep 26, 2012 at 10:34 AM, Ondřej Bílka <neleai@xxxxxxxxx> wrote:
> 
> > is there a reason why for example
> > x=x|(1<<11);
> > is not expanded into
> > bts rax,11
> > ?
> 
> The bts instruction is never faster than the corresponding or
> instruction.  There's no reason to use it when setting a bit in the
> low 32 bits.
> 
> Ian
The following benchmark says otherwise. On Ivy Bridge the bts variant is
twice as fast as the or variant.

I used

 for(i=0;i<100000000;i++)
    x=x|(1<<i);

implemented as 

.globl main
  .type main, @function
main:
.LFB0:
  .cfi_startproc
  xorl  %eax, %eax
  xorl  %ecx, %ecx
  movl  $1, %edx
  .p2align 4,,10
  .p2align 3
.L2:
  bts %ecx, %edx          # set bit (%ecx mod 32) of %edx
  addl  $1, %ecx          # i++
  cmpl  $100000000, %ecx
  jne .L2
  rep
  ret
.cfi_endproc

and

.globl main
  .type main, @function
main:
.LFB0:
  .cfi_startproc
  xorl  %eax, %eax
  xorl  %ecx, %ecx
  movl  $1, %edx
  .p2align 4,,10
  .p2align 3
.L2:
  movl  %edx, %esi        # %esi = 1
  sall  %cl, %esi         # %esi = 1 << (i mod 32)
  addl  $1, %ecx          # i++
  orl %esi, %eax          # x |= %esi
  cmpl  $100000000, %ecx
  jne .L2
  rep
  ret
.cfi_endproc
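
For reference, here is a rough portable C sketch of what both loops
compute. The function names and the explicit "i & 31" mask are my own
additions, not part of the benchmark: in C, 1<<i is undefined once i
reaches 32, whereas sall with a %cl count and bts with a register
destination both use the bit index modulo 32, so the mask only mirrors
what the hardware already does.

```c
#include <stdint.h>

/* Set bit (i mod 32) of x with shift-and-or. The mask is an
   illustrative assumption that mirrors x86 behaviour for 32-bit
   register operands; it is not in the original loop. */
static uint32_t set_bit(uint32_t x, uint32_t i)
{
    return x | (UINT32_C(1) << (i & 31u));
}

/* Apply the benchmark loop body n times, starting from x = 0. */
static uint32_t set_bits_loop(uint32_t n)
{
    uint32_t x = 0;
    for (uint32_t i = 0; i < n; i++)
        x = set_bit(x, i);
    return x;
}
```

Note that an optimizing compiler is free to fold a loop like this away,
which is presumably why the timing was done on hand-written assembly.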




