.p2align

Hi,

The other day I wrote a few routines in assembler (using the Win64 calling
convention). It was really more like writing the code in C, compiling
it with gcc, running `objdump -D a.out | less`, then taking the code and
making the necessary changes (saving/restoring %rdi and %rsi on entry/exit).
All was great. Still, in my search for speed I noticed that gcc generated
a lot of stuff like:

  ...
  .data 16
  .data 16
  nop
  nop
  ...

which is the result of ".p2align 4,,15" (on the net, apparently this is, and
I quote, "like a 'turbo' switch on some benchmarks"). I said to myself "good
to know" and made the necessary changes in my "*.S" files.
Indeed, what used to be nasty unaligned code is now nicely placed on a
16-byte boundary. However, to my disappointment, this did not make the code
run faster :(. Au contraire, it made it run slower. So why is gcc using it?
Or am I missing something?
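
For reference, here is a rough sketch of the padding arithmetic behind
".p2align 4,,15", based on my reading of the GNU as documentation: the first
argument is the log2 of the alignment, the (omitted) second is the fill value,
and the third is the maximum number of bytes the directive may skip. The
function name p2align_padding and the sample addresses are made up for
illustration; this models the directive and is not code from gas itself.

```python
def p2align_padding(address, log2_alignment, max_skip=None):
    """Model how many padding bytes `.p2align log2_alignment,,max_skip` emits.

    `address` is the current location counter. If reaching the boundary
    would need more than `max_skip` bytes, no padding is emitted at all.
    """
    alignment = 1 << log2_alignment
    # Distance from `address` up to the next multiple of `alignment`.
    padding = (-address) % alignment
    if max_skip is not None and padding > max_skip:
        return 0
    return padding


if __name__ == "__main__":
    # Hypothetical location counters, chosen only to show the behaviour.
    for addr in (0x1000, 0x1003, 0x100f):
        print(hex(addr), p2align_padding(addr, 4, 15))
```

With a 16-byte alignment and a maximum skip of 15 the limit never triggers, so
every such label is always aligned, at the price of up to 15 bytes of filler
in front of it each time.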

I've tested this on an AMD64 (Turion @ 2.2GHz) machine.

-- 
Mihai Donțu
