Re: simple optimisation question

On 2024-04-10 01:26, zamfofex wrote:
The flags I tested were ‘-O3’ vs. ‘-Oz’ and ‘-m32’ vs. none. (Four combinations per compiler.)

In GCC, the assembly code under ‘-m32 -Oz’, although different, was of the same size (in bytes, once assembled) for both functions. For ‘-Oz’ without ‘-m32’, the first one was larger.

The first piece of code involves two sign-extension operations, as in

   char* p = (char*) x;
   return *(p + (ptrdiff_t) i * 48 + (ptrdiff_t) j * 4);

and the second one involves only one, as in

   char* p = (char*) x;
   return *(p + (ptrdiff_t) (i * 48 + j * 4));
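
For reference, here is a minimal C sketch of two functions that would plausibly lead to these two address computations. It is a guess at the shape of the code under discussion, not the code itself: the names `get_2d` and `get_flat`, the row length of 12, and a 4-byte `int` (so that one row is 12 * 4 = 48 bytes) are all assumptions.

```c
#include <stddef.h>

/* Hypothetical reconstruction: x points to rows of 12 ints.
   With a 4-byte int, each row is 48 bytes. */
int get_2d(int (*x)[12], int i, int j)
{
    /* i and j each take part in pointer arithmetic on their own,
       so each is widened to pointer width separately. */
    return x[i][j];
}

/* Hypothetical reconstruction: the same data seen as a flat array. */
int get_flat(const int *x, int i, int j)
{
    /* The whole subscript is computed in int first, and only the
       result is widened to pointer width. */
    return x[i * 12 + j];
}
```

Building this with and without ‘-m32’ and comparing the output of ‘-S’ should show whether the widening instructions differ on a given compiler version.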


For -m32 the assembly differs a little, but as far as I can tell there is almost no difference.


Is this a missed size optimisation for x86-64? Even in the case where the assembly code is larger, the time performance difference seems unobservable. (Though I’d have imagined the larger one would have been slower in each case.)

Maybe. My suggestion is to avoid `int` subscripts on x86-64, since they involve unnecessary sign-extensions.
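
A minimal sketch of that suggestion, assuming the usual x86-64 ABIs where `size_t` and `ptrdiff_t` are 64 bits wide. The helpers `sum_ints` and `get_wide` and the row length of 12 are hypothetical names and values chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints using a pointer-width index. */
long long sum_ints(const int *x, size_t n)
{
    long long total = 0;
    /* k already has pointer width, so x[k] needs no widening of
       the index before the address is formed. */
    for (size_t k = 0; k < n; k++)
        total += x[k];
    return total;
}

/* Hypothetical helper: row/column lookup with ptrdiff_t subscripts
   (a row length of 12 is assumed). */
int get_wide(const int *x, ptrdiff_t i, ptrdiff_t j)
{
    /* The multiplication and addition are done in ptrdiff_t, so the
       result can be used for addressing as it is. */
    return x[i * 12 + j];
}
```

Whether this actually shrinks the code depends on the compiler and the callers: if a caller only has an `int` at hand, the widening simply moves to the call site.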

And it's not always the case that larger ones are slower. Intel CPUs recognize a lot of patterns to break dependencies (such as `xor eax, eax`, and similarly `xorps xmm0, xmm0`), which may make larger code faster.


--
Best regards,
LIU Hao
