Re: Surprisingly bad code generated near char*

On 08/18/2016 04:41 PM, Andrew Haley wrote:
> On 18/08/16 14:35, Avi Kivity wrote:
>> On 08/18/2016 03:33 PM, Manuel López-Ibáñez wrote:
>>> On 18/08/16 09:45, Avi Kivity wrote:
>>>> I wanted to test how restrict helps in code generation.  I started
>>>> with this example:
>>> There are quite a few missed optimizations with restrict:
>>> https://gcc.gnu.org/PR49774
>>>
>>> If your issue is not in that list, you may wish to open a new PR.
>> Here, the missed optimizations started even before I started playing
>> with restrict.  But I'll see if I need to file the last one.
> I haven't been able to duplicate this behaviour on any GCC to which I
> have access.


I reproduced it with Fedora 24's gcc; in fact, the -O2 code is worse (but -O3 is fine):

$ gcc --version
gcc (GCC) 6.1.1 20160621 (Red Hat 6.1.1-3)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ cat restrict.cc
  struct s { int a; int b; };

  inline
  void encode(int a, char* p) {
    for (unsigned i = 0; i < sizeof(a); ++i) {
      p[i] = reinterpret_cast<const char*>(&a)[i];
    }
  }

  void f(s* x, char* p) {
    encode(x->a, p + 0);
    encode(x->b, p + 4);
  }

$ g++ -O2 -c restrict.cc

$ objdump -SrC restrict.o

restrict.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <f(s*, char*)>:
   0:    8b 07                    mov    (%rdi),%eax
   2:    89 44 24 fc              mov    %eax,-0x4(%rsp)
   6:    31 c0                    xor    %eax,%eax
   8:    0f b6 54 04 fc           movzbl -0x4(%rsp,%rax,1),%edx
   d:    88 14 06                 mov    %dl,(%rsi,%rax,1)
  10:    48 83 c0 01              add    $0x1,%rax
  14:    48 83 f8 04              cmp    $0x4,%rax
  18:    75 ee                    jne    8 <f(s*, char*)+0x8>
  1a:    8b 47 04                 mov    0x4(%rdi),%eax
  1d:    89 c1                    mov    %eax,%ecx
  1f:    88 46 04                 mov    %al,0x4(%rsi)
  22:    66 c1 e9 08              shr    $0x8,%cx
  26:    88 4e 05                 mov    %cl,0x5(%rsi)
  29:    89 c1                    mov    %eax,%ecx
  2b:    c1 e8 18                 shr    $0x18,%eax
  2e:    c1 e9 10                 shr    $0x10,%ecx
  31:    88 46 07                 mov    %al,0x7(%rsi)
  34:    88 4e 06                 mov    %cl,0x6(%rsi)
  37:    c3                       retq


$ g++ -O3 -c restrict.cc

$ objdump -SrC restrict.o

restrict.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <f(s*, char*)>:
   0:    8b 07                    mov    (%rdi),%eax
   2:    89 06                    mov    %eax,(%rsi)
   4:    8b 47 04                 mov    0x4(%rdi),%eax
   7:    89 46 04                 mov    %eax,0x4(%rsi)
   a:    c3                       retq


Adding __restrict did not improve -O3 code generation.
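
For reference, here is a rough sketch of what a __restrict variant of the test case could look like. The message does not show where the qualifiers were placed, so the placement below is an assumption, and the names encode_restrict, f_restrict, encode_memcpy and f_memcpy are made up for illustration. The memcpy version is included only as a comparison point: GCC usually lowers a fixed-size 4-byte memcpy to a single load and store, which matches the shape of the -O3 output above, but that should be checked on the compiler at hand.

```cpp
#include <cassert>
#include <cstring>

struct s { int a; int b; };

// Same byte-copy loop as in restrict.cc, with the destination pointer
// qualified by GCC's __restrict extension.  The exact placement of
// __restrict is an assumption; the original variant was not posted.
inline void encode_restrict(int a, char* __restrict p) {
  for (unsigned i = 0; i < sizeof(a); ++i) {
    p[i] = reinterpret_cast<const char*>(&a)[i];
  }
}

void f_restrict(s* __restrict x, char* __restrict p) {
  encode_restrict(x->a, p + 0);
  encode_restrict(x->b, p + sizeof(int));
}

// Hypothetical alternative: express the copy with memcpy instead of a
// byte loop, so the compiler sees one fixed-size copy per field.
inline void encode_memcpy(int a, char* p) {
  std::memcpy(p, &a, sizeof(a));
}

void f_memcpy(s* x, char* p) {
  encode_memcpy(x->a, p + 0);
  encode_memcpy(x->b, p + sizeof(int));
}
```

Both variants write the same eight bytes as the original f(); only the generated code may differ, so comparing the objdump output of each at -O2 and -O3 is the way to see whether either form helps.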


