On 08/18/2016 04:41 PM, Andrew Haley wrote:
On 18/08/16 14:35, Avi Kivity wrote:
On 08/18/2016 03:33 PM, Manuel López-Ibáñez wrote:
On 18/08/16 09:45, Avi Kivity wrote:
I wanted to test how restrict helps in code generation. I started
with this example:
There are quite a number of missed optimizations with restrict:
https://gcc.gnu.org/PR49774
If your issue is not in that list, you may wish to open a new PR.
Here, the missed optimizations started even before I started playing
with restrict. But I'll see if I need to file the last one.
I haven't been able to duplicate this behaviour on any GCC to which I
have access.
I replicated it on Fedora 24's gcc; in fact -O2 code is worse (but -O3
is fine):
$ gcc --version
gcc (GCC) 6.1.1 20160621 (Red Hat 6.1.1-3)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ cat restrict.cc
struct s { int a; int b; };

inline
void encode(int a, char* p) {
    for (unsigned i = 0; i < sizeof(a); ++i) {
        p[i] = reinterpret_cast<const char*>(&a)[i];
    }
}

void f(s* x, char* p) {
    encode(x->a, p + 0);
    encode(x->b, p + 4);
}
$ g++ -O2 -c restrict.cc
$ objdump -SrC restrict.o
restrict.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f(s*, char*)>:
   0:   8b 07                   mov    (%rdi),%eax
   2:   89 44 24 fc             mov    %eax,-0x4(%rsp)
   6:   31 c0                   xor    %eax,%eax
   8:   0f b6 54 04 fc          movzbl -0x4(%rsp,%rax,1),%edx
   d:   88 14 06                mov    %dl,(%rsi,%rax,1)
  10:   48 83 c0 01             add    $0x1,%rax
  14:   48 83 f8 04             cmp    $0x4,%rax
  18:   75 ee                   jne    8 <f(s*, char*)+0x8>
  1a:   8b 47 04                mov    0x4(%rdi),%eax
  1d:   89 c1                   mov    %eax,%ecx
  1f:   88 46 04                mov    %al,0x4(%rsi)
  22:   66 c1 e9 08             shr    $0x8,%cx
  26:   88 4e 05                mov    %cl,0x5(%rsi)
  29:   89 c1                   mov    %eax,%ecx
  2b:   c1 e8 18                shr    $0x18,%eax
  2e:   c1 e9 10                shr    $0x10,%ecx
  31:   88 46 07                mov    %al,0x7(%rsi)
  34:   88 4e 06                mov    %cl,0x6(%rsi)
  37:   c3                      retq
$ g++ -O3 -c restrict.cc
$ objdump -SrC restrict.o
restrict.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f(s*, char*)>:
   0:   8b 07                   mov    (%rdi),%eax
   2:   89 06                   mov    %eax,(%rsi)
   4:   8b 47 04                mov    0x4(%rdi),%eax
   7:   89 46 04                mov    %eax,0x4(%rsi)
   a:   c3                      retq
Adding __restrict did not improve -O3 code generation.