On 12/07/2016 12:47 AM, Marc Glisse wrote:
On Tue, 6 Dec 2016, Avi Kivity wrote:
Consider the following code:
=== begin code ===
#include <experimental/optional>
using namespace std::experimental;
struct array_of_optional {
    optional<int> v[100];
};
array_of_optional
f(const array_of_optional& a) {
    return a;
}
=== end code ===
Compiling with gcc 6.2.1 at -O3, I get:
0000000000000000 <f(array_of_optional const&)>:
0: 48 8d 8f 20 03 00 00 lea 0x320(%rdi),%rcx
7: 48 89 f8 mov %rdi,%rax
a: 48 89 fa mov %rdi,%rdx
d: 0f 1f 00 nopl (%rax)
10: c6 42 04 00 movb $0x0,0x4(%rdx)
14: 80 7e 04 00 cmpb $0x0,0x4(%rsi)
18: 74 0a je 24 <f(array_of_optional const&)+0x24>
1a: 44 8b 06 mov (%rsi),%r8d
1d: c6 42 04 01 movb $0x1,0x4(%rdx)
21: 44 89 02 mov %r8d,(%rdx)
24: 48 83 c2 08 add $0x8,%rdx
28: 48 83 c6 08 add $0x8,%rsi
2c: 48 39 ca cmp %rcx,%rdx
2f: 75 df jne 10 <f(array_of_optional const&)+0x10>
31: f3 c3 repz retq
For high-level optimizations, I find it better to look at the file
created by compiling with -fdump-tree-optimized.
I guess you have to read a few of them to get a feel for it.
However, because we're constructing into the return value, we're
under no obligation to leave the memory untouched, so this can be
optimized into a memcpy, which can be significantly faster if the
optionals are randomly engaged; but gcc doesn't notice that.
Feel free to file an enhancement PR in gcc's bugzilla. The easiest is
probably to handle it in libstdc++ in the copy constructor, under some
conditions (trivially copy constructible and not too large). But some
tools might complain about the read from uninitialized memory, even if
it is safe.
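To make the copy-constructor idea concrete, here is a minimal sketch, assuming a hand-written stand-in for optional. The names simple_optional, array_of_simple_optional and copy_array are hypothetical, and this is not how libstdc++ implements it. The copy constructor copies the payload bytes and the flag unconditionally, so there is no branch on the engaged flag. The sketch zeroes the storage on construction so that the copy never reads indeterminate bytes; a real implementation would probably leave it uninitialized, which is the case checking tools might complain about.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for optional<T>; NOT the libstdc++ implementation.
template <typename T>
class simple_optional {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable T");

    alignas(T) unsigned char storage_[sizeof(T)];
    bool engaged_;

public:
    // Assumption of this sketch: zero the storage so the unconditional
    // copy below never reads indeterminate bytes.
    simple_optional() : storage_{}, engaged_(false) {}

    simple_optional(const T& v) : storage_{}, engaged_(true) {
        std::memcpy(storage_, &v, sizeof(T));
    }

    // Copy payload and flag unconditionally: no test of engaged_.
    simple_optional(const simple_optional& other) : engaged_(other.engaged_) {
        std::memcpy(storage_, other.storage_, sizeof(T));
    }

    simple_optional& operator=(const simple_optional&) = default;

    bool has_value() const { return engaged_; }

    // Also assumes T is default constructible, to keep the sketch short.
    T value() const {
        assert(engaged_);
        T v;
        std::memcpy(&v, storage_, sizeof(T));
        return v;
    }
};

struct array_of_simple_optional {
    simple_optional<int> v[100];
};

// Same shape as f() in the original example.
array_of_simple_optional copy_array(const array_of_simple_optional& a) {
    return a;  // the implicit copy constructor copies element by element
}
```

Whether the compiler then merges the per-element copies into one large block copy is up to the optimizer; the sketch only removes the per-element branch.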
I think this is too fragile. For example, optional<optional<int>> would
not benefit from the optimization.
Optimizers could turn
    out.engaged = 0;
    if (in.engaged)
        out.engaged = 1;
into
    out.engaged = in.engaged;
but the condition would still be there for the copy of the value, and I
don't see the optimizers introducing the extra reads/writes on their
own; that seems unlikely to be added.
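The three forms discussed above can be written out side by side. This is only a sketch under stated assumptions: opt_int is a hypothetical flattened stand-in for optional<int>, and its value member is assumed to always hold an initialized int, so the unconditional read is well defined here.

```cpp
#include <cassert>

// Hypothetical flattened stand-in for optional<int>.
struct opt_int {
    int value;
    bool engaged;
};

// The pattern in the generated loop: clear, test, conditionally copy.
inline void copy_branchy(opt_int& out, const opt_int& in) {
    out.engaged = false;
    if (in.engaged) {
        out.value = in.value;
        out.engaged = true;
    }
}

// Flag folded into a plain copy; the payload store is still conditional.
inline void copy_folded_flag(opt_int& out, const opt_int& in) {
    out.engaged = in.engaged;
    if (in.engaged)
        out.value = in.value;
}

// No branch at all, at the cost of an extra read and write when the
// source is disengaged.
inline void copy_unconditional(opt_int& out, const opt_int& in) {
    out.value = in.value;
    out.engaged = in.engaged;
}

// Two results are equivalent if the flags match and, when engaged,
// the values match. The payload of a disengaged result is ignored.
inline bool same_observable(const opt_int& a, const opt_int& b) {
    return a.engaged == b.engaged && (!a.engaged || a.value == b.value);
}

// Runs all three variants on the same input and checks that they agree.
inline void check_variants_agree(const opt_int& in) {
    opt_int a{99, true}, b{99, true}, c{99, true};
    copy_branchy(a, in);
    copy_folded_flag(b, in);
    copy_unconditional(c, in);
    assert(same_observable(a, b));
    assert(same_observable(a, c));
}
```

The variants differ only in what they leave in the payload of a disengaged destination, which is exactly the memory the caller is not entitled to look at.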
That's a pity, because the extra writes would make it much faster.
The optimizers do feel free to write to padding holes, no? Clobbered
memory could be treated as a padding hole.