Re: Missed optimization wrt. constructor clobbers?

On 12/07/2016 12:47 AM, Marc Glisse wrote:
On Tue, 6 Dec 2016, Avi Kivity wrote:

Consider the following code


=== begin code ===

#include <experimental/optional>

using namespace std::experimental;

struct array_of_optional {
 optional<int> v[100];
};

array_of_optional
f(const array_of_optional& a) {
 return a;
}

=== end code ===


Compiling with -O3 (6.2.1), I get:


0000000000000000 <f(array_of_optional const&)>:
  0:    48 8d 8f 20 03 00 00     lea    0x320(%rdi),%rcx
  7:    48 89 f8                 mov    %rdi,%rax
  a:    48 89 fa                 mov    %rdi,%rdx
  d:    0f 1f 00                 nopl   (%rax)
 10:    c6 42 04 00              movb   $0x0,0x4(%rdx)
 14:    80 7e 04 00              cmpb   $0x0,0x4(%rsi)
 18:    74 0a                    je     24 <f(array_of_optional const&)+0x24>
 1a:    44 8b 06                 mov    (%rsi),%r8d
 1d:    c6 42 04 01              movb   $0x1,0x4(%rdx)
 21:    44 89 02                 mov    %r8d,(%rdx)
 24:    48 83 c2 08              add    $0x8,%rdx
 28:    48 83 c6 08              add    $0x8,%rsi
 2c:    48 39 ca                 cmp    %rcx,%rdx
 2f:    75 df                    jne    10 <f(array_of_optional const&)+0x10>
 31:    f3 c3                    repz retq

For high-level optimizations, I find it better to look at the file created by compiling with -fdump-tree-optimized.


I guess you have to read a few of them to get a feel for it.

However, because we're constructing into the return value, we're under no obligation to leave the memory untouched, so this can be optimized into a memcpy, which can be significantly faster if the optionals are randomly engaged; but gcc doesn't notice that.
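To make the comparison concrete, here is a minimal sketch of the two copy strategies. It does not use std::experimental::optional; opt_int and array_of_opt are hypothetical stand-ins with the same 8-byte layout seen in the disassembly (payload at offset 0, flag at offset 4), and the function names are made up for illustration.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for optional<int>: 4-byte payload plus a flag.
struct opt_int {
    int value;
    bool engaged;
};

struct array_of_opt {
    opt_int v[100];
};

static_assert(std::is_trivially_copyable<array_of_opt>::value,
              "memcpy is only valid for trivially copyable types");

// What the generated loop does: clear the flag, then copy the payload
// only when the source element is engaged.
void copy_elementwise(array_of_opt& out, const array_of_opt& in) {
    for (int i = 0; i < 100; ++i) {
        out.v[i].engaged = false;
        if (in.v[i].engaged) {
            out.v[i].value = in.v[i].value;
            out.v[i].engaged = true;
        }
    }
}

// The proposed alternative: copy every byte, including payloads of
// disengaged elements and padding, with no data-dependent branch.
void copy_bulk(array_of_opt& out, const array_of_opt& in) {
    std::memcpy(&out, &in, sizeof in);
}

// Two copies are equivalent if the flags match and the payloads of
// engaged elements match; disengaged payloads are not observable.
bool same_observable_state(const array_of_opt& a, const array_of_opt& b) {
    for (int i = 0; i < 100; ++i) {
        if (a.v[i].engaged != b.v[i].engaged) return false;
        if (a.v[i].engaged && a.v[i].value != b.v[i].value) return false;
    }
    return true;
}
```

Both functions leave the destination in the same observable state. The bulk version differs only in also writing bytes the element-wise version leaves alone, which is harmless when the destination is a freshly constructed return value.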

Feel free to file an enhancement PR in gcc's bugzilla. The easiest is probably to handle it in libstdc++ in the copy constructor, under some conditions (trivially copy constructible and not too large). But some tools might complain about the read from uninitialized memory, even if it is safe.
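A rough sketch of what such a library-side special case might look like. This is not libstdc++'s actual code: tiny_optional, max_blind_copy, uses_blind_copy and copy_from are hypothetical names, and the 16-byte cutoff for "not too large" is an assumption, since no value is given here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical, simplified optional; not libstdc++'s implementation.
template <typename T>
class tiny_optional {
    alignas(T) unsigned char storage_[sizeof(T)];
    bool engaged_ = false;

    // Assumed cutoff for "not too large".
    static constexpr std::size_t max_blind_copy = 16;

    using blind = std::integral_constant<bool,
        std::is_trivially_copy_constructible<T>::value &&
        sizeof(T) <= max_blind_copy>;

    const T* ptr() const { return reinterpret_cast<const T*>(storage_); }
    T* ptr() { return reinterpret_cast<T*>(storage_); }

    // Fast path: copy the payload bytes unconditionally, then the flag.
    // This reads the source payload even when it was never initialized.
    void copy_from(const tiny_optional& o, std::true_type) {
        std::memcpy(storage_, o.storage_, sizeof(T));
        engaged_ = o.engaged_;
    }

    // General path: construct the payload only if the source is engaged.
    void copy_from(const tiny_optional& o, std::false_type) {
        if (o.engaged_) {
            ::new (static_cast<void*>(storage_)) T(*o.ptr());
            engaged_ = true;
        }
    }

public:
    // True when the copy constructor takes the unconditional path.
    static constexpr bool uses_blind_copy = blind::value;

    tiny_optional() = default;
    tiny_optional(const T& v) : engaged_(true) {
        ::new (static_cast<void*>(storage_)) T(v);
    }
    tiny_optional(const tiny_optional& o) { copy_from(o, blind{}); }
    tiny_optional& operator=(const tiny_optional&) = delete;
    ~tiny_optional() { if (engaged_) ptr()->~T(); }

    bool has_value() const { return engaged_; }
    const T& value() const { assert(engaged_); return *ptr(); }
};
```

The fast path is where a memory checker could flag the read of an uninitialized payload, even though the copied bytes are never interpreted unless the flag is set.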

I think this is too fragile. For example, optional<optional<int>> would not benefit from the optimization.


Optimizers could turn

out.engaged=0
if(in.engaged)
  out.engaged=1

into out.engaged=in.engaged

but the condition guarding the payload copy would still be there, and I don't see the optimizers introducing the extra reads/writes needed to remove it; that seems unlikely to be added.
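Spelled out for a single element, the three shapes under discussion look roughly like this. opt_int and the function names are hypothetical stand-ins, not the real optional<int> internals.

```cpp
// Hypothetical stand-in for optional<int>.
struct opt_int {
    int value;
    bool engaged;
};

// Shape 1: what the copy constructor expands to today.
void copy_branchy(opt_int& out, const opt_int& in) {
    out.engaged = false;
    if (in.engaged) {
        out.value = in.value;
        out.engaged = true;
    }
}

// Shape 2: the flag stores are folded into one, but the payload copy
// is still guarded by the condition.
void copy_flag_folded(opt_int& out, const opt_int& in) {
    out.engaged = in.engaged;
    if (in.engaged)
        out.value = in.value;
}

// Shape 3: fully branch-free. Compared with shape 1 it adds a read of
// in.value and a write of out.value in the disengaged case, which is
// the step the optimizers are not expected to invent on their own.
void copy_branch_free(opt_int& out, const opt_int& in) {
    out.value = in.value;
    out.engaged = in.engaged;
}

// Observable equality: flags match, and payloads match when engaged.
bool equivalent(const opt_int& a, const opt_int& b) {
    return a.engaged == b.engaged && (!a.engaged || a.value == b.value);
}
```

Only shape 3 applied across the whole array is the same thing as a memcpy. Getting there from shape 2 requires treating the disengaged payload as memory that is free to read and overwrite.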


That's a pity, because the extra writes would make it much faster.

The optimizers do feel free to write to padding holes, no? Clobbered memory could be treated as a padding hole.


