Re: Missed optimization wrt. constructor clobbers?

On Tue, 6 Dec 2016, Avi Kivity wrote:

Consider the following code


=== begin code ===

#include <experimental/optional>

using namespace std::experimental;

struct array_of_optional {
 optional<int> v[100];
};

array_of_optional
f(const array_of_optional& a) {
 return a;
}

=== end code ===


Compiling with -O3 (gcc 6.2.1), I get:


0000000000000000 <f(array_of_optional const&)>:
  0:    48 8d 8f 20 03 00 00     lea    0x320(%rdi),%rcx
  7:    48 89 f8                 mov    %rdi,%rax
  a:    48 89 fa                 mov    %rdi,%rdx
  d:    0f 1f 00                 nopl   (%rax)
 10:    c6 42 04 00              movb   $0x0,0x4(%rdx)
 14:    80 7e 04 00              cmpb   $0x0,0x4(%rsi)
 18:    74 0a                    je     24 <f(array_of_optional const&)+0x24>
 1a:    44 8b 06                 mov    (%rsi),%r8d
 1d:    c6 42 04 01              movb   $0x1,0x4(%rdx)
 21:    44 89 02                 mov    %r8d,(%rdx)
 24:    48 83 c2 08              add    $0x8,%rdx
 28:    48 83 c6 08              add    $0x8,%rsi
 2c:    48 39 ca                 cmp    %rcx,%rdx
 2f:    75 df                    jne    10 <f(array_of_optional const&)+0x10>
 31:    f3 c3                    repz retq

For high-level optimizations, I find it better to look at the file created by compiling with -fdump-tree-optimized.

However, because we're constructing into the return value, we're under no obligation to leave the memory untouched, so this can be optimized into a memcpy, which can be significantly faster if the optionals are randomly engaged; but gcc doesn't notice that.

Feel free to file an enhancement PR in gcc's bugzilla. The easiest approach is probably to handle it in libstdc++, in the copy constructor, under some conditions (the value type is trivially copy constructible and not too large). But some tools might complain about the read from uninitialized memory, even though it is safe.
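
As a rough illustration of the library-side idea, here is a minimal sketch of a copy constructor that copies the payload bytes unconditionally when the value type is trivially copy constructible and small. Everything in it is an assumption made for illustration: `simple_optional`, `copy_from`, `max_bytes` and the 16-byte cutoff are hypothetical names and values, and this is not how libstdc++ implements optional. It also relies on the compiler accepting a memcpy into raw storage as a valid way to create a trivially copyable object.

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for optional<T>; this is not libstdc++'s implementation.
template <typename T>
class simple_optional {
    // Assumed cutoff for "not too large"; a real threshold would need tuning.
    static constexpr std::size_t max_bytes = 16;
    static constexpr bool bytewise =
        std::is_trivially_copy_constructible<T>::value && sizeof(T) <= max_bytes;

    alignas(T) unsigned char storage_[sizeof(T)];
    bool engaged_;

    // Fast path: copy the payload bytes unconditionally, then the flag.
    // When the source is disengaged this copies uninitialized bytes, which
    // is the read that some checking tools might flag.
    void copy_from(const simple_optional& other, std::true_type) {
        std::memcpy(storage_, other.storage_, sizeof(T));
        engaged_ = other.engaged_;
    }

    // General path: copy-construct the payload only if the source is engaged.
    void copy_from(const simple_optional& other, std::false_type) {
        if (other.engaged_) {
            ::new (static_cast<void*>(storage_)) T(other.value());
            engaged_ = true;
        }
    }

public:
    simple_optional() : engaged_(false) {}

    simple_optional(const T& v) : engaged_(true) {
        ::new (static_cast<void*>(storage_)) T(v);
    }

    simple_optional(const simple_optional& other) : engaged_(false) {
        // Select the path at compile time via tag dispatch.
        copy_from(other, std::integral_constant<bool, bytewise>());
    }

    simple_optional& operator=(const simple_optional&) = delete;

    ~simple_optional() {
        if (engaged_) value().~T();
    }

    bool engaged() const { return engaged_; }

    // Only meaningful when engaged() is true.
    const T& value() const { return *reinterpret_cast<const T*>(storage_); }
};
```

With the fast path there is no branch per element, so copying an array of such objects is something a compiler has a much better chance of turning into a single block copy.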

Optimizers could turn

out.engaged=0
if(in.engaged)
  out.engaged=1

into out.engaged=in.engaged

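but the condition guarding the copy of the value would still be there. Removing it means reading and writing the value even when the source is disengaged, and I don't see the optimizers introducing those extra reads/writes; that seems unlikely to be added.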
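
To make the distinction concrete, here is a small sketch of the three forms under discussion, using a hypothetical plain struct `opt_int` in place of optional<int>. The function names are made up for illustration. The first function mirrors what the loop in the disassembly does per element, the second applies only the flag simplification, and the third is the branch-free form that requires the extra read and write.

```cpp
#include <cstring>

// Hypothetical plain-struct model of optional<int>.
struct opt_int {
    int value;
    bool engaged;
};

// As written: clear the flag, then conditionally copy the value and set it.
void copy_as_written(opt_int& out, const opt_int& in) {
    out.engaged = false;
    if (in.engaged) {
        out.value = in.value;
        out.engaged = true;
    }
}

// Flag simplified to a plain copy; the branch around the value remains.
void copy_flag_simplified(opt_int& out, const opt_int& in) {
    out.engaged = in.engaged;
    if (in.engaged) {
        out.value = in.value;
    }
}

// Branch-free: the value bytes are copied even when the source is
// disengaged. memcpy is used so that possibly uninitialized bytes are
// copied as raw bytes rather than read as an int.
void copy_unconditional(opt_int& out, const opt_int& in) {
    std::memcpy(&out.value, &in.value, sizeof out.value);
    out.engaged = in.engaged;
}
```

All three leave `out.engaged` equal to `in.engaged`, and `out.value` equal to `in.value` whenever the source is engaged. They differ only in whether `out.value` is touched when the source is disengaged, which is harmless when `out` is a freshly constructed return value.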

--
Marc Glisse


