On Tue, 2018-04-24 at 17:39 +0100, Kyrill Tkachov wrote:
>
> Hmm, do you have any patches in your tree that affect this part of
> GCC?
> For me the code:
> __Float64x2_t foo1(_Float64 *x)
> {
>   __Float64x2_t a = (__Float64x2_t) { *x, *x};
>   return a;
> }
>
> generates with current trunk at -O2:
> foo1:
>         ld1r    {v0.2d}, [x0]
>         ret

Interesting. If I have a pointer to a double and do the assignment, I
do get ld1r. If I have a global variable of type double and do the
assignment, I get ldr/dup.

I guess that is because of the limited addressing modes supported by
ld1r. With a global double value, I could do adrp/add to get the
address of x into a register and then do an ld1r, or I could do
adrp/ldr to get the value of x into a register and then use dup to
duplicate it. GCC chose the latter instead of the former, but both
sequences are 3 instructions.

Steve Ellcey
sellcey@xxxxxxxxxx


#include <math.h>

_Float64 *p1;
_Float64 x = 1.35;

__Float64x2_t foo1(void)
{
  __Float64x2_t a = (__Float64x2_t) {x, x};      /* ldr/dup */
  return a;
}

__Float64x2_t foo2(_Float64 *p2)
{
  __Float64x2_t a = (__Float64x2_t) {*p2, *p2};  /* ld1r */
  return a;
}

__Float64x2_t foo3(void)
{
  __Float64x2_t a = (__Float64x2_t) {*p1, *p1};  /* ld1r */
  return a;
}
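
For anyone who wants to try the three cases off an AArch64 box, here is a rough, portable sketch of the same test. It is not the test case from this thread: it assumes the generic GCC/Clang vector_size extension as a stand-in for __Float64x2_t, and the names f64x2, splat_global, splat_arg, splat_global_ptr and check_splats are made up for illustration. Unlike the attachment, p1 is initialized here so that the functions can actually be called. The instruction sequences in the comments are only what was reported above for AArch64 at -O2; other targets and compiler versions will differ.

```c
#include <assert.h>

/* Hypothetical stand-in for __Float64x2_t: a generic two-lane double
   vector built with the GCC/Clang vector_size extension. */
typedef double f64x2 __attribute__((vector_size(16)));

double x = 1.35;   /* global scalar, as in foo1 */
double *p1 = &x;   /* global pointer, as in foo3 (initialized here) */

/* Splat of a global scalar. The address of x has to be formed first,
   so this is the case reported above as ldr/dup on AArch64. */
f64x2 splat_global(void)
{
  return (f64x2) {x, x};
}

/* Splat through a pointer argument. The address is already in a
   register, which is the case reported above as ld1r. */
f64x2 splat_arg(const double *p)
{
  return (f64x2) {*p, *p};
}

/* Splat through a global pointer, also reported above as ld1r. */
f64x2 splat_global_ptr(void)
{
  return (f64x2) {*p1, *p1};
}

/* Sanity check: all three variants must fill both lanes with the
   same value, whatever instructions the compiler picks. */
void check_splats(void)
{
  double y = 2.5;
  f64x2 a = splat_global();
  f64x2 b = splat_arg(&y);
  f64x2 c = splat_global_ptr();

  assert(a[0] == 1.35 && a[1] == 1.35);
  assert(b[0] == 2.5 && b[1] == 2.5);
  assert(c[0] == 1.35 && c[1] == 1.35);
}
```

Compiling this with -O2 -S on an AArch64 toolchain and comparing splat_global against splat_arg should show whether a given compiler picks the ldr/dup or the ld1r form for each case.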