On Fri, Nov 23 2018, Segher Boessenkool wrote:

> On Fri, Nov 23, 2018 at 09:01:56PM +0100, Helmut Eller wrote:
>> 2.) %rsp is adjusted before calling __addvdi3.  Why is that needed?
>
> To keep the stack aligned (to 16 bytes).

Hmm... I guess %rsp is initially 16 byte aligned and we need to add 8
because the call instruction also pushes the return address.

>> 3.) Obviously __addvdi3 is not implemented as sibling-call even though
>> -O2 should enable that.
>
> It calls via the PLT, do sibling calls via the PLT work in your ABI?

I think so.  At least

  extern long f (long x, long y);
  long g (long x, long y) { return f(x, y); }

is compiled to:

  g:
          jmp     f@PLT

>> Where should I start, if I wanted to teach GCC how to produce the same
>> code for foo as for bar?  Would it be enough to add a pattern to
>> i386.md?  There is already a pattern for "addv<mode>4", but apparently
>> it's not used in this case.
>
> As Marc says, -ftrapv is probably not the way to go.
>
> Adding an addv<mode>3 to the i386 backend might help.

Maybe I'll try it; out of interest.

> You do *not* want exactly the same code, btw; addv3 calls abort on
> overflow, that's not the same as executing an ud2 instruction.

Hmm, yes.  I would prefer ud2, as that is more compact.

Helmut
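
For reference, the ud2 behaviour preferred above can be sketched in plain C without -ftrapv, assuming a compiler that provides the GCC builtins __builtin_add_overflow and __builtin_trap (GCC 5 or later, or Clang). The function names add_or_trap and add_overflows are hypothetical and do not come from this thread. On x86-64, GCC typically expands __builtin_trap to a ud2 instruction, whereas the -ftrapv path through __addvdi3 calls abort.

```c
#include <stdbool.h>

/* Hypothetical helper: signed addition that traps on overflow.
   __builtin_add_overflow returns true when the mathematical sum does
   not fit in *sum; __builtin_trap terminates the program abnormally
   (on x86-64 GCC typically emits ud2 for it). */
long add_or_trap(long x, long y)
{
    long sum;
    if (__builtin_add_overflow(x, y, &sum))
        __builtin_trap();
    return sum;
}

/* Hypothetical non-trapping variant: reports overflow to the caller
   instead, storing the wrapped result in *sum. */
bool add_overflows(long x, long y, long *sum)
{
    return __builtin_add_overflow(x, y, sum);
}
```

Compiling this with -O2 -S and comparing the output with the -ftrapv version is one way to see how the two approaches differ in code size.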