Re: prevent GCC from re-arranging two emit_insn()

On 4/11/19 2:43 PM, William Tambe wrote:
> Wouldn't the blockage insn prevent the compiler from re-arranging
> other following instructions ?
> 
> In fact, the two emit_insn() need to be seen as one instruction with
> the compiler free to re-arrange other instructions around it.
> 
> This is needed to implement "mulsidi3", where the hardware uses a second
> instruction to return the high part of a multiplication result, and
> that second instruction needs to be issued immediately after the first
> instruction, which returns the low part of the multiplication. Currently
> this is what is used:
> 
> (define_expand "mulsidi3"
>  [(set (match_operand:DI 0 "register_operand" "=r")
>   (mult:DI
>    (sign_extend:DI (match_operand:SI 1 "register_operand" "0"))
>    (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))))]
>  ""
>  {
>    rtx lo = gen_reg_rtx (SImode);
>    rtx hi = gen_reg_rtx (SImode);
>    emit_insn (gen_mulsi3 (lo, operands[1], operands[2]));
>    emit_insn (gen_mulsi3_highpart (hi, operands[1], operands[2]));
>    emit_move_insn (gen_lowpart (SImode, operands[0]), lo);
>    emit_move_insn (gen_highpart (SImode, operands[0]), hi);
>    DONE;
> })
> 
> What is needed is for the two emit_insn() to be seen as one
> instruction with the compiler free to re-arrange other instructions
> around it.
> 
> Is there a better way than the blockage insn to achieve the above ?
In general if you find yourself needing to force two insns to be
consecutive like you're doing, then you're probably doing something
wrong.  There are exceptions like instruction fusion for scheduling
purposes, but that's an optimization, not a correctness issue.

What I see above looks like a pretty standard widening multiply.  Look
at how other ports handle this stuff.

If you actually need two machine instructions here, then I'd have a
define_insn which emits both in its output template.  Just describe it
fully in the RTL.  While it's *generally* best to have define_insns emit
a single instruction, there are exceptions in almost every port.
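
As a rough sketch of that approach: the mnemonics "mul" and "mulh" below
are placeholders, and the %L0/%H0 operand modifiers assume the port's
print_operand can name the low and high words of a DImode register.
None of these come from the original post, so adjust them to the real
target.

```
;; Hypothetical sketch only: "mul"/"mulh" and the %L/%H modifiers are
;; assumptions, not the actual instructions or modifiers of this port.
(define_insn "mulsidi3"
  [(set (match_operand:DI 0 "register_operand" "=&r")
        (mult:DI
          (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
          (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))))]
  ""
  ;; Both machine instructions come from one insn, so nothing can be
  ;; scheduled between them.
  "mul\t%L0,%1,%2\;mulh\t%H0,%1,%2")
```

Since the compiler sees a single insn, it remains free to move other
insns around it.  The earlyclobber ("&") on operand 0 is there so the
first instruction cannot overwrite an input that the second one still
reads.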

What's definitely not clear to me with your expander above is why you
have to create two scratch registers then move those into the final
destination.  That seems odd.  If possible generate your outputs
directly into the upper and lower halves of operands[0].  This is
pretty easy if you're on a 32-bit target since your DImode output will be
an aligned hard register pair once register allocation is complete.
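
A minimal sketch of writing the results straight into the two halves of
operands[0], reusing the gen_mulsi3 and gen_mulsi3_highpart generators
from the quoted expander.  It assumes operands[0] does not overlap
either input, and on its own it does nothing to keep the two insns
adjacent:

```
(define_expand "mulsidi3"
  [(set (match_operand:DI 0 "register_operand")
        (mult:DI
          (sign_extend:DI (match_operand:SI 1 "register_operand"))
          (sign_extend:DI (match_operand:SI 2 "register_operand"))))]
  ""
{
  /* SImode views of the two halves of the DImode destination.  */
  rtx lo = gen_lowpart (SImode, operands[0]);
  rtx hi = gen_highpart (SImode, operands[0]);

  /* Compute each half in place; no scratch registers or extra moves.
     Assumes the destination does not overlap operands[1] or [2].  */
  emit_insn (gen_mulsi3 (lo, operands[1], operands[2]));
  emit_insn (gen_mulsi3_highpart (hi, operands[1], operands[2]));
  DONE;
})
```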

Jeff



