Peter Kuschnerus wrote:
Georg-Johann Lay wrote:
Try zero_extend or sign_extend patterns that also allow memory
operands.
I assumed already that this RTL-template is not ok.
What's wrong with extend patterns?
My Problem is only with registers. For all instructions that write a
result into a register smaller than the width of the register, there
are two variants of that instruction. One that does zero-extend and
one that does sign-extend to the full width of the register. I must
choose one of them.
The question is then how to decide which one is right. The preferred
one is the variant with the same signedness as the type.
While expanding, you have to use the known "standard insn names"
patterns; others will not be taken into account during tree -> RTL
lowering.
If your machine supports addqi3:
    (set (reg:QI A)
         (plus:QI (reg:QI B)
                  (reg:QI C)))
then it does not matter what happens to the upper bits of A. Notice
that such an add instruction must only use the lower 8 bits of B and C;
you must not assume that they are properly promoted to a wider mode.
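To illustrate the point about the upper bits, here is a minimal C sketch. It models a wide register as a uint32_t and a QImode add as an addition truncated to 8 bits. The names add_qi, reg_b and reg_c are made up for this illustration and are not GCC identifiers.

```c
#include <stdint.h>

/* Hypothetical model of a QImode add on a 32-bit register file:
   only the low 8 bits of each input contribute to the result, and
   only the low 8 bits of the result are meaningful. */
static uint8_t add_qi(uint32_t reg_b, uint32_t reg_c)
{
    /* Truncating to 8 bits discards whatever sits in the upper bits. */
    return (uint8_t)(reg_b + reg_c);
}
```

Calling add_qi with garbage in the upper 24 bits of either argument gives the same result as calling it with cleanly extended values, which is why the pattern may neither rely on those bits nor promise anything about them.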
If your machine does not support such addition, which is not uncommon on
32-bit systems, don't supply it. The compiler will promote the stuff to
SImode and use the right insns.
If there are instructions like
    (set (reg:SI A)
         (sign_extend:SI (plus:QI (reg:QI B)
                                  (reg:QI C))))
and ditto for zero_extend, then there is no standard name. You can
provide a pattern for the combiner and it will cook up the instruction
if it manages to find such a combination. C code for this is:
    int add (char a, char b)
    {
        return a += b;
    }
Also have a look at ssum_widen. If, however, you want to support code like
    int add (char a, char b)
    {
        return a + b;
    }
then the combiner pattern would read instead
    (set (reg:SI A)
         (plus:SI (sign_extend:SI (reg:QI B))
                  (sign_extend:SI (reg:QI C))))
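The two C functions above differ only when the 8-bit sum overflows. A small sketch of that difference follows. The names add_narrow and add_wide are chosen for this illustration, and signed char is used instead of plain char because the signedness of plain char depends on the target.

```c
/* Narrow add, then extend: corresponds to
   (sign_extend:SI (plus:QI ...)). */
static int add_narrow(signed char a, signed char b)
{
    /* The int sum is converted back to signed char. For out-of-range
       values this conversion is implementation-defined; GCC reduces
       the value modulo 2^8. */
    a += b;
    return a;
}

/* Extend, then wide add: corresponds to
   (plus:SI (sign_extend:SI ...) (sign_extend:SI ...)). */
static int add_wide(signed char a, signed char b)
{
    /* Both operands are promoted to int before the addition. */
    return a + b;
}
```

With a = b = 100, add_wide returns 200, whereas add_narrow returns -56 when compiled with GCC. For sums that fit in 8 bits, both functions return the same value.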
If you only have instructions available that perform 8-bit addition with
some extensions, pick the one that is the most efficient to implement
addqi and let the combiner consume extends. For an example, see
avr.md:*addhi3_zero_extend.
Notice that you can also consume the extension in the input predicate(s)
of addsi3. I cannot say what works better / more efficient.
I expected to get this information by using
SUBREG_PROMOTED_UNSIGNED_P in the output template.
Maybe this information is not available there, or I did it
wrong.
As I already said, in strict RTL and thus in the template there are no
subregs any more. Even in non-strict RTL, where there are subregs, I'd
not rely on SUBREG_PROMOTED_UNSIGNED_P.
Johann