Ilya Lesokhin wrote:
Hi,
I've used your suggestion of "Specifying Registers for Local
Variables" and found that the code it generates gives better results
than my method.
Unfortunately, a recent thread on the subject:
http://gcc.gnu.org/ml/gcc/2012-01/msg00305.html
got me worried, so I would like to hear what you (or anyone familiar
with the subject) know about the implementation status of "Local
Register Variables" in the AVR backend in gcc-4.5.3, and whether it is
better in newer GCC versions.
Specifically, I would like to know if you believe the following code
is safe (assuming _mul16x16_32 is implemented correctly):
    DWord_t mul16x16_32(uint16_t x, uint16_t y) __attribute__((always_inline));
    DWord_t mul16x16_32(uint16_t x, uint16_t y)
    {
        register uint16_t r24_25 asm ("r24") = x;
        register uint16_t r22_23 asm ("r22") = y;
        register uint16_t r30_31 asm ("r30");
        register uint16_t r20_21 asm ("r20");
        asm (" call _mul16x16_32" "\n\t"
             : "=&r" (r20_21), "=&r" (r30_31)
             : "r" (r24_25), "r" (r22_23)
             : "r0", "r1", "cc"
            );
        DWord_t Result;
        Result.High = r20_21;
        Result.Low = r30_31;
        return Result;
    }
From my understanding of the matter and from the manual, that should
work. I don't understand the early-clobber, but it won't hurt here.
But for similar code I saw an ICE in fwprop, presumably because it failed
to handle the additional register mess together with incoming function
parameters, depending on the context of the call site.
As you have fun backporting changes now and then, you might want to
backport the widening multiply patches from PR49687. You don't need all
the fancy combinations; the essential part is the umulhisi3 pattern/insn,
which performs the same operation as yours but without the need for
inline assembler and, of course, with a different interface.
Your code would then look like
    DWord_t mul16x16_32(uint16_t x, uint16_t y)
    {
        uint32_t result = (uint32_t) x * y;
        ...
The stuff in 4.7 works by mapping the insns to libgcc calls. As the
insns explicitly model the register footprint of the libcall, special
predicates are used to keep insn combine from propagating hard registers
into (zero_)extend insns. Otherwise, combine fails to synthesize the
pattern in some cases, i.e. if the "draft" footprint of mulsi3 overlaps
the propagated hard reg. See the comments at the respective predicates.
As umulhisi3 is a standard pattern and the tree optimizers handle
widening multiply, umulhisi3 should work reasonably well even without that
complexity. But I observed the general part having problems if one
operand is an integer that fits in 16 bits.
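To make the plain-C variant concrete, here is a minimal sketch of how the
complete function might look without inline assembler. The layout of DWord_t
is an assumption (the thread never shows its definition), and whether the
multiplication actually ends up as umulhisi3 depends on the compiler version
and the patches applied, so check the generated code.

```c
#include <stdint.h>

/* Hypothetical layout: the thread does not show how DWord_t is defined. */
typedef struct {
    uint16_t Low;
    uint16_t High;
} DWord_t;

static inline DWord_t mul16x16_32(uint16_t x, uint16_t y)
{
    /* Cast one operand first so the product is computed in 32 bits.
       This is the shape a widening-multiply pattern can match. */
    uint32_t result = (uint32_t) x * y;

    DWord_t Result;
    Result.High = (uint16_t) (result >> 16);      /* upper 16 bits */
    Result.Low  = (uint16_t) (result & 0xFFFFu);  /* lower 16 bits */
    return Result;
}
```

The cast matters: without it, on AVR (16-bit int) the product would be
computed in 16 bits and the high word would be lost.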
Johann
Thanks,
Ilya.
On 10/24/11, Georg-Johann Lay <avr@de> wrote:
Hi,
I'm making some modifications to the AVR backend to suit my needs.
I would like to add new constraints for individual registers to use in
inline assembly blocks, and I was wondering what the correct way of
doing this is.
Not on topic: You can have a look at GCC's "Specifying Registers for
Local Variables" feature:
http://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html#Local-Reg-Vars
For example, this enables you to interface non-ABI assembler functions
with C code without patching the compiler.
Moreover, introducing individual register classes will result in a zoo
of constraints for 8-bit, 16-bit, 32-bit registers.
In gcc-4.5, I added a new class for each register to reg_class,
REG_CLASS_NAMES and REG_CLASS_CONTENTS, made a new constraint for
each register using its class, and it worked as expected.
I tried to do the same in gcc-4.7 and got:
internal compiler error: in find_costs_and_classes, at ira-costs.c:1704.
You will have to debug the compiler proper.
Notice that hooks like HARD_REGNO_MODE_OK put additional restrictions on
registers/mode combinations.
So I was wondering whether I did the wrong thing or just forgot to
update some target hook.
Thanks,
Ilya.