RE: Behavior of Instruction combination pass

"venkat" <venkat@xxxxxxxxx> · Fri, 5 Jun 2009 19:32:00 +0530

Thank you for the inputs.

I am not very familiar with RTX costs. Could you briefly explain how to set
them?

I tried the following, based on the MIPS port.

(-----Snip-----)
static bool
set_rtx_costs (rtx x, int code, int outer_code, int *total)
{
  /* When optimizing for size, make multiplication cheap.  */
  if (code == MULT && optimize_size)
    {
      *total = 1; /* <== 1 stands for default cost */
      return true;
    }

  return false;
}

#undef TARGET_RTX_COSTS
#define TARGET_RTX_COSTS set_rtx_costs
(-----Snip-----)

I have set the cost of the MULT operation to 1 when optimizing for size. I
believe this lowers the cost of multiplication below the default, so GCC now
generates MULT instructions instead of shifts.
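
To make the cost units concrete, here is a small standalone sketch. It is a toy model, not GCC code. COSTS_N_INSNS (N) is defined as ((N) * 4) in GCC's rtl.h and is reproduced here so the example builds on its own. The toy_* names are hypothetical, and the assumption that a shift and a negate each cost one instruction is only for illustration. The point is that a raw value of 1 is below COSTS_N_INSNS (1), so a multiply costed at 1 compares as cheaper than any two-instruction sequence.

```c
#include <assert.h>
#include <stdbool.h>

/* Same scaling as GCC's rtl.h: one "instruction" is 4 cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-ins for the rtx codes involved; this is not
   GCC's enum rtx_code.  */
enum toy_code { TOY_MULT, TOY_ASHIFT, TOY_NEG };

/* Toy cost table: every operation costs one instruction unless the
   caller overrides the multiply cost, as the hook above does.  */
static int
toy_cost (enum toy_code code, int mult_cost)
{
  assert (mult_cost >= 0);
  return code == TOY_MULT ? mult_cost : COSTS_N_INSNS (1);
}

/* Return true when a single multiply is no more expensive than a
   shift followed by a negate.  */
static bool
toy_prefers_mult (int mult_cost)
{
  int shift_neg = toy_cost (TOY_ASHIFT, mult_cost)
                  + toy_cost (TOY_NEG, mult_cost);
  return toy_cost (TOY_MULT, mult_cost) <= shift_neg;
}
```

In this model toy_prefers_mult (1) is true, while a multiply costed at several instructions, for example COSTS_N_INSNS (5), loses to the shift-and-negate pair. Writing the hook's value as COSTS_N_INSNS (1) rather than a bare 1 would keep it on the same scale as the other costs.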

I am also able to generate optimized code for the test case with the -Os option.

Please confirm.

> -----Original Message-----
> From: gcc-help-owner@xxxxxxxxxxx [mailto:gcc-help-owner@xxxxxxxxxxx] On
> Behalf Of Georg-Johann Lay
> Sent: Friday, June 05, 2009 5:58 PM
> To: venkat
> Cc: 'gnu'
> Subject: Re: Behavior of Instruction combination pass
> 
> venkat schrieb:
> > Hi,
> >
> > I am working on a GCC port for a 32-bit RISC. I have a question about the
> > assembly code generated by the ported compiler for the test case below.
> >
> > (----- Snip starts -----)
> > signed char g_scOperand;
> > signed char g_scResult;
> > #define CONSTANT_16_BIT     (short) 0x8000
> >
> > void vMulSignedCharGlobalWith16BitImmediateValue( void ) {
> >     g_scResult = g_scOperand * CONSTANT_16_BIT;
> >
> > }
> > (----- Snip ends -----)
> >
> > The result of the multiplication is stored in the byte-sized variable
> > 'g_scResult'. As the least significant byte of 'CONSTANT_16_BIT' is zero,
> > the result of the multiplication can be optimized to 0.
> >
> > I used the -O3 option. The ported compiler generates the following RTL
> > after the instruction combination pass.
> >
> > (----- Snip starts -----)
> > (insn 5 2 6 2 ../test/test.c:6 (set (reg/f:SI 43)
> >         (high:SI (symbol_ref:SI ("g_scOperand") <var_decl 0xb7ce3000 g_scOperand>))) 27 {*load_high_of_splittable_symbol} (nil))
> >
> > (insn 6 5 8 2 ../test/test.c:6 (set (reg:SI 41)
> >         (sign_extend:SI (mem:QI (lo_sum:SI (reg/f:SI 43)
> >                     (symbol_ref:SI ("g_scOperand") <var_decl 0xb7ce3000 g_scOperand>)) [0 S1 A8]))) 22 {*extendqisi2} (expr_list:REG_DEAD (reg/f:SI 43)
> >         (expr_list:REG_EQUAL (sign_extend:SI (mem/c/i:QI (symbol_ref:SI ("g_scOperand") <var_decl 0xb7ce3000 g_scOperand>) [0 g_scOperand+0 S1 A8]))
> >             (nil))))
> >
> > (insn 8 6 9 2 ../test/test.c:6 (set (reg:SI 45)
> >         (ashift:SI (reg:SI 41)
> >             (const_int 15 [0xf]))) 5 {ashlsi3} (expr_list:REG_DEAD (reg:SI 41)
> >         (expr_list:REG_EQUAL (ashift:SI (reg:SI 41)
> >                 (const_int 15 [0xf]))
> >             (nil))))
> >
> > (insn 9 8 10 2 ../test/test.c:6 (set (reg:SI 46)
> >         (neg:SI (reg:SI 45))) 18 {negsi2} (expr_list:REG_DEAD (reg:SI 45)
> >         (nil)))
> >
> > (insn 10 9 0 2 ../test/test.c:6 (set (mem/c/i:QI (symbol_ref:SI ("g_scResult") <var_decl 0xb7ce305c g_scResult>) [0 g_scResult+0 S1 A8])
> >         (subreg:QI (reg:SI 46) 3)) 29 {*movqi} (expr_list:REG_DEAD (reg:SI 46)
> >         (nil)))
> >
> > (----- Snip ends -----)
> >
> > The multiplication takes place as follows.
> > 1. The value of variable 'g_scOperand' is loaded into a register.
> > 2. 0x8000 is a power of 2 (2 raised to 15), so the multiplication is done
> > by shifting the register content left by 15 bits.
> > 3. Since 0x8000 is cast to 'signed short', it is treated as a negative
> > number, so the register content is negated.
> > 4. The low byte of the result is stored to the variable 'g_scResult'.
> >
> > The negation is taking place using the 'negsi2' pattern as shown below.
> >
> > (define_insn "negsi2"
> >   [(set (match_operand:SI 0 "register_operand"         "=b")
> >         (neg:SI (match_operand:SI 1 "register_operand" " b")))]
> >  ""
> >  "subu\t%0,$0,%1"
> > )
> >
> > But if I remove the machine description for 'negsi2', GCC is able to
> > optimize the multiplication. GCC then uses the subtraction pattern to
> > perform the negation.
> >
> > The Instruction combination pass is able to combine all the instructions
> > into one as shown below.
> >
> > (----- Snip starts -----)
> > (insn 11 10 0 2 ../test/test.c:6 (set (mem/c/i:QI (symbol_ref:SI ("g_scResult") <var_decl 0xb7cf805c g_scResult>) [0 g_scResult+0 S1 A8])
> >         (const_int 0 [0x0])) 28 {*movqi} (nil))
> > (----- Snip ends -----)
> >
> > I am not able to understand why GCC is not optimizing the result of
> > operation to 0 when 'negsi2' is defined.
> >
> > Please help.
> 
> Look at what combine does (or tries to do). Use
> -fdump-rtl-combine-details.
> 
> Probably costs are not stated appropriately.
> 
> Georg-Johann
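
The claim in the quoted test case, that the byte stored to 'g_scResult' is always 0, can be checked with a small host-side sketch. This is not part of the original thread. It assumes a 32-bit int and GCC's modulo-2^8 behaviour when narrowing to signed char (ISO C leaves that conversion implementation-defined). The function names are hypothetical. The second function mirrors the shift-and-negate sequence from the RTL dump, using unsigned arithmetic to avoid shifting a negative value.

```c
#include <assert.h>
#include <limits.h>

/* The store in the test case: multiply in int, keep the low 8 bits.  */
static signed char
mul_by_const (signed char operand)
{
  return (signed char) (operand * (short) 0x8000);
}

/* The sequence from the RTL dump: shift left by 15, negate, then
   take the low byte.  Done in unsigned arithmetic so that the shift
   of a sign-extended negative operand is well defined.  */
static signed char
shift_neg_low_byte (signed char operand)
{
  unsigned int shifted = (unsigned int) operand << 15;
  unsigned int negated = 0u - shifted;
  return (signed char) (negated & 0xffu);
}

/* Check every possible operand and return how many were checked.  */
static int
verify_all_operands (void)
{
  int checked = 0;
  for (int v = SCHAR_MIN; v <= SCHAR_MAX; v++)
    {
      assert (mul_by_const ((signed char) v) == 0);
      assert (shift_neg_low_byte ((signed char) v) == 0);
      checked++;
    }
  return checked;
}
```

Both forms give 0 for every operand because the low 15 bits of the shifted value are zero and negation preserves that, which is the simplification combine reaches when it folds the sequence into a store of (const_int 0).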