Hi Ian, Thanks again. I do agree that my costs calculation do not recurse into the sub-expressions, but I was just mimicing other ports in this regard (see rs6000 for instance) . They do not recurse into sub-expressions as well. Also, I think I traced the problem to fold-const.c::fold_single_bit_test() /* If we have (A & C) != 0 or (A & C) == 0 and C is a power of 2, then fold the expression into shifts and logical operations. */ tem = fold_single_bit_test (code, arg0, arg1, type); if (tem) return tem; Commenting the above code out, makes the compiler spit out the right expression -- direct bitwise AND. Comments at the top of the function also seem to indicate that this is considered an optimization! haven't dug into this to see why this was considered so. This constant folding might be suboptimal depending on features available on the target machine. I have also checked that GCC 4.0.1 code has not changed in this behavior. The same transformation is present in fold-const.c Please advise. On a related note, it looks the if-conversion phase as well, chooses to use a bunch of shifts and other logical operations to convert a branch into linear code. This is fine, except that the resultant operations prove more expensive than a branch. I will take a closer look at the IFCVT_* macros and see if one of them is useful for catching these conditions. thanks, Matt On 13 Apr 2006 20:21:41 -0700, Ian Lance Taylor <ian@xxxxxxxx> wrote: > "Matt Lee" <reachmatt.lee@xxxxxxxxx> writes: > > > Here is what I have, > > > > case AND: > > case IOR: > > case XOR: > > *total = COSTS_N_INSNS (1); > > return true; > > case ASHIFT: > > case ASHIFTRT: > > case LSHIFTRT: > > if (GET_CODE (XEXP (x, 1)) == CONST_INT) > > *total = COSTS_N_INSNS (INTVAL (XEXP (x, 1))); > > return true; > > > > > > This looks quite OK to me. I tried debugging if rtx_costs was doing > > something wrong. I do see rtx_cost being invoked for the lshiftrt > > expressions in question, but never for the "and". Seems like GCC had > > pre-decided to only use lshiftrt even though it is expensive. > > Your computations look wrong to me. You are saying that an AND always > costs 1 instruction. But you need to actually look at the operands. > Specifically, the compiler is going to compare these two: > (and (reg ...) (const_int ...) > (and (ashiftrt (reg ...) (const_int ...)) (const_int 1)) > > > Any other ideas? Btw, I couldn't find prefer_and_bit_test in dojump.c. > > I'm looking at the current version of gcc. prefer_and_bit_test was > added 2004-03-20, so I guess that just missed 3.4. I don't have 3.4 > sources handy. In general the decision is made in the do_jump > function. > > Ian >