Re: Non-optimal code generated for H8

OK, I have looked into the problem some more. For now I have a decent workaround in my code using "volatile", and an almost-working patch for GCC. I am not sure whether I will have time to work more on the GCC patch right now.

But just to sum up, here is what I have found so far. Given the code:

struct s {
    char a, b;
    char c[11];
} x[2];

void test(int n)
{
    struct s *sp = &x[n];

    sp->a = 1;
    sp->b = 1;
}

For plain H8/300 (no specific options) and H8S (-ms), GCC does not find a suitable multiplication instruction and instead generates a call to __mulhi3/__mulsi3. H8/300 has no 16-bit multiplication instruction, and H8S has only a 16x16 => 32 bit multiplication (mulxu.w). For some reason CSE is not able to do its work, so the pointer value is recalculated for each access.

    stm.l    er4-er5,@-er7
    mov.w    r0,r4
    extu.l    er4
    sub.l    er1,er1
    add.b    #13,r1l
    mov.l    er4,er0
    jsr    @___mulsi3
    mov.b    #1,r5l
    mov.b    r5l,@(_x,er0)
    sub.l    er1,er1
    add.b    #13,r1l
    mov.l    er4,er0
    jsr    @___mulsi3
    mov.b    r5l,@(_x+1,er0)
    ldm.l    @er7+,er4-er5
    rts
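
To make the duplicated work easier to see, here is a rough C sketch of the address arithmetic. It is only an illustration, not GCC output: test_recomputed() and test_shared() are hypothetical names, and the byte-offset arithmetic is written out by hand to mirror the two listings.

```c
#include <assert.h>
#include <stddef.h>

struct s {
    char a, b;
    char c[11];
} x[2];

/* Every member is a char, so the element size is the 13 that the
   listing above loads into r1l before each call. */
static_assert(sizeof(struct s) == 13, "struct s is expected to be 13 bytes");

/* Roughly what the listing above does: the byte offset n * 13 is
   computed once per store, i.e. two multiplications. */
static void test_recomputed(int n)
{
    *((char *)x + (size_t)n * sizeof(struct s) + offsetof(struct s, a)) = 1;
    *((char *)x + (size_t)n * sizeof(struct s) + offsetof(struct s, b)) = 1;
}

/* What one would expect after CSE: the offset is computed once and
   both stores reuse the same base address. */
static void test_shared(int n)
{
    char *base = (char *)x + (size_t)n * sizeof(struct s);

    base[offsetof(struct s, a)] = 1;
    base[offsetof(struct s, b)] = 1;
}
```

Both functions store to the same two bytes; they differ only in how often the product n * 13 is evaluated, which is exactly the difference between one and two calls to ___mulsi3.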

For H8/300H in 16-bit mode (-mh -mn) and H8SX (-mx), GCC finds the appropriate instructions and CSE does its work. Only a single mulxu.w/mulu.l is generated and the pointer value is reused.

If I add a suitable multiplication insn for H8S that generates the call to __mulsi3, GCC will use it and CSE will work. "b" and "e" are new constraints that require r0 and r1. The problem is that this insn collides with the already existing insn for H8SX.

(define_insn "mulsi3"
  [(set (match_operand:SI 0 "register_operand" "=b")
        (mult:SI (match_operand:SI 1 "register_operand" "%0")
         (match_operand:SI 2 "register_operand" "e")))]
  "TARGET_H8300S"
  "jsr\\t@___mulsi3"
  [(set_attr "length" "2")
   (set_attr "cc" "set_zn")])

---

    extu.l    er0
    sub.l    er1,er1
    add.b    #13,r1l
    jsr    @___mulsi3
    add.l    #_x,er0
    mov.b    #1,r2l
    mov.b    r2l,@er0
    mov.b    r2l,@(1,er0)
    rts

If I also add the following, I get almost perfect code. "small_operand" is a new predicate that requires a 16-bit value.

(define_insn "*mulhisi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
    (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" "%0"))
         (match_operand:SI 2 "small_operand" "")))]
  "TARGET_H8300H || TARGET_H8300S"
  "mov.w\\t%T2,%e0\;mulxs.w\\t%e0,%S0"
  [(set_attr "length" "2")
   (set_attr "cc" "none_0hit")])

---

    mov.w    #13,e0
    mulxs.w    e0,er0
    add.l    #_x,er0
    mov.b    #1,r2l
    mov.b    r2l,@er0
    mov.b    r2l,@(1,er0)
    rts
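
For reference, the arithmetic this pattern relies on can be modelled on a host machine in C. This is only a sketch under my own assumptions: fits_signed_16(), mulxs_w() and element_offset() are hypothetical helper names, and I assume "small_operand" means the signed 16-bit range because mulxs.w is a signed multiply. The real predicate may accept a different range.

```c
#include <stdint.h>

/* Hypothetical model of the "small_operand" check: does the constant
   fit in a signed 16-bit value? (Assumed range, see above.) */
static int fits_signed_16(long value)
{
    return value >= INT16_MIN && value <= INT16_MAX;
}

/* Model of mulxs.w: signed 16 x 16 => 32 bit multiplication. The
   widening casts mean the product cannot overflow 32 bits. */
static int32_t mulxs_w(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Byte offset of x[n] for the 13-byte struct s used in this thread. */
static int32_t element_offset(int16_t n)
{
    return mulxs_w(n, 13);
}
```

Since the index is a 16-bit int and the element size 13 fits in 16 bits, the widening multiply gives the same result as the full 32x32 multiplication done by ___mulsi3, which is why the library call can be dropped here.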

Thanks for the help.

/Mikael





