Performance problem: unaligned loads/stores on structure assignments on MIPS

Simon Kagstrom <simon.kagstrom@xxxxxx> · Sun, 25 Feb 2007 19:21:36 +0100

Hello!

GCC 4.1 sometimes seems to generate inefficient code for direct
structure assignments when compiling for MIPS1. When the structure
members are assigned one by one, it generates regular lw/sw sequences.
When the structure is assigned as a whole, you instead get lwl/lwr and
swl/swr pairs (for no reason, since the data is aligned).

I'm translating the MIPS code to Java bytecode, so the problem is even
worse for me: I then have to read four separate bytes and OR them
together to implement an lwl/lwr pair.
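
To illustrate the cost, here is a rough sketch in C of the byte-wise
load described above. It is not the translator's actual code (which
emits Java bytecode); the helper name load_word_bytewise and its
parameters are made up for illustration, and big-endian byte order is
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble one big-endian 32-bit word from four
   single-byte reads, as a translator has to do for an lwl/lwr pair
   when it cannot assume the address is word-aligned. */
static uint32_t load_word_bytewise(const uint8_t *mem, size_t size, size_t addr)
{
    assert(addr + 4 <= size);   /* stay inside the emulated memory */
    return ((uint32_t)mem[addr]     << 24) |
           ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] <<  8) |
            (uint32_t)mem[addr + 3];
}
```

An aligned lw, by contrast, maps to a single 32-bit read, so every
unnecessary lwl/lwr pair turns one memory access into four plus the
shifts and ORs.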

I've attached a preprocessed file which exhibits this behavior. The two
functions below show this: the first does a structure assignment and
the second a member-by-member assignment.

   void dummy1(NavigateSegment *segments,
               int count,
               RoadMapPosition *src_pos,
               RoadMapPosition *dst_pos) {

      int i;
      int group_id = 0;
      NavigateSegment *segment;

      for (i=0; i < count; i++) {
         set_from_pos (&segments[i].from_pos);
         segments[i].shape_initial_pos = segments[i].from_pos;
      }
   }

   void dummy2(NavigateSegment *segments,
               int count,
               RoadMapPosition *src_pos,
               RoadMapPosition *dst_pos) {

      int i;
      int group_id = 0;
      NavigateSegment *segment;

      for (i=0; i < count; i++) {
         set_from_pos (&segments[i].from_pos);
         segments[i].shape_initial_pos.longitude = segments[i].from_pos.longitude;
         segments[i].shape_initial_pos.latitude = segments[i].from_pos.latitude;
      }
   }
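
The type definitions are only in the attached main.i. For readers
without the attachment, the following is a minimal stand-in, not the
real declarations: the member order, the int member type, the padding
arrays and the body of set_from_pos are all assumptions, chosen so that
from_pos lands at offset 28 and shape_initial_pos at offset 44 as in the
disassembly (with 4-byte int). The helper names copy_by_struct,
copy_by_member and positions_word_aligned are likewise made up.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: two 32-bit members. */
typedef struct {
    int longitude;
    int latitude;
} RoadMapPosition;

/* Assumed layout: padding stands in for the unknown real members. */
typedef struct {
    int pad0[7];                        /* bytes 0..27 (hypothetical)  */
    RoadMapPosition from_pos;           /* offset 28                   */
    int pad1[2];                        /* bytes 36..43 (hypothetical) */
    RoadMapPosition shape_initial_pos;  /* offset 44                   */
} NavigateSegment;

/* Assumed body: stores the two constants seen in the disassembly. */
static void set_from_pos(RoadMapPosition *pos)
{
    pos->longitude = 1234;
    pos->latitude = 5678;
}

/* Whole-structure assignment, as in dummy1. */
static void copy_by_struct(NavigateSegment *s)
{
    s->shape_initial_pos = s->from_pos;
}

/* Member-by-member assignment, as in dummy2. */
static void copy_by_member(NavigateSegment *s)
{
    s->shape_initial_pos.longitude = s->from_pos.longitude;
    s->shape_initial_pos.latitude = s->from_pos.latitude;
}

/* Returns 1 if both positions sit on an int boundary within the struct. */
static int positions_word_aligned(void)
{
    return offsetof(NavigateSegment, from_pos) % sizeof(int) == 0 &&
           offsetof(NavigateSegment, shape_initial_pos) % sizeof(int) == 0;
}
```

Under these assumptions both copy functions produce the same result and
both members are word-aligned, so plain lw/sw would be sufficient for
either form.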

If you disassemble the object file, the two functions look like this:

00000014 <dummy1>:
  14:   00003021        move    a2,zero
  18:   240804d2        li      t0,1234
  1c:   08000014        j       50 <dummy1+0x3c>
  20:   2407162e        li      a3,5678
  24:   ac88001c        sw      t0,28(a0)
  28:   ac870020        sw      a3,32(a0)
  2c:   8882001c        lwl     v0,28(a0)
  30:   88830020        lwl     v1,32(a0)
  ...

00000064 <dummy2>:
  64:   00001821        move    v1,zero
  68:   240604d2        li      a2,1234
  6c:   08000022        j       88 <dummy2+0x24>
  70:   2407162e        li      a3,5678
  74:   ac870020        sw      a3,32(a0)
  78:   ac86001c        sw      a2,28(a0)
  7c:   ac86002c        sw      a2,44(a0)
  80:   ac870030        sw      a3,48(a0)
  ....

If the loop is removed, both functions generate the same code (with
regular loads/stores). The GCC version is

   mips-linux-gnu-gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

Should I consider this a bug and report it to the bug tracking system?
I looked for similar problems, but couldn't find any matching bug report.

(Ehud Shabtai discovered this problem)

-- 
// Simon
Attachment: main.i