Help needed: Optimization of bytecode interpreter for ARM platform


Hello,

I hope this is the right place to ask this question; if not, please accept my apologies and redirect me where needed.

I am trying to write a fast bytecode interpreter, but the compiler's optimizer just 'does not get it' and generates bad code (it does not realize that there are jumps everywhere, and it optimizes the code away)...

Here is a simplified version of the code:

static int rom[]= 
  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 
    9, 10, 11, 12, 13, 14, }; // the 'program'
 

void execute()

{

  const void * const jumps[] = 
    { &&ins000, &&ins001, &&ins002, &&ins003, 
      &&ins004, &&ins005, &&ins006, &&ins007 }; // table of jumps

  register int carry asm ("r0");
  register int instr asm("r1"); // currently executed instruction
  register int *pc asm ("r4"); // program counter, points to the next instr.
  register const void * const * jm asm ("r5") = jumps; // pointer to the jump table

int a=0, b=0; // virtual machine registers

// this macro does a fast carry=0; goto *jumps[*pc++]; 
#define next asm ("ldrh %2, [%0], #2\n\t" \
                   "mov %1, #0\n\t" \
                   "ldr pc, [%4, %2, asl #2]" : \
                   "=r" (pc), \
                   "=r" (carry), \
                   "=r" (instr) : \
                   "0" (pc), \
                   "r" (jm))

// this macro does a fast goto *jumps[*pc++]; 
#define nextnocarry asm ("ldrh %1, [%0], #2\n\t" \
                         "ldr pc, [%3, %1, asl #2]" : \
                         "=r" (pc), \
                         "=r" (instr) : \
                         "0" (pc), \
                         "r" (jm))

 
pc = &rom[0]; next; // initialize PC and jump on first instruction...

// instruction execution..
ins000: a= 0; next;
ins001: b= 0; next;
ins002: a++; carry= a==0; nextnocarry;
ins003: b++; carry= b==0; nextnocarry;
ins004: pc= pc-a; next;
ins005: if (carry) pc+= b; next;
ins006: a--; carry= a==0; nextnocarry;
ins007: b--; carry= b==0; nextnocarry;

}


arm-elf-gcc -O1 -S ex.c compiles this whole thing into absolutely NOTHING (well, a single bx lr to be more precise, i.e. a bare return)!
I am using version "arm-elf-gcc (GCC) 4.0.2".

Can anyone help me with this?

Note, if I replace the asm parts with the C equivalent, it generates:
      ldrh  r1, [r4], #2
      ldr   r8, .L2691+4
      ldr   fp, [r8, r1, asl #2]
      mov   r0, #0
      mov   pc, fp      @ indirect register jump
That is 5 instructions instead of 3, because it:
  1: does not keep jm in a register
  2: loads the label address into a temporary register instead of directly into pc, which is not only slower but also wastes a lot of memory (and I am very memory-limited on this system).
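
For reference, here is a minimal, self-contained sketch of what "the C equivalent" of the dispatch could look like, using GCC's labels-as-values extension (&&label and goto *). Several things in it are assumptions made only so the sketch can run on its own and are not part of the code above: the run() function, the vm_state struct, a halt opcode (8), and a program stored as 16-bit opcodes (chosen to match the halfword load done by ldrh).

```c
/* Hypothetical result type, introduced only for this sketch. */
struct vm_state { int a, b, carry; };

/* Runs `code` until the hypothetical halt opcode (8) is fetched.
 * Opcodes are not range-checked, as in the original listing. */
struct vm_state run(const unsigned short *code)
{
    /* Table of label addresses (GCC "labels as values" extension). */
    static const void *const jumps[] = {
        &&ins000, &&ins001, &&ins002, &&ins003,
        &&ins004, &&ins005, &&ins006, &&ins007,
        &&halt                      /* assumption: opcode 8 stops the VM */
    };
    const unsigned short *pc = code; /* points to the next instruction */
    int a = 0, b = 0;                /* virtual machine registers */
    int carry = 0;

/* Plain C counterparts of the two asm macros. */
#define NEXT         do { carry = 0; goto *jumps[*pc++]; } while (0)
#define NEXT_NOCARRY do { goto *jumps[*pc++]; } while (0)

    NEXT; /* fetch and jump to the first instruction */

ins000: a = 0; NEXT;
ins001: b = 0; NEXT;
ins002: a++; carry = (a == 0); NEXT_NOCARRY;
ins003: b++; carry = (b == 0); NEXT_NOCARRY;
ins004: pc -= a; NEXT;
ins005: if (carry) pc += b; NEXT;
ins006: a--; carry = (a == 0); NEXT_NOCARRY;
ins007: b--; carry = (b == 0); NEXT_NOCARRY;
halt:   return (struct vm_state){ a, b, carry };

#undef NEXT
#undef NEXT_NOCARRY
}
```

Calling run() on the opcode sequence { 0, 1, 2, 2, 3, 8 } executes a=0, b=0, a++, a++, b++ and then halts. The sketch only shows the control flow that the asm macros are meant to implement; it says nothing about what code a given GCC version will generate for it.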

thanks, cyrille 




