That's great. It's not quite fixed yet. You can't turn off alignment
where it would result in unexecutable code. If I recall my x86 assembly,
labels have to be word aligned, while the next instruction doesn't
always have to be. Depending on the distance to the alignment required,
an unconditional jmp might be a shortcut. Adding the nop just removed
the opportunity for an unconditional jmp.

You'll have to find what happens for every case of misalignment. You
might still need to take care to only insert the nop when necessary.
That is, if your last instruction just happens to come out perfectly
aligned, the nop might cause another unconditional jmp to be inserted.

Still, it occurs to me that the JMP, if it were position independent,
and the next buffer is always at the right alignment (the target of the
JMP), should cause no trouble. Try adding -fPIC to your compiler options.

                --Dean

On Wed, 2 Sep 2009, Robert Bernecky wrote:

> Hi, Dean.
>
> My initial attempt at compiler options was just -O0.
>
> That resulted in the jmp insertion problem, so I conjectured
> that there might be some alignment requirements/desires that
> would result in jmp instructions being added to make each
> labeled fragment start on an "appropriate" boundary.
> Clearly, the no-align options did not help.
>
> So, I just tried out your suggestion:
>
> #define OP(nm, cod) \
> FS##nm: cod \
> asm("nop" : : ); \
> FE##nm:
>
> This has the effect of inserting a NOP at the end of each code fragment.
> And, it DOES appear to work (although I just quickly eyeballed
> the asm code, so I might be missing something). I'll give
> it a more careful workover tomorrow. (That was WITH the current
> -noalign options still active.)
>
> Now, what was it that led you to propose that inserting a NOP
> would have the desired effect?
>
> Many thanks for your reply!
> Robert
>
> Dean Anderson wrote:
> > I suspect it does this because of instruction alignment and pipelining
> > issues.
> > Why are you trying to turn off alignment?
> >
> > You might try adding a nop after each one.
> >
> >                 --Dean
> >
> > On Tue, 1 Sep 2009, Robert Bernecky wrote:
> >
> >> I'm trying to get gcc version 4.3.2 to emit X86-64 code
> >> fragments that I can catenate to perform my own JIT
> >> compilation, but the compiler is being recalcitrant.
> >>
> >> (I was using a jump table, but its performance was underwhelming.)
> >>
> >> Roughly, what I've done is to create a set of code fragments,
> >> with labels so that I can determine their address (via &&label)
> >> and length. E.g.,
> >>
> >>   topLoad1: reg1 = x[i];
> >>   botLoad1:
> >>
> >>   topLoad2: reg2 = y[i];
> >>   botLoad2:
> >>
> >>   topAdd: regz = reg1 + reg2;
> >>   botAdd:
> >>
> >>   topStore: z[i] = regz;
> >>   botStore:
> >>
> >> Then, I have a table of fragment addresses (topLoad1, topLoad2, etc.)
> >> and lengths (botLoad1-topLoad1, botLoad2-topLoad2), and a
> >> (statically unknown) list of fragments to be assembled to build
> >> working code, e.g.:
> >>
> >>   (Load2, Load1, Add, Store, Loop)
> >>
> >> I assemble the fragments into a code buffer and jump to it,
> >> or so the story goes. Unfortunately, what I'm seeing in the
> >> generated code fragments is not fun:
> >>
> >> 1. GCC sometimes, but NOT always, inserts jumps to the next
> >>    fragment. E.g.:
> >>
> >> ----------------------------------------------
> >>
> >> .L46:
> >>         .loc 2 34 0
> >>         movq -264(%rbp), %rax
> >>         movq %rax, -40(%rbp)
> >> .L47:
> >> .L7:
> >>         .loc 2 40 0
> >>         movl %r8d, %eax
> >>         jmp .L48
> >> .L6:
> >> .L48:
> >>         .loc 2 43 0
> >>         movl %r11d, %ecx
> >> .L49:
> >> .L50:
> >> ----------------------------------------------
> >>
> >> Note the jmp .L48. If GCC always inserted a jump, I could
> >> remove it, or if it never inserted the jump, I'd be even
> >> happier, but it only does it now and then. I tried adding
> >> my own jumps to force this:
> >>
> >>   topLoad2: reg2 = y[i];
> >>             goto botLoad2;
> >>   botLoad2:
> >>
> >> but GCC removed them.
> >> And inserted others.
> >>
> >> Today, I'm using these compiler options:
> >>
> >>   gcc -O0 -ggdb -mtune=opteron -fno-align-labels -fno-align-jumps
> >>
> >> So, I welcome suggestions on how to solve or work around these
> >> problems. Or even a completely different approach.
> >>
> >> Thanks,
> >> Robert
> >>
> >

-- 
Av8 Internet   Prepared to pay a premium for better service?
www.av8.net         faster, more reliable, better service
617 256 5494