On Tue, 20 Jan 2015, Matthew Fortune wrote: > > > What this shows really is a GAS bug fix for the SUB macro is needed > > > similar to what I suggested in 12/70 for ADDI (from the situation I > > infer > > > there is some real work to do in GAS in this area; adding Matthew as a > > > recipient to raise his awareness) so that it does not expand to ADDI > > where > > > the architecture or processor selected do not support it. Instead a > > > longer sequence involving SUB has to be produced. > > The assembler is at least consistent at the moment as the 'sub' macro is > disabled for R6. I am very keen to stop carrying around historic baggage > where it is not necessary. R6 is one place we can do that and deal with > any code changes that are required. I have yet to be convinced it is merely historic baggage. Maybe it's a matter of habits I got into, but I find the presence of these macros a way to make the MIPS assembly language actually usable for handcoding. There are several reasons for this. One is the limited range of immediates in machine makes it necessary to use different instruction sequences for different immediate input arguments. Given this source code instruction: li $2, foo for different values of `foo' you'll get different machine code: foo code 0x1234 addiu $2, $0, 0x1234 0x89ab ori $2, $0, 0x89ab 0x89ab0000 lui $2, 0x89ab 0x89ab1234 lui $2, 0x89ab; addiu $2, $2, 0x1234 now if `foo' is some sort of an externally supplied constant (e.g. set with a `configure' script or whatever), then without the macros you'd have to pessimise code, or clutter it with #ifdef's. Another is to abstract ABI dependencies. Again, given this source code instruction: lw $2, foo for different ABIs you'll get different code: ABI code o32/non-PIC lui $2, %hi(foo); lw $2, %lo(foo)($2) o32/PIC/extern lw $2, %got(foo)($28); lw $2, 0($2) o32/PIC/local lw $2, %got(foo)($28); addiu $2, %lo(foo); lw $2, 0($2) n64/non-PIC lui $1, %highest(foo); lui $2, %hi(foo); addiu $1, $1, %higher(foo); dsll32 $1, $1, 0; daddu $1, $1, $2; lw $2, %lo(foo)($1) n64/PIC/extern ld $2, %got_disp(foo)($28); lw $2, 0($2) [...] You'd have to conditionalise it all too. And there are more cases macros address, e.g. to make the complete set of arithmetic conditions available for branches (with the use of SLT and SLTU instructions), extra operations (e.g. NOT as a shorthand for NOR), three-argument trapping MULOU, DIVU, REMU operations (especially interesting to note in the context of r6; why MODU wasn't consequently called REMU for portability escapes me), etc. All this makes assembly language programming easier and more like with CISC assembly languages, e.g. this x86 assembly-language instruction: addl $foo, %eax will do the right thing for any value of `foo' and the assembler will also pick the shortest instruction encoding available. As a result when writing code you can focus on the problem you're trying to solve rather than getting distracted by ABI peculiarites or the assymetry of the machine instruction set. It is also easier to follow when studying code written by someone else. Of course all this does not matter for compiler-generated code. Which is also the reason why the MIPS16 assembly language has never included a complementing set of these macros -- it was only meant to be used in compiler-generated code and never for handcoding. And for handcoded assembly if you are concerned about source code instructions expanding into multiple machine instructions, then you can always stick `.set nomacro' at the top of your source code. > > > __asm__ __volatile__( > > > "1: ll %1, %2 # arch_read_unlock \n" > > > " sub %1, %3 \n" > > > " sc %1, %0 \n" > > > : "=" GCC_OFF12_ASM() (rw->lock), "=&r" (tmp) > > > : GCC_OFF12_ASM() (rw->lock), GCC_ADDI_ASM() (1) > > > : "memory"); > > > > > > (untested, but should work) so that there's still a single instruction > > > only in the LL/SC loop and consequently no increased lock contention > > risk. [...] > > (Note this asm block does not appear to need to clobber memory either as > the effects on memory are correctly stated in the constraints). The `memory' clobber serves the purpose of an optimisation barrier here, it's not about the memory accesses happening within the asm itself. > > > As a side note, this could be cleaned up to use a "+" input/output > > > constraint; such a clean-up will be welcome -- although to be complete, > > a > > > review of all the asms will be required (this may bump up the GCC > > version > > > requirement though, ISTR bugs in this area). > > I believe some of these asm blocks using ll/sc already have '+' in the > constraints for the memory location so perhaps that is either already > a problem or not an issue. I just don't remember offhand if the use of `+' was in platform or in shared code. If the latter, then let's just switch, if the former, we need to be careful. IIRC some versions of GCC complained and failed compilation if the list of constraints associated with `+' did not allow a register alternative, such by including the `r' constraint. Which of course would be completely pointless here, and actually harmful. Furthermore IIRC it had been a deliberate decision made by GCC maintainers who were unaware of some use cases for inline asms. The decision was then discussed and GCC maintainers persuaded to change it; it can likely be tracked down in a mailing list archive somewhere. Maciej