Re: Expert help wanted with inline assembly

Weddington, Eric wrote:
Hi All,

My questions come from a thread on the avr-libc-dev mailing list:
<http://lists.nongnu.org/archive/html/avr-libc-dev/2009-02/msg00007.html>

For the AVR target, I need to create a macro that generates inline assembly, for the purpose of creating a specific, timed sequence of instructions that operates on a certain I/O register in the AVR to disable the brown-out detector before going into a sleep mode. This sequence is defined in the datasheets for specific AVR devices.

The I/O register that is operated on is available via the memory map. The sequence is fairly simple and is essentially: read-modify-write-modify-write. Pseudo code assembly would look something like:

in X, 53
ori  X, 96
out  53, X
andi X, -33
out  53, X
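
To make the two constants concrete, here is a small host-side model of the sequence on a plain variable. It is only a sketch: the bit positions 6 and 5 are inferred from the constants 96 and -33 above and should be checked against the datasheet, and BV and bod_disable_sequence are illustrative names, not part of avr-libc.

```c
#include <stdint.h>

/* Bit positions inferred from the constants in the pseudo code
   (96 = 0x60, -33 = ~0x20); an assumption, verify per device. */
#define BODS_BIT   6
#define BODSE_BIT  5
#define BV(bit)    (1u << (bit))   /* local stand-in for avr-libc's _BV() */

/* Models read-modify-write-modify-write on a plain value.
   first_write and second_write receive the two values that would be
   written to the I/O register. */
static void bod_disable_sequence(uint8_t reg,
                                 uint8_t *first_write,
                                 uint8_t *second_write)
{
    uint8_t x = reg;                               /* in   X, 53  */
    x |= (uint8_t)(BV(BODS_BIT) | BV(BODSE_BIT));  /* ori  X, 96  */
    *first_write = x;                              /* out  53, X  */
    x &= (uint8_t)~BV(BODSE_BIT);                  /* andi X, -33 */
    *second_write = x;                             /* out  53, X  */
}
```

The first write sets both bits and the second clears only BODSE, so the other bits of the register pass through unchanged.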

"X" would be some scratch register; I don't really care what register it is. The ORI and ANDI instructions both have a constraint that they can only work on R16-R31, which is symbolized by the "d" constraint for the AVR. The "53" happens to be the address of the I/O register in question.

I came up with this interesting (to me, at least) use of inline assembly:

#define sleep_bod_disable()  \
do { \
    __asm__ __volatile__ (   \
        "ori  %0,%1" \
        : "+d" (MCUCR) \
        : "i" (_BV(BODS) | _BV(BODSE)) \
    ); \
    __asm__ __volatile__ (   \
        "andi  %0,%1" \
        : "=d" (MCUCR) \
        : "i" (~_BV(BODSE)) \
    ); \
} while(0)

'MCUCR' is the I/O register definition, the BODS/BODSE stuff are just the bit masks required for the sequence.

First off, we can guarantee that global interrupts will be disabled when this macro is being called. The macro doesn't enforce this per se, but this macro is supposed to be used in a larger sequence where it is required that global interrupts are disabled.

The macro above correctly generates the required sequence of read-modify-write-modify-write. I was pleased at the fact that the scratch register used was correct (the same register) across both __asm__ statements. My intent was to let gcc pick the register and generate the IN and OUT instructions for me, by just using the required constraints for MCUCR with the ORI and ANDI instructions.

My concern and question:
Do I have to worry that GCC will somehow select a different register for MCUCR in the second __asm__ statement, so that it doesn't match the register in the first __asm__ statement? I would think that since I am letting gcc select the register in the first __asm__ statement and having gcc do the output, gcc will know which register MCUCR lives in and can match that with the second __asm__ statement. But I am unsure whether this can be guaranteed. No other user code will go in between the two __asm__ statements, as these are contained within the single macro.

Certainly there are other ways of writing this, especially where the IN/OUT instructions are explicit in the inline assembly. But there is an advantage for writing it this way. If there will be future AVR devices where the MCUCR register is located at a different, higher address (this type of situation has happened before), then different instructions have to be used to read/write the I/O register (LD/ST). With the inline assembly implementation above, when the MCUCR register is at a higher address, gcc correctly generates the different instructions for the read/write access. If I have to change the inline assembly to explicitly write the IN/OUT instructions, then I lose this advantage.
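
The address dependence can be illustrated with a small host-side check. This is a sketch assuming the classic AVR layout, where IN/OUT reach I/O addresses 0 to 63 and the same registers appear in data space 0x20 higher; SFR_OFFSET, reachable_by_in_out and io_address are illustrative names, not avr-libc API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Classic AVR layout (assumption): data address = I/O address + 0x20. */
#define SFR_OFFSET 0x20u

/* True if a register at this data-space address can be accessed with
   IN/OUT, i.e. its I/O address falls in 0..63. */
static bool reachable_by_in_out(uint16_t data_addr)
{
    return data_addr >= SFR_OFFSET && data_addr < SFR_OFFSET + 0x40u;
}

/* Converts a data-space address to the I/O address used by IN/OUT.
   Only meaningful when reachable_by_in_out() is true. */
static uint8_t io_address(uint16_t data_addr)
{
    return (uint8_t)(data_addr - SFR_OFFSET);
}
```

With these assumptions, I/O address 53 corresponds to data address 0x55 and is reachable by IN/OUT, while a register at data address 0x60 or above would need load/store instructions.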

Thoughts?

Hi Eric.

I would not rely on that. Even though doing the reloads the way you prefer is optimal, gcc does not specify this behavior, and the register allocator (IRA or lreg/greg) may choose other GPRs.

The best approach is to write one monolithic asm, with the drawback that you will have to supply two flavours: one for IN/OUT and one for LDS/STS.
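
A minimal sketch of what the two flavours might look like, assuming avr-libc's <avr/io.h> and a device that defines BODS and BODSE. The macro names sleep_bod_disable_io and sleep_bod_disable_mem are hypothetical, and the non-AVR branch (with its fake_mcucr variable and assumed bit positions) is only a stand-in so the control flow can be exercised on a host. LDS/STS take more cycles than IN/OUT, so the timing of the second flavour would need to be rechecked against the datasheet.

```c
#include <stdint.h>

#if defined(__AVR__)
#include <avr/io.h>

/* Shared body: the whole sequence lives in one asm statement, so the
   scratch register cannot change between the two writes. */
#define SLEEP_BOD_ASM(load, store, constraint, addr_expr)        \
do {                                                             \
    uint8_t tmp;                                                 \
    __asm__ __volatile__ (                                       \
        load                             "\n\t"                  \
        "ori  %[tmp], %[set]"            "\n\t"                  \
        store                            "\n\t"                  \
        "andi %[tmp], %[keep]"           "\n\t"                  \
        store                                                    \
        : [tmp]  "=&d" (tmp)             /* needs r16..r31 */    \
        : [addr] constraint (addr_expr),                         \
          [set]  "i" (_BV(BODS) | _BV(BODSE)),                   \
          [keep] "i" (~_BV(BODSE))                               \
    );                                                           \
} while (0)

/* Flavour 1: MCUCR inside the IN/OUT range (I/O address 0..63). */
#define sleep_bod_disable_io()                                   \
    SLEEP_BOD_ASM("in   %[tmp], %[addr]", "out  %[addr], %[tmp]", \
                  "I", _SFR_IO_ADDR(MCUCR))

/* Flavour 2: MCUCR at a higher address, accessed through data space. */
#define sleep_bod_disable_mem()                                  \
    SLEEP_BOD_ASM("lds  %[tmp], %[addr]", "sts  %[addr], %[tmp]", \
                  "n", _SFR_MEM_ADDR(MCUCR))

#else  /* host stand-in, not real hardware access */

static volatile uint8_t fake_mcucr;  /* hypothetical substitute for MCUCR */

#define sleep_bod_disable_io()                                   \
do {                                                             \
    uint8_t tmp = fake_mcucr;                                    \
    tmp |= 0x60;            /* assumed BODS | BODSE */           \
    fake_mcucr = tmp;                                            \
    tmp &= (uint8_t)~0x20;  /* assumed ~BODSE */                 \
    fake_mcucr = tmp;                                            \
} while (0)

#define sleep_bod_disable_mem() sleep_bod_disable_io()

#endif
```

A device header would pick one of the two macros depending on where MCUCR lives, which is exactly the maintenance cost mentioned above.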

Second best could be to explicitly use constraint x instead of d and to clobber r27, i.e. the high part of the X register. That way the compiler has to reload the value into the only GPR remaining in class x, namely r26.

But I fear the second solution is not bulletproof, because r26 may get clobbered between the two asm statements.

Even though your version may work, I think that it is not correct, even if no C context or compiler switches can be found that make the problem visible.

Georg-Johann
