Re: inline asm: How to push PIC reg before seven input operands get loaded?

On 24/03/07 18:16:51, Andrew Haley wrote:
>  > If I do eg:
>  > 
>  > Pixel16 *tileData[4]; uint16 *pDestPixel; Pixel16 *dataPtr, *transDataPtr;
>  > sint32 startX, endX;
>  > __asm__ __volatile__ (
>  > "push %%ebx     \n\t"  // save reg for PIC!
>  > //code here using %1 but now reg can be reused:
>  > 
>  > "movl %5, %1            \n\t" //no reg left for %5 when gcc should handle it
>  >                               //but now gcc says: 
>  >                               //error: memory input 5 is not directly addressable
>  > "movl (%1,%%ebx,4), %1 \n\t"  //instead of "movl (%5,%%ebx,4), %1 \n\t" 
>  >                               //because m for %5 didn't work here
>  > 
>  > //more code using also ebx and ebp
>  > "pop %%ebx     \n\t" // restore reg for PIC!
>  > :
>  > : "D" (pDestPixel), "d" (endX), "c" (startX), "a" (transDataPtr), "S" (dataPtr), "m" (tileData)
> 
>  > : "cc" //don't tell gcc about what was done to ebx because of PIC!!!
>  > );
>  > 
>  > gcc complains about: error: memory input 5 is not directly addressable
> 
>  > Why is that? How else should I load the tileData pointer from
>  > memory into a register?
> 
> I wouldn't be using the "m" constraint for this.  Pass in the pointer
> as an operand using constraint "r" or "g".  Something like "g"(&tileData);
> 
> Can you post some small sample code?  Make it the shortest possible
> programs that shows your problem.  Make sure all the types it needs
> are defined.

This is a problem. My task is to translate this inline asm, written for MSC,
to gcc. It's from a big game (Call to Power II, CTP2). I've attached a file
with the function that contains the code, but it won't compile without a lot
of modifications outside the source tree.

At L131 my translation starts. It has the constraints I got it to compile
with, but it segfaults then. I noticed that using any other variable than
tileData works with "m". Could it be a problem for gcc to use the double
pointer here:
Pixel16 **tileData ?
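
For reference, this is how I understand the suggestion to pass the array's
address in a register instead of using "m". It is only a minimal standalone
sketch, not the CTP2 code: load_tile_ptr and PTR_SCALE are names made up for
the example, Pixel16 is assumed to be a 16-bit type, and the "memory" clobber
is my assumption about how to tell gcc that the asm reads the array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t Pixel16;  // assumption: 16-bit pixels, as in the thread

// The index scale must match the pointer size of the target.
#if defined(__x86_64__)
#define PTR_SCALE "8"   // 8-byte pointers on x86-64
#elif defined(__i386__)
#define PTR_SCALE "4"   // 4-byte pointers on 32-bit x86
#endif

// Hypothetical helper: returns tiles[index], fetched from inside the asm.
// The array address is passed with "r", so gcc chooses the base register.
static Pixel16 *load_tile_ptr(Pixel16 **tiles, std::size_t index) {
    Pixel16 *result;
#if defined(PTR_SCALE)
    __asm__ (
        "mov (%[base],%[idx]," PTR_SCALE "), %[out]"
        : [out] "=r"(result)
        : [base] "r"(tiles), [idx] "r"(index)
        : "memory");            // the asm reads memory gcc cannot see otherwise
#else
    result = tiles[index];      // portable fallback for other targets
#endif
    return result;
}
```

Called as load_tile_ptr(tileData, 2) this should hand back tileData[2]; no
register is named in the asm, so nothing is pinned to eax or ebx.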

>
>  > >  > Also I wonder which -f option that comes with -O2 makes gcc stop reporting:
>  > >  > error: can't find a register in class 'LEGACY_REGS' while reloading 'asm'
>  > >
>  > > I guess this is a joke.
>  >
>  > No, it isn't. I wouldn't have asked else wise...
>
> Reload has run out of registers.  It's telling you that it needs
> another one.  A -f option is not going to give reload another
> register.

The point is, if I just add -O2 to the g++ command, this error disappears
without my having changed anything in the code or the number of registers
that part uses.
It seems that an option in -O2 or above lets g++ cope with the registers I
have left it (ebx, ebp and esp). I found this solution quite often on the
net.

>
>  > Is it the combination of all -f options in -O2? I couldn't find a
>  > single obvious one in the manpage.  Is it a rule then to include
>  > -O2 or higher if inline asm is used???
>
> No, there's no such rule.  You've run out of registers.
>
> gcc needs one or two spare registers to work with.  The x86 has eight,
> some of which are used for special purposes.  You are using five in
> your asm.  This is a very tight situation.

Compiling all this with -fomit-frame-pointer:

Can gcc not push all registers and give me all seven (everything except esp)?

I also tried to use "R" instead of "m" for tileData. It compiled without
errors even when I had specified -fPIC. -fPIC seems to use ebx, so I pushed
and popped that in my asm code. But when looking at the disassembly I saw
that gcc used ebx for tileData and not ebp. And if I specify "b" it complains
about:
error: PIC register %ebx clobbered in asm

Is this a bug in gcc??? Is there a constraint just for ebp, so I don't have to
use "R" and hope it gets into ebp?
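
A small sketch of the style that names no register at all: every operand uses
"r" or "+r", so gcc picks whatever is free, and the loop label is a numeric
local label (1: / 1b) instead of a fixed name like .L0. sum_pixels is a
made-up example and not part of tiledraw.cpp; on non-x86 targets it falls
back to plain C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example: adds up n 16-bit values. n must be at least 1,
// because the asm loop body runs before the counter is tested.
static std::uint32_t sum_pixels(const std::uint16_t *src, std::size_t n) {
    std::uint32_t sum = 0;
#if defined(__i386__) || defined(__x86_64__)
    std::uint32_t tmp;
    __asm__ (
        "1:                          \n\t"  // numeric local label
        "movzwl (%[src]), %[tmp]     \n\t"  // tmp = *src, zero-extended
        "addl   %[tmp], %[sum]       \n\t"  // sum += tmp
        "add    $2, %[src]           \n\t"  // advance to the next 16-bit value
        "dec    %[n]                 \n\t"  // one value done
        "jnz    1b                   \n\t"  // back to the nearest "1:" above
        : [sum] "+r"(sum), [src] "+r"(src), [n] "+r"(n), [tmp] "=&r"(tmp)
        :
        : "cc", "memory");
#else
    for (std::size_t i = 0; i < n; ++i)     // portable fallback
        sum += src[i];
#endif
    return sum;
}
```

The "+r" operands are read and written by the asm, so no separate matching
input such as "0" (transDataPtr) is needed, and the same label number can be
used again in another asm statement without clashing.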

>
> Andrew.

Thanks Andrew
Lynx

compiled with:

g++ -DHAVE_CONFIG_H -I. -I. -I../../../ctp2_code/os/include -I../../../ctp2_code/os/nowin32 -I../../../ctp2_code -I../../.. -I../../../ctp2_code/os/include -I../../../ctp2_code/ctp -I../../../ctp2_code/ctp/ctp2_utils -I../../../ctp2_code/ctp/ctp2_rsrc -I../../../ctp2_code/ctp/debugtools -I../../../ctp2_code/ui/interface -I../../../ctp2_code/ui/netshell -I../../../ctp2_code/robot/utility -I../../../ctp2_code/robot/pathing -I../../../ctp2_code/robot/aibackdoor -I../../../ctp2_code/gfx/spritesys -I../../../ctp2_code/gfx/tilesys -I../../../ctp2_code/gfx/gfx_utils -I../../../ctp2_code/gs/database -I../../../ctp2_code/gs/fileio -I../../../ctp2_code/gs/gameobj -I../../../ctp2_code/gs/utility -I../../../ctp2_code/gs/world -I../../../ctp2_code/net/io -I../../../ctp2_code/net/general -I../../../ctp2_code/ui/aui_utils -I../../../ctp2_code/ui/aui_sdl -I../../../ctp2_code/ui/aui_directx -I../../../ctp2_code/ui/aui_ctp2 -I../../../ctp2_code/ui/aui_common -I../../../ctp2_code/libs/anet/h -I../../../ctp2_code/mm -I../../../ctp2_code/robotcom/backdoor -I../../../ctp2_code/gs/slic -I../../../ctp2_code/gs/slic -I../../../ctp2_code/gfx/layers -I../../../ctp2_code/mapgen -I../../../ctp2_code/ui/freetype -I../../../ctp2_code/sound -I../../../ctp2_code/GameWatch/gamewatch -I../../../ctp2_code/GameWatch/gwciv -I../../../ctp2_code/ctp/fingerprint -I../../../ctp2_code/ui/slic_debug -I../../../ctp2_code/gs/outcom -I../../../ctp2_code/ctp -I../../../ctp2_code/ui/aui_common -I../../../ctp2_code/ui/ldl -I../../../ctp2_code/ui/ldl -I../../../ctp2_code/gs/events -I../../../ctp2_code/gs/newdb -I../../../ctp2_code/gs/newdb -I../../../ctp2_code/ai/diplomacy -I../../../ctp2_code/ai/mapanalysis -I../../../ctp2_code/ai/strategy/scheduler -I../../../ctp2_code/ai/strategy/agents -I../../../ctp2_code/ai/strategy/goals -I../../../ctp2_code/ai/strategy/squads -I../../../ctp2_code/ai -I../../../ctp2_code/ai/CityManagement -I/usr/local/include/SDL                           -D_GNU_SOURCE=1 -D_REENTRANT 
-I/usr/X11R6/include -Wall -Wno-unused-variable -fms-extensions -fmessage-length=0 -frtti -fexceptions -g -O2 -fomit-frame-pointer -MT tiledraw.lo -MD -MP -MF .deps/tiledraw.Tpo -c tiledraw.cpp  -fPIC -DPIC -fleading-underscore -o .libs/tiledraw.o



void TiledMap::DrawTransitionTileScaled(aui_Surface *surface, const MapPoint &pos, sint32 xpos, sint32 ypos, sint32 destWidth, sint32 destHeight) {
    Pixel16		*dataPtr;
    sint32		x, y;
    sint32		startX, endX;

    TileInfo	*tileInfo;
    BaseTile	*baseTile, *transitionBuffer;
    uint16		index;
    Pixel16     *transData, *transDataPtr;
    static Pixel16 defaultPixel[4] = {0xf800, 0x07e0, 0x001f, 0xf81f};

    if (!surface) surface = m_surface;

    ypos+=k_TILE_PIXEL_HEADROOM;

    
    if (xpos < 0) 
        return;
    if (xpos > surface->Width() - k_TILE_PIXEL_WIDTH) 
        return;
    if (ypos < 0) 
        return;
    if (ypos > surface->Height() - k_TILE_PIXEL_HEIGHT) 
        return;

    tileInfo = GetTileInfo(pos);
    Assert(tileInfo != NULL);
    if (tileInfo == NULL) 
        return;

    index = tileInfo->GetTileNum();

    baseTile = m_tileSet->GetBaseTile(index);
    if (baseTile == NULL) 
        return;



    Pixel16 *data = baseTile->GetTileData();

    Pixel16	*tileData[4];

    sint32 tilesetIndex = g_theTerrainDB->Get(tileInfo->GetTerrainType())->GetTilesetIndex();

	
    uint16 tilesetIndex_short = (uint16) tilesetIndex;

#ifdef _DEBUG
    Assert(tilesetIndex == ((sint32) tilesetIndex_short));
#endif

    tileData[0] = m_tileSet->GetTransitionData(tilesetIndex_short, tileInfo->GetTransition(0), 0);
    tileData[1] = m_tileSet->GetTransitionData(tilesetIndex_short, tileInfo->GetTransition(1), 1);
    tileData[2] = m_tileSet->GetTransitionData(tilesetIndex_short, tileInfo->GetTransition(2), 2);
    tileData[3] = m_tileSet->GetTransitionData(tilesetIndex_short, tileInfo->GetTransition(3), 3);
	
    transitionBuffer = m_tileSet->GetBaseTile(static_cast<uint16>((tilesetIndex * 100) + 99));
    if(transitionBuffer) {
        transData = transitionBuffer->GetTileData();
        transDataPtr = transData;
    } else {
        transData = NULL;
        transDataPtr = NULL;
    }

    dataPtr = data;

    uint8 *pSurfBase;


    pSurfBase = m_surfBase;
    sint32 surfWidth = m_surfWidth;
    sint32 surfHeight = m_surfHeight;
    sint32 surfPitch = m_surfPitch;

    Pixel16 srcPixel, transPixel = 0;

    uint16 *pDestPixel = (Pixel16 *)(pSurfBase + ypos * surfPitch + 2 * xpos);
	{
        for (y=0; y<k_TILE_PIXEL_HEIGHT; y++) {
            if (y<=23) {
                startX = (23-y)*2;
                endX = k_TILE_PIXEL_WIDTH - startX;
            } else {
                startX = (y-24)*2;
                endX = k_TILE_PIXEL_WIDTH - startX;
            }
            if (transDataPtr)
                {
#ifdef _MSC_VER             //use this if __asm__ is used
                _asm {
                    mov edx, endX
                        mov edi, pDestPixel
                        mov esi, dataPtr
                        mov ecx, startX
                        lea edi, [edi + 2*edx]
                        sub ecx, edx
                        mov ebx, transDataPtr
				
                        L0:
                    mov dx, [esi]
                        add esi, 2
                        xor eax, eax
                        mov ax, dx
                        cmp eax, 4
                        jge L1
                        mov edx, tileData[4*eax]
                        test edx, edx
                        jz L2
                        add	edx, 2
                        mov tileData[4*eax], edx
                        mov dx, [edx-2]
                        jmp L1
                        L2:
                    mov dx, [ebx]
                        L1:
                    add ebx, 2
                        mov [edi + 2*ecx], dx
                        inc ecx
                        jnz L0
                        mov transDataPtr, ebx
                        mov dataPtr, esi
                        }
#else      
                __asm__ (
//                                "movl $endX, %edx            \n\t" //done by gcc!
//                                "movl $pDestPixel, %edi      \n\t" //done by gcc!
//                                "movl $dataPtr, %esi         \n\t" //done by gcc!
//                                "movl $startX, %ecx          \n\t" //done by gcc!
                    "leal (%2,%3,2),%2     \n\t"//load value %0 + %1 * s is pointing to
                    "subl %3,%4              \n\t" //%1 is now reuseable
//                                "movl $transDataPtr, %ebx    \n\t" //done by gcc!

                    ".L0:                        \n\t"
                    "movw (%1),%%dx             \n\t" //reusing edx (%3)
                    "addl $2,%1                \n\t"
//                                    "movl %%eax, %4                  \n\t"
                    "movl %0, %%ebp                  \n\t"
                    "pushl %%ebx     \n\t"  // save reg for PIC!
                    "xorl %%ebx,%%ebx              \n\t" //make %eax = 0, see below, we use ebx
                    "movw %%dx,%%bx            \n\t" //because of this ebx has to be static
                    "cmpl $4,%%ebx             \n\t" //because of above line eax may be != 0
                    "jge .L1                     \n\t"
                    "movl (%%eax,%%ebx,4), %3 \n\t"  // ** tileData can't be passed as "m"!
                    "testl %3,%3             \n\t"  //check if %3 is zero, set Z-bit
                    "jz .L2                      \n\t"
                    "addl $2,%3            \n\t"
                    "movl %3, (%%eax,%%ebx,4) \n\t" // (%eax + %ebx * 4) = %3
                    "movw -2(%3),%%dx           \n\t"
                    "jmp .L1                     \n\t"
                    ".L2:                        \n\t"
                    "movw (%%ebp),%%dx             \n\t"//movw (%3),%%dx
                    ".L1:                        \n\t"
                    "addl $2,%%ebp                \n\t"
                    "movw %%dx,(%2,%4,2)      \n\t"
                    "incl %4                   \n\t"
                    "jnz .L0                     \n\t"
                    "popl %%ebx     \n\t" // restore reg for PIC!
                    "movl %%ebp, %0    \n\t"
//                                        "movl %esi, $dataPtr         \n\t" //does gcc
                    : "=m" (transDataPtr), "=S" (dataPtr) // "=D" (pDestPixel)
                    : "D" (pDestPixel), "d" (endX), "c" (startX), "0" (transDataPtr), "1" (dataPtr), "a" (tileData)
                    : "%ebp", "cc" //don't tell gcc about what was done to ebx!!!
                    );


#endif             //use this if __asm__ is used
                }
            else
                {
                for (x = startX; x<endX; x++) 
                    {
                    srcPixel = *dataPtr++;
                    if (srcPixel < 4)
                        {
                        Pixel16 *tile = tileData[srcPixel];
                        if (tile != NULL)
                            {
                            tileData[srcPixel]++;
                            srcPixel = *tile;
                            }
                        else
                            {
                            srcPixel = defaultPixel[srcPixel];
                            }
                        }
                    pDestPixel[x] = srcPixel;
                    }
                }
            pDestPixel += (surfPitch>>1);
            }
	}
    }
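
For checking a translation, the _MSC_VER block above can also be written out
in plain C++. This is my reading of the asm, as a sketch for one scanline;
DrawTransitionSpan is a made-up helper name, and the typedefs for Pixel16 and
sint32 are assumptions about the CTP2 types. One difference: the asm loop
always runs at least once, so the two only match when startX < endX.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t Pixel16;   // assumption: Pixel16 is a 16-bit pixel
typedef std::int32_t  sint32;    // assumption: sint32 is a 32-bit signed int

// Hypothetical helper mirroring one scanline of the _asm block.
// dataPtr and transDataPtr are advanced through the references,
// like the asm writes esi and ebx back at the end.
static void DrawTransitionSpan(Pixel16 *pDestPixel, sint32 startX, sint32 endX,
                               Pixel16 *&dataPtr, Pixel16 *&transDataPtr,
                               Pixel16 *tileData[4]) {
    for (sint32 x = startX; x < endX; ++x) {
        Pixel16 srcPixel = *dataPtr++;
        if (srcPixel < 4) {                      // values 0..3 select a transition
            Pixel16 *tile = tileData[srcPixel];
            if (tile != NULL) {
                tileData[srcPixel] = tile + 1;   // consume one transition pixel
                srcPixel = *tile;
            } else {
                srcPixel = *transDataPtr;        // fall back to the transition buffer
            }
        }
        ++transDataPtr;                          // advanced for every pixel (label L1)
        pDestPixel[x] = srcPixel;
    }
}
```

Running both versions on the same scanline and comparing the destination
pixels and the final pointer values is one way to see where a translated asm
block goes wrong.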
