I wrote a class that switches the stack to a new area. This is for
the PowerPC.
In the text below, I'll use main, testit, and newStack. main is the
main program, testit is a function that main calls, and newStack is
the method that switches the stack to the new space. main calls
testit which calls s.newStack. (s is an instance of the class that
switches the stack).
The purpose of separating main and testit is so I can verify that
returning from testit works properly.
newStack gets the current value of r1 (the stack pointer) and copies
the last two stack frames (which would be the stack frame for testit
and newStack) to the top of some allocated memory. It alters r1(0)
(the previous stack value for newStack) in the new memory to point to
the address of testit's new stack frame. It sets r1 up to the base
of this new area and returns.
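To make the copy step concrete, here is a minimal portable model of it. It is only a sketch: it works on arrays of words instead of the real r1, and the name relocateFrames and its parameters are made up for illustration (the real newStack reads and writes r1 directly). It assumes the first word of each frame is the back chain, which is what I call r1(0) above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Portable model of the frame copy; it does not touch the real r1.
// A "frame" is a run of words whose first word is the back chain,
// i.e. the address of the caller's frame (r1(0) in the text).
//   sp     : models r1 on entry, pointing at newStack's frame
//   newTop : one past the highest word of the newly allocated area
// Returns the value that would be loaded into r1.
std::uintptr_t *relocateFrames(std::uintptr_t *sp, std::uintptr_t *newTop)
{
    // Follow the back chain twice: newStack -> testit -> main.
    std::uintptr_t *testitFrame =
        reinterpret_cast<std::uintptr_t *>(sp[0]);
    std::uintptr_t *mainFrame =
        reinterpret_cast<std::uintptr_t *>(testitFrame[0]);
    assert(sp < testitFrame && testitFrame < mainFrame); // stack grows down

    // The two frames to copy occupy [sp, mainFrame).
    std::size_t words = static_cast<std::size_t>(mainFrame - sp);
    std::uintptr_t *newSp = newTop - words;
    std::memcpy(newSp, sp, words * sizeof(std::uintptr_t));

    // Patch newStack's back chain to point at the copy of testit's frame.
    // testit's back chain is left alone, so it still points at main's
    // frame on the original stack.
    newSp[0] = reinterpret_cast<std::uintptr_t>(newSp + (testitFrame - sp));
    return newSp;
}
```

The copy of testit's frame keeps its original back chain, so a return from testit that reloads r1 from r1(0) lands back on main's frame in the old stack.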
With g++ and no optimization, this works. When newStack returns, it
consumes its stack frame in the new memory leaving only testit's new
stack frame and r1 pointing to the base of the new stack frame for
testit. When testit returns, it loads r1 with r1(0) and returns.
This properly puts r1 back to main's stack frame.
If I add -O3, then at the return of testit, instead of loading r1
with r1(0), the compiler just adds the size of the stack frame to r1
(and assumes that r1 has not been munged). I presume this is faster.
I know that xlc does the same thing. As a result, when we return to
main, the stack pointer is off in the weeds somewhere.
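The difference between the two returns can be modeled in a few lines. This is a sketch, not compiler output: epilogueLoad and epilogueAdd are hypothetical names, sp stands in for r1, and frameWords stands in for the frame size the compiler fixed at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Unoptimized style: reload the stack pointer from the back chain, r1(0).
// This follows whatever the back chain says, so it survives a moved frame.
std::uintptr_t *epilogueLoad(std::uintptr_t *sp)
{
    assert(sp != nullptr);
    return reinterpret_cast<std::uintptr_t *>(sp[0]);
}

// Optimized style: add the frame size to the current stack pointer.
// This is only right if sp is still where the prologue left it.
std::uintptr_t *epilogueAdd(std::uintptr_t *sp, std::ptrdiff_t frameWords)
{
    assert(sp != nullptr);
    return sp + frameWords;
}
```

Once testit's frame has been moved to the new area, the load still finds main's frame through the back chain, while the add lands just past the copied frame in the new area, which is not main's frame.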
I suspected that calling alloca would give the compiler a clue that
it could not do the add and had to load r1 from r1(0).
So, I wrote a macro:
#include <alloca.h>  /* for alloca */
#define doNewStack(s) \
    do { \
        void *notUsed = alloca(1); /* hint that r1 may change */ \
        (void)notUsed; \
        (s).newStack(); \
    } while (0)
testit is changed to call doNewStack(s); where it used to call
s.newStack(); (The purpose of the do while is so it is a single C
statement.)
This does as I hoped. It flags the compiler and tells it that r1 has
been munged. As a result, it loads r1 from r1(0) for the return of
testit and does not do the add immediate.
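For reference, this is roughly how the macro sits in testit. It is a sketch with a stand-in class: StackSwitcher here is hypothetical and its newStack only counts calls, it does not switch stacks. Whether a given compiler version actually emits the load from r1(0) should be checked in the generated assembly.

```cpp
#include <alloca.h>
#include <cassert>

// Hypothetical stand-in for the real class; this newStack() only
// counts calls and does not switch stacks.
struct StackSwitcher {
    int calls = 0;
    void newStack() { ++calls; }
};

#define doNewStack(s) \
    do { \
        void *notUsed = alloca(1); \
        (void)notUsed; \
        (s).newStack(); \
    } while (0)

// testit calls the macro where it used to call s.newStack() directly.
int testit(StackSwitcher &s)
{
    doNewStack(s);
    assert(s.calls > 0);
    return s.calls;
}
```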
I would like to do the same thing without using a macro: give
newStack an attribute that tells the compiler that r1 has been
munged. I looked and did not see any attribute that looks like it
applied but I thought I would ask and see.
The other danger I am worried about is if testit is inlined. It
seems like that could hose me up as well, but I'm not sure.
I wrote my class and put the implementation of newStack in a separate
file so it cannot be inlined. I suppose a completely different
approach would be to move the implementation of newStack back into
the class definition and give it the inline attribute so it is always
inlined. Then change it to only copy one stack frame. It could do
the alloca too so that the compiler would load r1 with r1(0) rather
than doing the add. Hmmmm...
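A rough sketch of that alternative, using GCC's always_inline function attribute. The class body is hypothetical: the frame copy is elided and replaced by a flag, and I have not confirmed that the inlined alloca makes the caller reload r1 from r1(0), so the assembly would still need checking.

```cpp
#include <alloca.h>
#include <cassert>

// Hypothetical sketch; the frame copy itself is elided.
class StackSwitcher {
public:
    // always_inline is a GCC function attribute that asks the compiler
    // to inline this body into every caller, even without optimization.
    __attribute__((always_inline)) inline void newStack()
    {
        void *notUsed = alloca(1); // same hint as the macro, now in the caller
        (void)notUsed;
        switched = true;           // placeholder: copy one frame, set r1
    }
    bool switched = false;
};

int testit()
{
    StackSwitcher s;
    s.newStack(); // expands in place, so only testit's frame is involved
    assert(s.switched);
    return s.switched ? 1 : 0;
}
```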
I'm getting the feeling that I am reinventing the wheel here.
Any thoughts or help here would be great!
Thank you for your help,
Perry Smith ( pedz@xxxxxxxxxxxxxxxx )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems