Is 4 byte -1 invalid code on most/all architectures?


 



Hi. This is off topic, but I know there is expertise here and I'm a little lazy.


We have a system where we need to read the code/data
a function pointer points to, and if it contains
a particular marker, treat it differently.


The marker we use is a pointer-sized -1.
On some targets, we first check the alignment of the function pointer.
That is, on targets with fixed-size 4-byte instructions and 8-byte pointers,
where alignment is enforced, if the pointer is only 4-byte aligned, we skip
the read and assume the pointer is to "regular code".
Examples are ppc64, sparc64, hppa64 and mips64, if I recall correctly, which
all have 4-byte instructions and require 8-byte reads to be 8-byte aligned.
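
A minimal sketch of the check described above, in C. The names
points_at_marker and CLOSURE_MARKER are made up for illustration; they are
not the names used in the real system, and the real code may read the word
directly rather than through memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name: a pointer-sized word with all bits set. */
#define CLOSURE_MARKER ((intptr_t)-1)

/* Returns 1 if p points at the marker, 0 if p is treated as regular code.
 * check_alignment is nonzero on targets where the alignment test is used. */
static int points_at_marker(const void *p, int check_alignment)
{
    intptr_t word;

    assert(p != NULL);

    /* A pointer that is not pointer-aligned cannot be a marker record,
     * so skip the read entirely and call it regular code. */
    if (check_alignment && ((uintptr_t)p % sizeof(intptr_t)) != 0)
        return 0;

    /* memcpy keeps the read well-defined whatever the alignment of p. */
    memcpy(&word, p, sizeof word);
    return word == CLOSURE_MARKER;
}
```

Dropping the alignment test would amount to always passing 0 for
check_alignment and comparing only 4 bytes.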


My question is two part:
We target a moderate list of processors:
tier 1: sparc, x86, powerpc
tier 2: alpha, mips
possible future: ia64, hppa


Is -1 invalid code on all of these?
As 4 bytes? As 8 bytes?


I'm somewhat keen on removing the alignment check and just always checking 4 bytes.
The less target-dependent code, the better.
Or I could make the marker size and contents both target-specific, but set them
to a 4 byte -1 and probably forget about it.


I'm most leery of IA64, which I know has fixed-size 128-bit instruction bundles;
instructions are 41 bits each.


On Windows I can easily enough test in a debugger poking bytes into memory
and disassembling:
cdb cmd
0:000> eb . ff ff ff ff ff ff ff
0:000> u .
00000000`770b1220 ff ???
00000000`770b1221 ff ???
00000000`770b1222 ff ???
00000000`770b1223 ff ???
00000000`770b1224 ff ???
00000000`770b1225 ff ???


My *nix skills are not so advanced.


I tried:
int a = -1;
int main() { }


and then tried to disassemble a in gdb, but it won't since a isn't a function.
I'll try other ways later.
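
One possible way to do this, sketched under the assumption of a GNU
toolchain: gdb's x/i command decodes memory as instructions whether or not
the symbol is a function, and objdump -D disassembles data sections as well
as code. The file name marker.c and the array marker8 are made up for
illustration.

```c
/* marker.c: data for a disassembler to decode as if it were code.
 *
 *   cc -g -c marker.c
 *   objdump -D -j .data marker.o     (-D also disassembles data sections)
 *
 * or, in gdb on a program linked with this file:
 *
 *   (gdb) x/2i &a
 */
#include <assert.h>

int a = -1;                                   /* the 4-byte case */
unsigned char marker8[8] = { 0xff, 0xff, 0xff, 0xff,
                             0xff, 0xff, 0xff, 0xff };  /* the 8-byte case */

static_assert(sizeof a == 4, "this sketch expects a 4-byte int");
```

What the disassembler prints for these bytes depends on the target, so this
has to be repeated with each target's toolchain.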


I'm not the inventor of this mechanism and I was somewhat aghast to find it,
due to its dependency on -1 being invalid code.
However upon much further thought, it is a tough problem and this doesn't
seem like a terrible compromise.


This is related to implementation of "nested functions", and this mechanism
allows us to avoid runtime code generation (i.e. codegen on the stack).


There are major advantages and disadvantages either way.


We do control all the calls of the function pointers, so we don't have to
be C-compatible. This saves us from e.g. having to mprotect on OpenBSD
or whatever various platforms require, including there being no option
as far as I know for e.g. iPhone (not that we are a mainline or even in-use
development tool for iPhone...)


The -1 is followed by a frame pointer and a function pointer.
So we either jump to the pointer itself, or load the frame pointer and jump to
the other pointer.
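
A rough sketch of that layout and the call-site dispatch, in C. All of the
names and the function signatures are invented for illustration, and passing
the frame as an explicit first argument is an assumption about the calling
convention. It also assumes that a function pointer is the address of
readable code, which is platform-dependent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int (*plain_fn)(int arg);                /* hypothetical signature */
typedef int (*nested_fn)(void *frame, int arg);  /* hypothetical signature */

/* The record a "function pointer" may really point at. */
struct closure {
    intptr_t  marker;  /* always -1 */
    void     *frame;   /* frame pointer of the enclosing function */
    nested_fn fn;      /* the nested function's code */
};

/* Every call site goes through something like this. */
static int call_maybe_closure(const void *p, int arg)
{
    intptr_t word;

    memcpy(&word, p, sizeof word);  /* read what p points to */
    if (word == (intptr_t)-1) {
        /* Marker found: load the frame and call the other pointer. */
        const struct closure *c = (const struct closure *)p;
        return c->fn(c->frame, arg);
    }
    /* No marker: p is regular code, call it directly. */
    return ((plain_fn)p)(arg);
}

/* Example callees, only to exercise the dispatch. */
static int add_to_frame(void *frame, int arg) { return *(int *)frame + arg; }
static int twice(int arg) { return 2 * arg; }
```

A caller would build struct closure c = { -1, &local, add_to_frame } and
pass &c wherever a function pointer is expected; twice is passed as is.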


Thanks,
- Jay 		 	   		  


