Goal: I would like to update a toolchain originally based on the GCC 3.3
branch. Right now I'm just trying to get our project to run under
3.4.6. (Taking things one at a time: if that works, then I'll try
4.0, and so on.)
Difficulty: I'm trying to link against the Sony Aibo robot's system
software, which is not open source, so we can't just recompile
everything from scratch. However, I know about setting the ABI
version and have things *basically* working, except that the relocation
table entries for the .eh_frame section are causing trouble.
Background: The Aibo uses a MIPS processor, and the toolchain
targets mipsel-linux. I'm using binutils 2.15 and newlib 1.15.
The final executable is an ELF binary, but it uses some custom section
layouts. Sony provides some Perl scripts which generate these
through a combination of ld scripts and tables generated as C code.
Of interest is how they handle the relocation table:
First they do a "normal" link (which they call the 'nosnap' file):
  $ld -o nosnap.elf $libs
Then they link again using the same flags, except adding '-r' to
generate relocation tables in a "rel" file:
  $ld -r -o rel.elf $libs
Their script then reads through *all* of the relocation entries in
the "rel" file ($readelf -r rel.elf), and for each of the R_MIPS_32
(and only R_MIPS_32) entries, pulls 4 bytes from the specified offset
of the "nosnap" file, then records this value and the offset itself
into a big table. The table is later compiled and stored as a
section in the final "snap" executable. My guess is that the robot's
loader adds the process's offset in memory to all of the relocation
table entries.
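A minimal Python sketch of the procedure described above, for illustration only. It is not Sony's script (which is Perl and not shown here). The function names, the sample `readelf -r` text, and the sample bytes are all invented, and it assumes each relocation offset can be used directly as a byte index into the image it is given. The `apply_base` step only illustrates the guess about what the loader does.

```python
import re
import struct

# Matches one entry line of `readelf -r` output: offset, info, type.
RELOC_LINE = re.compile(r"^([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})\s+(R_\w+)")

def r_mips_32_offsets(readelf_text):
    """Return the offsets of all R_MIPS_32 entries in `readelf -r` output."""
    offsets = []
    for line in readelf_text.splitlines():
        m = RELOC_LINE.match(line.strip())
        if m and m.group(3) == "R_MIPS_32":
            offsets.append(int(m.group(1), 16))
    return offsets

def build_table(image, offsets):
    """Pair each offset with the little-endian 32-bit word stored there.

    Assumption: `offsets` index directly into `image` (mipsel is
    little-endian, hence "<I").
    """
    return [(off, struct.unpack_from("<I", image, off)[0]) for off in offsets]

def apply_base(table, base):
    """Hypothetical loader step: add the load base to every recorded word."""
    return [(off, (value + base) & 0xFFFFFFFF) for off, value in table]

# Invented sample data, not taken from the real rel.elf.
SAMPLE = """\
Relocation section '.rel.eh_frame' at offset 0x100 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000102 R_MIPS_32         00000000   .text
00000004  000001f8 R_MIPS_PC32       00000000   .text
00000008  00000102 R_MIPS_32         00000000   .text
"""
image = struct.pack("<3I", 0x00001000, 0xDEADBEEF, 0x00002000)

offsets = r_mips_32_offsets(SAMPLE)          # R_MIPS_PC32 entry is skipped
table = build_table(image, offsets)
relocated = apply_base(table, 0x00400000)    # base value chosen arbitrarily
```

`struct.unpack_from` raises `struct.error` when an offset runs past the end of the image, which is the kind of failure the out-of-range .eh_frame offsets would cause in this sketch.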
The problem that I'm seeing when moving from gcc 3.3.6 to 3.4.6 (and
apparently beyond: I've tried this with 4.0.4 and 4.1.2 as well) is
that the relocations for the .eh_frame section used to consist entirely
of R_MIPS_PC32 entries, which were ignored and left out of Sony's
relocation table. However, gcc 3.4+ generates R_MIPS_32 entries
for .eh_frame. By itself this seems fine, except that the size of
the .eh_frame section in the "nosnap" file doesn't match the size
of .eh_frame in the "rel" file, the latter being larger. Somehow the -r
option changes the .eh_frame section, and the offsets in .rel.eh_frame
exceed the boundaries of the nosnap .eh_frame section.
Section Headers for nosnap:
  [Nr] Name          Type      Addr     Off    Size   ES Flg Lk Inf Al
  ...
  [ 6] .data         PROGBITS  005f5240 1f5240 009e60 00  WA  0   0 16
  [ 7] .eh_frame     PROGBITS  005ff0a0 1ff0a0 0149ec 00  WA  0   0  4
  ...
Section Headers for rel:
  [Nr] Name          Type      Addr     Off    Size   ES Flg Lk Inf Al
  ...
  [10] .data         PROGBITS  005f63a0 1f63a0 009e60 00  WA  0   0 16
  [11] .rel.data     REL       00000000 600d3c 00e810 08     44   a  4
  [12] .eh_frame     PROGBITS  00600200 200200 018900 00  WA  0   0  4
  [13] .rel.eh_frame REL       00000000 60f54c 007698 08     44   c  4
  ...
Section Headers for final binary:
  [Nr] Name          Type      Addr     Off    Size   ES Flg Lk Inf Al
  ...
  [13] .data         PROGBITS  001f5240 1f5240 009e60 00  WA  0   0 16
  [14] .eh_frame     PROGBITS  001ff0a0 1ff0a0 0149ec 00  WA  0   0  4
  ...
Notice that the final binary's .eh_frame size matches the nosnap file, but
not the "rel" file. I should point out that this size change occurred
between the nosnap and rel files under 3.3.6 as well. It simply didn't
matter then, because the script ignored all of the .eh_frame entries
generated under 3.3.6.
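For scale, the numbers in the section header listings above can be worked out directly. The short Python snippet below only does arithmetic on the Size and ES columns as printed. The constant names are made up for this example, and the entry counts cover every relocation type, not just R_MIPS_32.

```python
# Values copied from the section header listings (all hexadecimal).
NOSNAP_EH_FRAME_SIZE = 0x0149EC      # .eh_frame in nosnap and final binary
REL_EH_FRAME_SIZE = 0x018900         # .eh_frame in the "rel" file
REL_EH_FRAME_RELOC_SIZE = 0x007698   # .rel.eh_frame
REL_DATA_RELOC_SIZE = 0x00E810       # .rel.data
REL_ENTRY_SIZE = 0x08                # ES column of the REL sections

# How much larger .eh_frame is in the "rel" file than in nosnap.
growth = REL_EH_FRAME_SIZE - NOSNAP_EH_FRAME_SIZE

# Number of relocation entries = section size / entry size.
eh_frame_relocs = REL_EH_FRAME_RELOC_SIZE // REL_ENTRY_SIZE
data_relocs = REL_DATA_RELOC_SIZE // REL_ENTRY_SIZE

print(f".eh_frame growth under -r: {growth} bytes ({growth:#x})")
print(f".rel.eh_frame entries:     {eh_frame_relocs}")
print(f".rel.data entries:         {data_relocs}")
```

So the "rel" .eh_frame is 16148 bytes (0x3f14) larger than the nosnap one, and .rel.eh_frame holds 3795 entries against 7426 in .rel.data.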
So I'm not sure what's supposed to be done with the entries
in .eh_frame. Why does the size of the section change? Are the
relocation offsets even valid to look up in the "nosnap" file? Why
are there problems only for .eh_frame and not for the .data section,
which apparently works fine? Is there something fundamentally broken
with what their script was trying to do?
I'd also like to point out that if I just ignore the .eh_frame
entries which fall outside the section's nosnap boundaries, the code
does load and starts to run! So it's really close if we can get
these relocation entries in order.
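The workaround just described can be sketched in a few lines of Python. This is a hypothetical helper, not part of Sony's scripts. It assumes the relocation offsets are byte offsets relative to the start of .eh_frame, and the sample offsets are invented to exercise the boundary. Only the section size comes from the nosnap header listing.

```python
def split_by_bounds(offsets, section_size, word_size=4):
    """Split offsets into those whose 4-byte word fits in the section
    and those that run past its end."""
    in_range, out_of_range = [], []
    for off in offsets:
        if 0 <= off and off + word_size <= section_size:
            in_range.append(off)
        else:
            out_of_range.append(off)
    return in_range, out_of_range

NOSNAP_EH_FRAME_SIZE = 0x0149EC  # from the nosnap section headers

# Invented offsets: first word, last word that fits, first word that
# does not fit, and one near the end of the larger "rel" .eh_frame.
sample_offsets = [0x00000, 0x149E8, 0x149EC, 0x188FC]

kept, dropped = split_by_bounds(sample_offsets, NOSNAP_EH_FRAME_SIZE)
```

Dropping the out-of-range entries silently is only a stopgap. It says nothing about whether the in-range offsets still point at the right words once the two .eh_frame layouts differ.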
Thank you very much for your time! I'm happy to provide the full
scripts or any other information that might be helpful.
-Ethan