Re: How to lookup the unnamed string constant in the generated object or executable?

On Tue, 2021-04-20 at 10:07 -0500, Peng Yu via Gcc-help wrote:
> On 4/20/21, Stefan Ring <stefanrin@xxxxxxxxx> wrote:
> > On Tue, Apr 20, 2021 at 4:27 PM Peng Yu via Gcc-help
> > <gcc-help@xxxxxxxxxxx> wrote:
> > > 
> > > How does the linker know that it should look for the string literal
> > > in .rodata just using the object file?
> > 
> > You should dump the object file's relocations; then you'll
> > understand.
> 
> Here is the result of relocations. I don't understand how it can be
> used to resolve my question. Could you please clarify?

Then you should find and read a textbook about linking instead of
posting an off-topic question to this mailing list.  The linker is not
part of GCC, and neither is the disassembler.

> $ readelf -r a.o
> 
> Relocation section '.rela.text' at offset 0x210 contains 2 entries:
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000000007  000500000002 R_X86_64_PC32     0000000000000000 .rodata - 4
> 00000000000c  000b00000004 R_X86_64_PLT32    0000000000000000 puts - 4
> 
> Relocation section '.rela.eh_frame' at offset 0x240 contains 1 entry:
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0
>
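The relocation entries above can be read as small formulas. Under the x86-64 psABI, R_X86_64_PC32 is computed as S + A - P and R_X86_64_PLT32 as L + A - P, where S is the final address of the symbol (here the start of a.o's .rodata contribution), L is the address of the PLT entry, A is the addend and P is the address of the 4-byte field being patched. The following is a minimal Python sketch of that arithmetic, not how any real linker is implemented. The constants TEXT_BASE, RODATA_BASE and PUTS_PLT are illustrative names, and their values are inferred from the a.out disassembly quoted later in this message (main at 0x1135, the string at 0x2004, puts@plt at 0x1030).

```python
def pc_relative(target: int, addend: int, place: int) -> int:
    """Value stored for R_X86_64_PC32 / R_X86_64_PLT32: target + A - P,
    truncated to the 32-bit field the relocation patches."""
    return (target + addend - place) & 0xFFFFFFFF


# Assumed layout, inferred from the a.out disassembly (not from a.o alone):
TEXT_BASE = 0x1135    # where a.o's .text (main, at offset 0) ended up
RODATA_BASE = 0x2004  # where a.o's .rodata (the string literal) ended up
PUTS_PLT = 0x1030     # address of puts@plt

# Relocation offsets 0x7 and 0xc are relative to the start of a.o's .text.
lea_field = pc_relative(RODATA_BASE, -4, TEXT_BASE + 0x7)
call_field = pc_relative(PUTS_PLT, -4, TEXT_BASE + 0xC)

# Fields are stored little-endian, so these print in hexdump order.
print(lea_field.to_bytes(4, "little").hex(" "))   # c4 0e 00 00
print(call_field.to_bytes(4, "little").hex(" "))  # eb fe ff ff
```

The addend of -4 accounts for the fact that a %rip-relative displacement is measured from the end of the 4-byte field (the next instruction), while P is the start of that field. The printed bytes match the operand bytes of the lea and callq instructions in the linked a.out.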
> > > $ objdump --disassemble=main a.o
> > > ...
> > > 0000000000000000 <main>:
> > >    0:   55                      push   %rbp
> > >    1:   48 89 e5                mov    %rsp,%rbp
> > >    4:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # b <main+0xb>
> > >    b:   e8 00 00 00 00          callq  10 <main+0x10>
> > >   10:   b8 00 00 00 00          mov    $0x0,%eax
> > >   15:   5d                      pop    %rbp
> > >   16:   c3                      retq
> > > 
> > > Where is the number "10" in the callq line from? "00 00 00 00" is
> > > just address 0. So the disassembler knows "00 00 00 00" is not a
> > > legal address, so it just put the address of the next instruction,
> > > which is "10"?
> > 
> > The displacement in the call instruction is 0, which is relative to
> > the address after the current instruction, which is 10. So the
> > disassembler displays it as a call to address 10.
> 
> How do I know it is a relative call? I don't even find the callq
> instruction in the Intel assembly code manual. Where does the mnemonic
> callq come from?

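"q" is just the AT&T-syntax suffix for a 64-bit (quadword) operand, as
used in 64-bit mode.  The Intel manual lists the instruction simply as
CALL.

You can see the hexdump is E8 00 00 00 00.  It's clearly documented on
page 3-122: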

> E8 cd:
> 
> CALL rel32
>
> near, relative, displacement relative to next
> instruction. 32-bit displacement sign extended to
> 64-bits in 64-bit mode.

> https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
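The rule quoted from the manual (32-bit displacement, sign-extended, relative to the next instruction) can be checked against both disassembly listings in this thread. The following is a small Python sketch under that rule. call_target is a hypothetical helper name, and it handles only the 5-byte E8 form.

```python
import struct


def call_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the destination of an x86-64 `E8 cd` (CALL rel32) instruction."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not an E8 rel32 call")
    # The displacement is a little-endian *signed* 32-bit integer.
    (disp,) = struct.unpack("<i", insn_bytes[1:])
    # It is relative to the address of the next instruction.
    next_insn = insn_addr + len(insn_bytes)
    return (next_insn + disp) & 0xFFFFFFFFFFFFFFFF


# a.o: displacement not yet filled in by the linker, so the target is
# simply the next instruction.
print(hex(call_target(0xB, bytes.fromhex("e800000000"))))     # 0x10
# a.out: displacement 0xfffffeeb is -0x115, giving puts@plt.
print(hex(call_target(0x1140, bytes.fromhex("e8ebfeffff"))))  # 0x1030
```

This is why objdump prints "callq 10 <main+0x10>" for the unlinked object: a zero displacement always points at the instruction that follows the call.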
> 
> > > $ objdump --disassemble=main a.out
> > > ...
> > > 0000000000001135 <main>:
> > >     1135:       55                      push   %rbp
> > >     1136:       48 89 e5                mov    %rsp,%rbp
> > >     1139:       48 8d 3d c4 0e 00 00    lea    0xec4(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
> > >     1140:       e8 eb fe ff ff          callq  1030 <puts@plt>
> > >     1145:       b8 00 00 00 00          mov    $0x0,%eax
> > >     114a:       5d                      pop    %rbp
> > >     114b:       c3                      retq
> > > ...
> > > 
> > > How does objdump figure out 0xec4(%rip) is the address
> > > _IO_stdin_used+0x4?
> > 
> > I guess it just picks the closest symbol. It's basically meaningless
> > in this case.
> 
> Are you sure it is nonsense? If it is nonsense, why is it generated?
> _IO_stdin_used is clearly a symbol in the .rodata section.
> 
> $ readelf -s a.out | grep _IO_stdin_used
>     55: 0000000000002000     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
> $ readelf -SW a.out | grep '\[16\]'
>   [16] .rodata           PROGBITS        0000000000002000 002000 000011 00   A  0   0  4
> 

In a specific binary it may make some sense: +0x4 is just calculated as
an offset relative to the nearest preceding symbol ("_IO_stdin_used"
here).  In general it's meaningless: the compiler and linker may reorder
objects, so there is no way to predict the location of an object (unless
you use some tricky linker scripts).
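To illustrate the "nearest symbol" labelling, here is a simplified Python sketch. It is only an approximation of what a disassembler does: objdump's real logic also considers sections and symbol types. symbolize is a hypothetical helper, and the symbol list holds just the one entry shown in the readelf output above.

```python
import bisect


def symbolize(addr: int, symbols: list[tuple[int, str]]) -> str:
    """Label addr as 'name+offset' using the closest symbol at or below it."""
    ordered = sorted(symbols)
    starts = [start for start, _ in ordered]
    # Index of the last symbol whose address is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return hex(addr)  # no symbol precedes this address
    start, name = ordered[i]
    offset = addr - start
    return name if offset == 0 else f"{name}+{offset:#x}"


symbols = [(0x2000, "_IO_stdin_used")]
print(symbolize(0x2004, symbols))  # _IO_stdin_used+0x4
```

The string literal has no symbol of its own, so the label only says that it sits 4 bytes past _IO_stdin_used in this particular link. It says nothing about any relationship between the two objects.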
-- 
Xi Ruoyao <xry111@xxxxxxxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University



