On Tue, 2021-04-20 at 08:21 -0500, Peng Yu via Gcc-help wrote:
> On 4/20/21, Jonathan Wakely <jwakely.gcc@xxxxxxxxx> wrote:
> > On Tue, 20 Apr 2021 at 06:42, Peng Yu wrote:
> > >
> > > Hi,
> > >
> > > I am trying to understand where unnamed string constants are
> > > defined in symbol tables.
> > >
> > > In the following example, the unnamed string constant is "Hello
> > > World!".
> > >
> > > $ cat a.c
> > > #include <stdio.h>
> > > int main() { puts("Hello World!"); }
> > > $ gcc -c a.c
> > > $ gcc -o a.out a.o
> > >
> > > But I don't find it in the symbol table. Could anybody point to me
> > > where it is? Thanks.
> >
> > The ELF format puts string literals in the .rodata section, not the
> > .symtab.
> >
> >
> > $ readelf -p .rodata a.out
> >
> > String dump of section '.rodata':
> >   [    10]  Hello World!
>
> How does the linker know that it should look for the string literal in
> .rodata just using the object file?
>
> $ objdump --disassemble=main a.o
> ...
> 0000000000000000 <main>:
>    0:   55                      push   %rbp
>    1:   48 89 e5                mov    %rsp,%rbp
>    4:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # b <main+0xb>
>    b:   e8 00 00 00 00          callq  10 <main+0x10>
>   10:   b8 00 00 00 00          mov    $0x0,%eax
>   15:   5d                      pop    %rbp
>   16:   c3                      retq
>
> Where is the number "10" in the callq line from? "00 00 00 00" is just
> address 0. So the disassembler knows "00 00 00 00" is not a legal
> address, so it just put the address of the next instruction which is
> "10"?

The real targets are in relocation records; the zero bytes are only
placeholders that the linker fills in.  Try objdump -r.

> Is callq just call in the disassembled code? I see people say they
> are the same. But I don't understand if they are the same why not just
> use call instead?

"q" means "qword", i.e. it's a 64-bit operation.  On my system objdump
(binutils-2.36.1) just prints "call" instead of "callq", because there is
no "calll" ("32-bit call") operation in a 64-bit x86 program.

> $ objdump --disassemble=main a.out
> ...
> 0000000000001135 <main>:
>     1135:       55                      push   %rbp
>     1136:       48 89 e5                mov    %rsp,%rbp
>     1139:       48 8d 3d c4 0e 00 00    lea    0xec4(%rip),%rdi        # 2004 <_IO_stdin_used+0x4>
>     1140:       e8 eb fe ff ff          callq  1030 <puts@plt>
>     1145:       b8 00 00 00 00          mov    $0x0,%eax
>     114a:       5d                      pop    %rbp
>     114b:       c3                      retq
> ...
>
> How does objdump figure out 0xec4(%rip) is the address
> _IO_stdin_used+0x4?

0xec4(%rip) means "the value of %rip (the program counter) plus 0xec4",
i.e. (the location of <main>) + (the offset of the next instruction in
<main>) + 0xec4.  Here that is 0x1140 + 0xec4 = 0x2004.  It's somewhere in
.rodata, and the nearest symbol before it is _IO_stdin_used.  Then a
simple subtraction gives 0x4.

> 0xec4 is related to the relative layout of .rodata and .text. Is there
> a way to show both sections in the same layout as they appear in
> memory (data should be shown as hexdump, and code should be shown as
> disassembled code) for human readability?

None that I'm aware of.

Anyway, this thread is very off-topic now.  gcc-help is not the place for
general questions about linking and relocation.  stackoverflow.com may be
a better place.

-- 
Xi Ruoyao <xry111@xxxxxxxxxxxxxxxx>
School of Aerospace Science and Technology, Xidian University
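
The address arithmetic discussed in this message can be checked by hand.
Below is a minimal sketch in Python that uses only the instruction
addresses and bytes from the two objdump listings.  The helper name
rip_relative_target is illustrative and not part of any tool.  The value
0x2000 for _IO_stdin_used is an assumption inferred from the annotation
"2004 <_IO_stdin_used+0x4>"; it is not printed anywhere in the thread.

```python
def rip_relative_target(insn_addr: int, insn_len: int, disp32: int) -> int:
    """Return the address referenced by a RIP-relative operand.

    While an instruction executes, %rip holds the address of the *next*
    instruction, so the target is (insn_addr + insn_len) + displacement.
    disp32 is the raw 32-bit field and is sign-extended here.
    """
    if disp32 & 0x80000000:
        disp32 -= 1 << 32
    return insn_addr + insn_len + disp32


# a.out: lea 0xec4(%rip),%rdi at 0x1139, 7 bytes (48 8d 3d c4 0e 00 00)
string_addr = rip_relative_target(0x1139, 7, 0xEC4)

# a.out: callq at 0x1140, 5 bytes (e8 eb fe ff ff); rel32 is little-endian
rel32 = int.from_bytes(bytes.fromhex("ebfeffff"), "little")
call_target = rip_relative_target(0x1140, 5, rel32)

# a.o: callq at 0xb with an all-zero placeholder displacement
unlinked_call = rip_relative_target(0xB, 5, 0)

# Assumed symbol address, inferred from "2004 <_IO_stdin_used+0x4>"
io_stdin_used = 0x2000

print(hex(string_addr))                  # 0x2004
print(hex(call_target))                  # 0x1030
print(hex(unlinked_call))                # 0x10
print(hex(string_addr - io_stdin_used))  # 0x4
```

The third result shows why the unlinked object file prints "callq 10":
with a zero displacement, the computed target is simply the address of
the next instruction.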