On 03/01/2019 07:11 PM, Yonghong Song wrote:
> On 2/28/19 3:18 PM, Daniel Borkmann wrote:
[...]
>> @@ -1412,6 +1568,24 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
>>  						       &prog->reloc_desc[i]);
>>  			if (err)
>>  				return err;
>> +		} else if (prog->reloc_desc[i].type == RELO_DATA ||
>> +			   prog->reloc_desc[i].type == RELO_RODATA ||
>> +			   prog->reloc_desc[i].type == RELO_BSS) {
>> +			struct bpf_insn *insns = prog->insns;
>> +			int insn_idx, map_idx, data_off;
>> +
>> +			insn_idx = prog->reloc_desc[i].insn_idx;
>> +			map_idx = prog->reloc_desc[i].map_idx;
>> +			data_off = insns[insn_idx].imm;
>
> I want to point to a subtle difference here between handling pure global
> variables and static global variables. The "imm" value is only available
> for static variables. For example,
>
> -bash-4.4$ cat g.c
> static volatile long sg = 2;
> static volatile int si = 3;
> long g = 4;
> int i = 5;
> int test() { return sg + si + g + i; }
> -bash-4.4$
> -bash-4.4$ clang -target bpf -O2 -c g.c
>
> -bash-4.4$ readelf -s g.o
>
>
> Symbol table '.symtab' contains 8 entries:
>    Num:    Value          Size Type    Bind   Vis      Ndx Name
>      0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
>      1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS g.c
>      2: 0000000000000010     8 OBJECT  LOCAL  DEFAULT    4 sg
>      3: 0000000000000018     4 OBJECT  LOCAL  DEFAULT    4 si
>      4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
>      5: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    4 g
>      6: 0000000000000008     4 OBJECT  GLOBAL DEFAULT    4 i
>      7: 0000000000000000   128 FUNC    GLOBAL DEFAULT    2 test
> -bash-4.4$
> -bash-4.4$ llvm-readelf -r g.o
>
> Relocation section '.rel.text' at offset 0x1d8 contains 4 entries:
>     Offset             Info             Type        Symbol's Value   Symbol's Name
> 0000000000000000  0000000400000001 R_BPF_64_64  0000000000000000 .data
> 0000000000000018  0000000400000001 R_BPF_64_64  0000000000000000 .data
> 0000000000000038  0000000500000001 R_BPF_64_64  0000000000000000 g
> 0000000000000058  0000000600000001 R_BPF_64_64  0000000000000008 i
> -bash-4.4$ llvm-objdump -d g.o
>
> g.o:    file format ELF64-BPF
>
> Disassembly of section .text:
> 0000000000000000 test:
>        0: 18 01 00 00 10 00 00 00 00 00 00 00 00 00 00 00  r1 = 16 ll
>        2: 79 11 00 00 00 00 00 00  r1 = *(u64 *)(r1 + 0)
>        3: 18 02 00 00 18 00 00 00 00 00 00 00 00 00 00 00  r2 = 24 ll
>        5: 61 22 00 00 00 00 00 00  r2 = *(u32 *)(r2 + 0)
>        6: 0f 21 00 00 00 00 00 00  r1 += r2
>        7: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00  r2 = 0 ll
>        9: 79 22 00 00 00 00 00 00  r2 = *(u64 *)(r2 + 0)
>       10: 0f 21 00 00 00 00 00 00  r1 += r2
>       11: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00  r2 = 0 ll
>       13: 61 20 00 00 00 00 00 00  r0 = *(u32 *)(r2 + 0)
>       14: 0f 10 00 00 00 00 00 00  r0 += r1
>       15: 95 00 00 00 00 00 00 00  exit
> -bash-4.4$
>
> You can see the above, the non-static global access does not have its
> in-section offset encoded in the insn itself. The difference is due to
> llvm treating static global and non-static global differently.
>
> To support both cases, during relocation recording stage, you can
> also record:
>     . symbol binding (GELF_ST_BIND(sym.st_info)),
>       non-static global has binding STB_GLOBAL and static
>       global has binding STB_LOCAL
>     . symbol value (sym.st_value)
>
> During the above relocation resolution, if symbol bind is local, do
> what you already did here. If symbol bind is global, assign data_off
> with symbol value.
>
> This applied to both .data and .rodata sections.
>
> The non initialized
> global variable will not be in any allocated section in ELF file,
> it is in a COM section which is to be allocated by loader.
> So user defines some like
>     int g;
> and later on uses it. Right now, it will not work. The workaround
> is "int g = 4", or "static int g". I guess it should be
> okay, we should encourage users to use "static" variables instead.

Agree and noted, and thanks for pointing this out, Yonghong! I'll fix
this up accordingly in next round.

Thanks a lot,
Daniel
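
For reference, the resolution rule suggested above could look roughly like the following. This is only a minimal sketch and not the actual libbpf change: struct data_reloc, resolve_data_off() and the SKETCH_STB_* constants are hypothetical names made up for illustration. The constants mirror the standard ELF values of STB_LOCAL (0) and STB_GLOBAL (1), and the sketch assumes the binding and st_value were saved while the relocation was being recorded.

```c
#include <stdint.h>

/* Mirror the ELF symbol binding values; real code would take
 * STB_LOCAL / STB_GLOBAL from <elf.h> or <gelf.h> instead. */
enum {
	SKETCH_STB_LOCAL  = 0,
	SKETCH_STB_GLOBAL = 1,
};

/* Hypothetical per-relocation record, filled in at recording time. */
struct data_reloc {
	int insn_idx;            /* index of the ld_imm64 instruction */
	unsigned char sym_bind;  /* GELF_ST_BIND(sym.st_info) */
	uint64_t sym_value;      /* sym.st_value */
};

/*
 * Pick the in-section offset of the accessed variable.
 *
 * Static (STB_LOCAL) variables are relocated against the section symbol,
 * so the offset is already encoded in the instruction's imm field.
 * Non-static (STB_GLOBAL) variables carry imm == 0, and the offset has
 * to come from the symbol value.
 */
static int64_t resolve_data_off(const struct data_reloc *r, int32_t insn_imm)
{
	if (r->sym_bind == SKETCH_STB_GLOBAL)
		return (int64_t)r->sym_value;

	return insn_imm;
}
```

With the g.o example above, sg and si resolve through imm (16 and 24), while g and i resolve through their symbol values (0 and 8).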