On 11/18/19 10:21 PM, Andrii Nakryiko wrote:
> When relocating subprogram call, libbpf doesn't take into account
> relo->text_off, which comes from symbol's value. This generally works fine for
> subprograms implemented as static functions, but breaks for global functions.
>
> Taking a simplified test_pkt_access.c as an example:
>
> __attribute__ ((noinline))
> static int test_pkt_access_subprog1(volatile struct __sk_buff *skb)
> {
>         return skb->len * 2;
> }
>
> __attribute__ ((noinline))
> static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb)
> {
>         return skb->len + val;
> }
>
> SEC("classifier/test_pkt_access")
> int test_pkt_access(struct __sk_buff *skb)
> {
>         if (test_pkt_access_subprog1(skb) != skb->len * 2)
>                 return TC_ACT_SHOT;
>         if (test_pkt_access_subprog2(2, skb) != skb->len + 2)
>                 return TC_ACT_SHOT;
>         return TC_ACT_UNSPEC;
> }
>
> When compiled, we get two relocations, pointing to '.text' symbol. .text has
> st_value set to 0 (it points to the beginning of .text section):
>
> 0000000000000008  000000050000000a R_BPF_64_32  0000000000000000 .text
> 0000000000000040  000000050000000a R_BPF_64_32  0000000000000000 .text
>
> test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two
> calls) are encoded within call instruction's imm32 part as -1 and 2,
> respectively:
>
> 0000000000000000 test_pkt_access_subprog1:
>        0:  61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
>        1:  64 00 00 00 01 00 00 00 w0 <<= 1
>        2:  95 00 00 00 00 00 00 00 exit
>
> 0000000000000018 test_pkt_access_subprog2:
>        3:  61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
>        4:  04 00 00 00 02 00 00 00 w0 += 2
>        5:  95 00 00 00 00 00 00 00 exit
>
> 0000000000000000 test_pkt_access:
>        0:  bf 16 00 00 00 00 00 00 r6 = r1
> ===>   1:  85 10 00 00 ff ff ff ff call -1
>        2:  bc 01 00 00 00 00 00 00 w1 = w0
>        3:  b4 00 00 00 02 00 00 00 w0 = 2
>        4:  61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
>        5:  64 02 00 00 01 00 00 00 w2 <<= 1
>        6:  5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3>
>        7:  bf 61 00 00 00 00 00 00 r1 = r6
> ===>   8:  85 10 00 00 02 00 00 00 call 2
>        9:  bc 01 00 00 00 00 00 00 w1 = w0
>       10:  61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
>       11:  04 02 00 00 02 00 00 00 w2 += 2
>       12:  b4 00 00 00 ff ff ff ff w0 = -1
>       13:  1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3>
>       14:  b4 00 00 00 02 00 00 00 w0 = 2
> 0000000000000078 LBB0_3:
>       15:  95 00 00 00 00 00 00 00 exit
>
> Now, if we compile example with global functions, the setup changes.
> Relocations are now against specifically test_pkt_access_subprog1 and
> test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24
> bytes into its respective section (.text), i.e., 3 instructions in:
>
> 0000000000000008  000000070000000a R_BPF_64_32  0000000000000000 test_pkt_access_subprog1
> 0000000000000048  000000080000000a R_BPF_64_32  0000000000000018 test_pkt_access_subprog2
>
> Call instructions now encode offsets relative to function symbols and are both
> set to -1:
>
> 0000000000000000 test_pkt_access_subprog1:
>        0:  61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0)
>        1:  64 00 00 00 01 00 00 00 w0 <<= 1
>        2:  95 00 00 00 00 00 00 00 exit
>
> 0000000000000018 test_pkt_access_subprog2:
>        3:  61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0)
>        4:  0c 10 00 00 00 00 00 00 w0 += w1
>        5:  95 00 00 00 00 00 00 00 exit
>
> 0000000000000000 test_pkt_access:
>        0:  bf 16 00 00 00 00 00 00 r6 = r1
> ===>   1:  85 10 00 00 ff ff ff ff call -1
>        2:  bc 01 00 00 00 00 00 00 w1 = w0
>        3:  b4 00 00 00 02 00 00 00 w0 = 2
>        4:  61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
>        5:  64 02 00 00 01 00 00 00 w2 <<= 1
>        6:  5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3>
>        7:  b4 01 00 00 02 00 00 00 w1 = 2
>        8:  bf 62 00 00 00 00 00 00 r2 = r6
> ===>   9:  85 10 00 00 ff ff ff ff call -1
>       10:  bc 01 00 00 00 00 00 00 w1 = w0
>       11:  61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0)
>       12:  04 02 00 00 02 00 00 00 w2 += 2
>       13:  b4 00 00 00 ff ff ff ff w0 = -1
>       14:  1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3>
>       15:  b4 00 00 00 02 00 00 00 w0 = 2
> 0000000000000080 LBB2_3:
>       16:  95 00 00 00 00 00 00 00 exit
>
> Thus the right formula to calculate target call offset after relocation should
> take into account relocation's target symbol value (offset within section),
> call instruction's imm32 offset, and (subtracting, to get relative instruction
> offset) instruction index of call instruction itself. All that is shifted by
> number of instructions in main program, given all sub-programs are copied over
> after main program.
>
> Convert test_pkt_access.c to global functions to verify this works.
>
> Reported-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> Signed-off-by: Andrii Nakryiko <andriin@xxxxxx>

Acked-by: Yonghong Song <yhs@xxxxxx>
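
For illustration only, the formula described in the last paragraph of the commit message can be written out as a small standalone sketch. This is not the libbpf code itself: the helper names (relocated_call_imm, call_target_idx, check_commit_examples) and the BPF_INSN_SZ constant are hypothetical, and the sketch assumes that the subprograms from .text are appended directly after the main program and that a call's imm32 is relative to the instruction following the call. The numbers checked are the ones from the two disassembly listings in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one BPF instruction slot in bytes (assumed constant for this sketch). */
#define BPF_INSN_SZ 8

/*
 * Hypothetical helper, not a libbpf API.
 *
 * imm           - imm32 found in the call instruction before relocation
 * sym_value     - st_value of the relocation's target symbol (byte offset in .text)
 * main_insn_cnt - number of instructions in the main program
 * insn_idx      - index of the call instruction within the main program
 *
 * Returns the imm32 the call should carry once .text has been appended
 * after the main program.
 */
int relocated_call_imm(int imm, size_t sym_value,
                       size_t main_insn_cnt, size_t insn_idx)
{
    return imm + (int)(sym_value / BPF_INSN_SZ)
               + (int)main_insn_cnt - (int)insn_idx;
}

/* Index of the instruction a call lands on: relative to the next instruction. */
int call_target_idx(size_t insn_idx, int imm)
{
    return (int)insn_idx + 1 + imm;
}

/* Checks the formula against both listings from the commit message. */
void check_commit_examples(void)
{
    /* Static functions: main program has 16 instructions, symbol is .text (value 0). */
    assert(call_target_idx(1, relocated_call_imm(-1, 0, 16, 1)) == 16);     /* subprog1 */
    assert(call_target_idx(8, relocated_call_imm(2, 0, 16, 8)) == 16 + 3);  /* subprog2 */

    /* Global functions: main program has 17 instructions, symbols carry the offset. */
    assert(call_target_idx(1, relocated_call_imm(-1, 0, 17, 1)) == 17);     /* subprog1 */
    assert(call_target_idx(9, relocated_call_imm(-1, 24, 17, 9)) == 17 + 3);/* subprog2 */
}
```

In both variants the second call resolves to the fourth instruction of the appended .text (index main_insn_cnt + 3). Leaving out the sym_value term would make the global-function call to test_pkt_access_subprog2 land on test_pkt_access_subprog1 instead, which matches the breakage the commit message describes.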