On Mon, Feb 10, 2025 at 12:13 PM Cupertino Miranda <cupertino.miranda@xxxxxxxxxx> wrote: > > Hi Andrii, > > On 07-02-2025 01:48, Andrii Nakryiko wrote: > > Add a simple repro for the issue of miscalculating LDX/STX/ST CO-RE > > relocation size adjustment when the CO-RE relocation target type is an > > ARRAY. > > > > We need to make sure that compiler generates LDX/STX/ST instruction with > > CO-RE relocation against entire ARRAY type, not ARRAY's element. With > > the code pattern in selftest, we get this: > > > > 59: 61 71 00 00 00 00 00 00 w1 = *(u32 *)(r7 + 0x0) > > 00000000000001d8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0) > > > > Where offset of `int a[5]` is embedded (through CO-RE relocation) into memory > > load instruction itself. > > > > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx> > > --- > > tools/testing/selftests/bpf/prog_tests/core_reloc.c | 6 ++++-- > > ...f__core_reloc_arrays___err_bad_signed_arr_elem_sz.c | 3 +++ > > tools/testing/selftests/bpf/progs/core_reloc_types.h | 10 ++++++++++ > > .../selftests/bpf/progs/test_core_reloc_arrays.c | 5 +++++ > > 4 files changed, 22 insertions(+), 2 deletions(-) > > create mode 100644 tools/testing/selftests/bpf/progs/btf__core_reloc_arrays___err_bad_signed_arr_elem_sz.c > > > > diff --git a/tools/testing/selftests/bpf/prog_tests/core_reloc.c b/tools/testing/selftests/bpf/prog_tests/core_reloc.c > > index e10ea92c3fe2..08963c82f30b 100644 > > --- a/tools/testing/selftests/bpf/prog_tests/core_reloc.c > > +++ b/tools/testing/selftests/bpf/prog_tests/core_reloc.c > > @@ -85,11 +85,11 @@ static int duration = 0; > > #define NESTING_ERR_CASE(name) { \ > > NESTING_CASE_COMMON(name), \ > > .fails = true, \ > > - .run_btfgen_fails = true, \ > > + .run_btfgen_fails = true, \ > > } > > > > #define ARRAYS_DATA(struct_name) STRUCT_TO_CHAR_PTR(struct_name) { \ > > - .a = { [2] = 1 }, \ > > + .a = { [2] = 1, [3] = 11 }, \ > > .b = { [1] = { [2] = { [3] = 2 } } }, \ > > .c = { [1] = { .c = 3 } }, \ > > .d = { [0] = { [0] = { .d = 4 } } }, \ > > @@ -108,6 +108,7 @@ static int duration = 0; > > .input_len = sizeof(struct core_reloc_##name), \ > > .output = STRUCT_TO_CHAR_PTR(core_reloc_arrays_output) { \ > > .a2 = 1, \ > > + .a3 = 12, \ > > .b123 = 2, \ > > .c1c = 3, \ > > .d00d = 4, \ > > @@ -602,6 +603,7 @@ static const struct core_reloc_test_case test_cases[] = { > > ARRAYS_ERR_CASE(arrays___err_non_array), > > ARRAYS_ERR_CASE(arrays___err_wrong_val_type), > > ARRAYS_ERR_CASE(arrays___err_bad_zero_sz_arr), > > + ARRAYS_ERR_CASE(arrays___err_bad_signed_arr_elem_sz), > > > > /* enum/ptr/int handling scenarios */ > > PRIMITIVES_CASE(primitives), > > diff --git a/tools/testing/selftests/bpf/progs/btf__core_reloc_arrays___err_bad_signed_arr_elem_sz.c b/tools/testing/selftests/bpf/progs/btf__core_reloc_arrays___err_bad_signed_arr_elem_sz.c > > new file mode 100644 > > index 000000000000..21a560427b10 > > --- /dev/null > > +++ b/tools/testing/selftests/bpf/progs/btf__core_reloc_arrays___err_bad_signed_arr_elem_sz.c > > @@ -0,0 +1,3 @@ > > +#include "core_reloc_types.h" > > + > > +void f(struct core_reloc_arrays___err_bad_signed_arr_elem_sz x) {} > > diff --git a/tools/testing/selftests/bpf/progs/core_reloc_types.h b/tools/testing/selftests/bpf/progs/core_reloc_types.h > > index fd8e1b4c6762..5760ae015e09 100644 > > --- a/tools/testing/selftests/bpf/progs/core_reloc_types.h > > +++ b/tools/testing/selftests/bpf/progs/core_reloc_types.h > > @@ -347,6 +347,7 @@ struct core_reloc_nesting___err_too_deep { > > */ > > struct core_reloc_arrays_output { > > int a2; > > + int a3; > > char b123; > > int c1c; > > int d00d; > > @@ -455,6 +456,15 @@ struct core_reloc_arrays___err_bad_zero_sz_arr { > > struct core_reloc_arrays_substruct d[1][2]; > > }; > > > > +struct core_reloc_arrays___err_bad_signed_arr_elem_sz { > > + /* int -> short (signed!): not supported case */ > > + short a[5]; > > + char b[2][3][4]; > > + struct core_reloc_arrays_substruct c[3]; > > + struct core_reloc_arrays_substruct d[1][2]; > > + struct core_reloc_arrays_substruct f[][2]; > > +}; > > + > > /* > > * PRIMITIVES > > */ > > diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c b/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c > > index 51b3f79df523..448403634eea 100644 > > --- a/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c > > +++ b/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c > > @@ -15,6 +15,7 @@ struct { > > > > struct core_reloc_arrays_output { > > int a2; > > + int a3; > > char b123; > > int c1c; > > int d00d; > > @@ -41,6 +42,7 @@ int test_core_arrays(void *ctx) > > { > > struct core_reloc_arrays *in = (void *)&data.in; > > struct core_reloc_arrays_output *out = (void *)&data.out; > > + int *a; > > > > if (CORE_READ(&out->a2, &in->a[2])) > > return 1; > > @@ -53,6 +55,9 @@ int test_core_arrays(void *ctx) > > if (CORE_READ(&out->f01c, &in->f[0][1].c)) > > return 1; > > > > + a = __builtin_preserve_access_index(({ in->a; })); > > + out->a3 = a[0] + a[1] + a[2] + a[3]; > Just to try to understand what seems to be the expectation from the > compiler and CO-RE in this case. > Do you expect that all those a[n] accesses would be generating CO-RE > relocations assuming the size for the elements in in->a ? > Well, I only care to get LDX instruction with associated in->a CO-RE relocation. This is what Clang currently generates for this piece of code. You can see that it combines both LDX+CO-RE relo for a[0], and then non-CO-RE relocated LDX for a[1], a[2], a[3], where the base was relocated with CO-RE a bit earlier. 44: 18 07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r7 = 0x0 ll 0000000000000160: R_BPF_64_64 data ... 55: b7 01 00 00 00 00 00 00 r1 = 0x0 00000000000001b8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0) 56: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0x0 ll 00000000000001c0: R_BPF_64_64 data 58: 0f 12 00 00 00 00 00 00 r2 += r1 59: 61 71 00 00 00 00 00 00 w1 = *(u32 *)(r7 + 0x0) 00000000000001d8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0) 60: 61 23 04 00 00 00 00 00 w3 = *(u32 *)(r2 + 0x4) 61: 0c 13 00 00 00 00 00 00 w3 += w1 62: 61 21 08 00 00 00 00 00 w1 = *(u32 *)(r2 + 0x8) 63: 0c 13 00 00 00 00 00 00 w3 += w1 64: 61 21 0c 00 00 00 00 00 w1 = *(u32 *)(r2 + 0xc) 65: 0c 13 00 00 00 00 00 00 w3 += w1 66: 63 37 04 01 00 00 00 00 *(u32 *)(r7 + 0x104) = w3 Clang might change code generation pattern in the future, of course, but at least as of right now I know I did test this logic :) Ideally I'd be able to generate embedded asm with CO-RE relocation, but I'm not sure that's supported today. > > + > > return 0; > > } > > >