> On 08.07.2019 at 18:13, Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
>
> On 07/03, Y Song wrote:
>> On Wed, Jul 3, 2019 at 1:51 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
>>>
>>> Take the first x bytes of pt_regs for the scalability tests; there is
>>> no real reason we need the x86-specific rax.
>>>
>>> Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
>>> ---
>>>  tools/testing/selftests/bpf/progs/loop1.c | 3 ++-
>>>  tools/testing/selftests/bpf/progs/loop2.c | 3 ++-
>>>  tools/testing/selftests/bpf/progs/loop3.c | 3 ++-
>>>  3 files changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/bpf/progs/loop1.c b/tools/testing/selftests/bpf/progs/loop1.c
>>> index dea395af9ea9..d530c61d2517 100644
>>> --- a/tools/testing/selftests/bpf/progs/loop1.c
>>> +++ b/tools/testing/selftests/bpf/progs/loop1.c
>>> @@ -14,11 +14,12 @@ SEC("raw_tracepoint/kfree_skb")
>>>  int nested_loops(volatile struct pt_regs* ctx)
>>>  {
>>>  	int i, j, sum = 0, m;
>>> +	volatile int *any_reg = (volatile int *)ctx;
>>>
>>>  	for (j = 0; j < 300; j++)
>>>  		for (i = 0; i < j; i++) {
>>>  			if (j & 1)
>>> -				m = ctx->rax;
>>> +				m = *any_reg;
>>
>> I agree. ctx->rax here is only there to generate some operations that
>> cannot be optimized away by the compiler, and dereferencing a volatile
>> pointee serves that purpose just as well.
>>
>> Comparing the bytecode generated with ctx->rax and *any_reg, they are
>> slightly different. Using *any_reg is slightly worse, but this should
>> still be okay for the test.
>>
>>>  			else
>>>  				m = j;
>>>  			sum += i * m;
>>> diff --git a/tools/testing/selftests/bpf/progs/loop2.c b/tools/testing/selftests/bpf/progs/loop2.c
>>> index 0637bd8e8bcf..91bb89d901e3 100644
>>> --- a/tools/testing/selftests/bpf/progs/loop2.c
>>> +++ b/tools/testing/selftests/bpf/progs/loop2.c
>>> @@ -14,9 +14,10 @@ SEC("raw_tracepoint/consume_skb")
>>>  int while_true(volatile struct pt_regs* ctx)
>>>  {
>>>  	int i = 0;
>>> +	volatile int *any_reg = (volatile int *)ctx;
>>>
>>>  	while (true) {
>>> -		if (ctx->rax & 1)
>>> +		if (*any_reg & 1)
>>>  			i += 3;
>>>  		else
>>>  			i += 7;
>>> diff --git a/tools/testing/selftests/bpf/progs/loop3.c b/tools/testing/selftests/bpf/progs/loop3.c
>>> index 30a0f6cba080..3a7f12d7186c 100644
>>> --- a/tools/testing/selftests/bpf/progs/loop3.c
>>> +++ b/tools/testing/selftests/bpf/progs/loop3.c
>>> @@ -14,9 +14,10 @@ SEC("raw_tracepoint/consume_skb")
>>>  int while_true(volatile struct pt_regs* ctx)
>>>  {
>>>  	__u64 i = 0, sum = 0;
>>> +	volatile __u64 *any_reg = (volatile __u64 *)ctx;
>>>  	do {
>>>  		i++;
>>> -		sum += ctx->rax;
>>> +		sum += *any_reg;
>>>  	} while (i < 0x100000000ULL);
>>>  	return sum;
>>>  }
>>> --
>>> 2.22.0.410.gd8fdbe21b5-goog
>>
>> Ilya Leoshkevich (iii@xxxxxxxxxxxxx, cc'ed) has another patch set
>> trying to solve this problem by introducing s390 arch register access
>> macros. I guess for now that patch set is not needed any more?
> Oh, I missed them. Do they fix the tests for other (non-s390) arches as
> well? I was trying to fix the issue by not depending on any
> arch-specific stuff, because the test really doesn't care :-)

They are supposed to work for everything that defines PT_REGS_RC in
bpf_helpers.h, but I have to admit I tested only x86_64 and s390.

The main source of problems with my approach was the mismatching
definitions of struct pt_regs for userspace and the kernel; because of
that, some tweaking was required for both arches.

I will double-check how it looks for the others (arm, mips, ppc, sparc)
tomorrow.

Best regards,
Ilya
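
P.S. To make the comparison concrete: the macro-based approach hides
the per-arch register name behind PT_REGS_RC, roughly like this (a
sketch from memory rather than a verbatim quote of bpf_helpers.h, so
the exact guard names may differ; the x86 field is also exactly where
the userspace rax vs. kernel ax mismatch shows up):

	#if defined(bpf_target_x86)
	/* kernel struct pt_regs calls the register "ax"; userspace calls it "rax" */
	#define PT_REGS_RC(x) ((x)->ax)
	#elif defined(bpf_target_s390)
	/* on s390, r2 carries the function return value */
	#define PT_REGS_RC(x) ((x)->gprs[2])
	#endif

With that, the loop bodies would read e.g.:

	if (PT_REGS_RC(ctx) & 1)
		i += 3;
	else
		i += 7;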