> On Mar 29, 2022, at 11:43 PM, Song Liu <songliubraving@xxxxxx> wrote: > > > >> On Mar 29, 2022, at 7:47 PM, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: >> >> On Tue, Mar 29, 2022 at 5:39 PM Song Liu <songliubraving@xxxxxx> wrote: >>> >>> >>> >>>> On Mar 29, 2022, at 5:00 PM, Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: >>>> >>>> On Tue, Mar 29, 2022 at 4:19 PM Song Liu <song@xxxxxxxxxx> wrote: >>>>> >>>>> TP_PROTO of sched_switch is updated with a new arg prev_state, which >>>>> causes runqslower load failure: >>>>> >>>>> libbpf: prog 'handle__sched_switch': BPF program load failed: Permission denied >>>>> libbpf: prog 'handle__sched_switch': -- BEGIN PROG LOAD LOG -- >>>>> R1 type=ctx expected=fp >>>>> 0: R1=ctx(off=0,imm=0) R10=fp0 >>>>> ; int handle__sched_switch(u64 *ctx) >>>>> 0: (bf) r7 = r1 ; R1=ctx(off=0,imm=0) R7_w=ctx(off=0,imm=0) >>>>> ; struct task_struct *next = (struct task_struct *)ctx[2]; >>>>> 1: (79) r6 = *(u64 *)(r7 +16) >>>>> func 'sched_switch' arg2 has btf_id 186 type STRUCT 'task_struct' >>>>> 2: R6_w=ptr_task_struct(off=0,imm=0) R7_w=ctx(off=0,imm=0) >>>>> ; struct task_struct *prev = (struct task_struct *)ctx[1]; >>>>> 2: (79) r2 = *(u64 *)(r7 +8) ; R2_w=scalar() R7_w=ctx(off=0,imm=0) >>>>> 3: (b7) r1 = 0 ; R1_w=0 >>>>> ; struct runq_event event = {}; >>>>> 4: (7b) *(u64 *)(r10 -8) = r1 ; R1_w=P0 R10=fp0 fp-8_w=00000000 >>>>> 5: (7b) *(u64 *)(r10 -16) = r1 ; R1_w=P0 R10=fp0 fp-16_w=00000000 >>>>> 6: (7b) *(u64 *)(r10 -24) = r1 ; R1_w=P0 R10=fp0 fp-24_w=00000000 >>>>> 7: (7b) *(u64 *)(r10 -32) = r1 ; R1_w=P0 R10=fp0 fp-32_w=00000000 >>>>> ; if (prev->__state == TASK_RUNNING) >>>>> 8: (61) r1 = *(u32 *)(r2 +24) >>>>> R2 invalid mem access 'scalar' >>>>> processed 9 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 >>>>> -- END PROG LOAD LOG -- >>>>> libbpf: failed to load program 'handle__sched_switch' >>>>> libbpf: failed to load object 'runqslower_bpf' >>>>> libbpf: failed to load BPF skeleton 'runqslower_bpf': -13 >>>>> failed to load BPF object: -13 >>>>> >>>>> Update runqslower to fix this issue. Also, as we are on this, use BPF_PROG >>>>> in runqslower for cleaner code. >>>>> >>>>> Fixes: fa2c3254d7cf ("sched/tracing: Don't re-read p->state when emitting sched_switch event") >>>>> Signed-off-by: Song Liu <song@xxxxxxxxxx> >>>>> --- >>>>> tools/bpf/runqslower/runqslower.bpf.c | 19 +++++-------------- >>>>> 1 file changed, 5 insertions(+), 14 deletions(-) >>>>> >>>> >>>> It would be much less disruptive if that prev_state was added after >>>> "next", but oh well... >>> >>> Maybe we should change that. >>> >>> +Valentin and Steven, how about we change the order with the attached >>> diff (not the original patch in this thread, but the one at the end of >>> this email)? >>> >>>> >>>> But anyways, let's handle this in a way that can handle both old >>>> kernels and new ones and do the same change in libbpf-tool's >>>> runqslower ([0]). Can you please follow up there as well? >>> >>> Yeah, I will also fix that one. >> >> Thanks! >> >>> >>>> >>>> >>>> We can use BPF CO-RE to detect which order of arguments running kernel >>>> has by checking prev_state field existence in struct >>>> trace_event_raw_sched_switch. Can you please try that? Use >>>> bpf_core_field_exists() for that. >>> >>> Do you mean something like >>> >>> if (bpf_core_field_exists(ctx->prev_state)) >>> /* use ctx[2] and ctx[3] */ >>> else >>> /* use ctx[1] and ctx[2] */ >> >> yep, that's what I meant, except you don't have ctx->prev_state, you have to do: >> >> if (bpf_core_field_exists(((struct trace_event_raw_sched_switch >> *)0)->prev_state)) Actually, struct trace_event_raw_sched_switch is not the arguments we have for tp_btf. For both older and newer kernel, it is the same: struct trace_event_raw_sched_switch { struct trace_entry ent; char prev_comm[16]; pid_t prev_pid; int prev_prio; long int prev_state; char next_comm[16]; pid_t next_pid; int next_prio; char __data[0]; }; So I guess this check won't work? Thanks, Song