On Thu, Aug 22, 2024 at 10:38 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > On 8/22/24 6:47 AM, Alexei Starovoitov wrote: > > On Wed, Aug 21, 2024 at 11:10 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > >> > >> On 8/21/24 6:32 PM, Alexei Starovoitov wrote: > >>> On Wed, Aug 21, 2024 at 4:35 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > >>>> > >>>> From: Martin KaFai Lau <martin.lau@xxxxxxxxxx> > >>>> > >>>> The existing prologue has been able to call bpf helper but not a kfunc. > >>>> This patch allows the prologue/epilogue to call the kfunc. > >>>> > >>>> The subsystem that implements the .gen_prologue and .gen_epilogue > >>>> can add the BPF_PSEUDO_KFUNC_CALL instruction with insn->imm > >>>> set to the btf func_id of the kfunc call. This part is the same > >>>> as the bpf prog loaded from the sys_bpf. > >>> > >>> I don't understand the value of this feature, since it seems > >>> pretty hard to use. > >>> The module (qdisc-bpf or else) would need to do something > >>> like patch 8/8: > >>> +BTF_ID_LIST(st_ops_epilogue_kfunc_list) > >>> +BTF_ID(func, bpf_kfunc_st_ops_inc10) > >>> +BTF_ID(func, bpf_kfunc_st_ops_inc100) > >>> > >>> just to be able to: > >>> BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, > >>> st_ops_epilogue_kfunc_list[0]); > >>> > >>> So a bunch of extra work on the module side and > >>> a bunch of work in this patch to enable such a pattern, > >>> but what is the value? > >>> > >>> gen_epilogue() can call arbitrary kernel function. > >>> It doesn't have to be a helper. > >>> kfunc-s provide calling convention conversion from bpf to native, > >>> but the same thing is achieved by BPF_CALL_N macro. > >>> The module can use that macro without adding an actual bpf helper > >>> to uapi bpf.h. > >>> Then in gen_epilogue() the extra bpf insn can use: > >>> BPF_EMIT_CALL(module_provided_helper_that_is_not_helper) > >>> which will use > >>> BPF_CALL_IMM(x) ((void *)(x) - (void *)__bpf_call_base) > >> > >> BPF_EMIT_CALL() was my earlier thought. I switched to the kfunc in this patch > >> because of the bpf_jit_supports_far_kfunc_call() support for the kernel module. > >> Using kfunc call will make supporting it the same. > > > > I believe far calls are typically slower, > > so it may be a foot gun. > > If something like qdisc-bpf adding a function call to bpf_exit > > it will be called every time the program is called, so > > it needs to be really fast. > > Allowing such callable funcs in modules may be a performance issue > > that we'd need to fix. > > So imo making a design requirement that such funcs for gen_epilogoue() > > need to be in kernel text is a good thing. > > Agreed. Make sense. > > > > >> I think the future bpf-qdisc can enforce built-in. bpf-tcp-cc has already been > >> built-in only also. I think the hid_bpf is built-in only also. > > > > I don't think hid_bpf has any need for such gen_epilogue() adjustment. > > tcp-bpf-cc probably doesn't need it either. > > it's cleaner to fix up on the kernel side, no? > > tcp-bpf-cc can use it to fix snd_cwnd. We have seen a mistake that snd_cwnd was > set to 0 (or negative, can't remember which one). >1 ops of the > tcp_congestion_ops may update the snd_cwnd, so there will be multiple places it > needs to do an extra check/fix in the kernel. It is usually not the fast path, > so may be ok. > > It is not catastrophic as skb->dev. kfunc was not introduced at that time also. > Otherwise, having a kfunc to set the snd_cwnd instead could have been an option. > > > qdisc-bpf and ->dev stuff is probably the only upcoming user. > > For skb->dev, may be having a dedicated kfuncs for skb->dev manipulation is the > way to go? The example could be operations that need to touch the > skb->rbnode/dev sharing pointer. > > For fixing ->dev in the kernel, there are multiple places doing ->dequeue and > not sure if we need to include the child->dequeue also. This fixing could be > refactored to a kernel function and probably need to a static key in this fast > path case. > > > And that's a separate discussion. I'm not sure such gen_epilogoue() > > concept is really that great. > > Especially considering all the complexity involved. > > I am curious on the problem you pointed out at patch 1 regardless, I am going to > give it a try and remove the kfunc call. I made kfunc call separated at patch 7 > and 8 :) > > If it still looks too complex or there is no value on gen_epilogue, I am fine to > table this set. I think the patches 1-6 are fine and good to go. Mainly because they simplify landing of qdisc-bpf. Once all pieces are there we may revisit the need for gen_epilogoue() and whether there is an alternative. I would only drop 7 and 8 for now until it's absolutely needed.