Yonghong Song wrote:
> Currently, a non-tracing bpf program typically has a single 'context' argument
> with predefined uapi struct type. Following these uapi struct, user is able
> to access other fields defined in uapi header. Inside the kernel, the
> user-seen 'context' argument is replaced with 'kernel context' (or 'kctx'
> in short) which can access more information than what uapi header provides.
> To access other info not in uapi header, people typically do two things:
>   (1). extend uapi to access more fields rooted from 'context'.
>   (2). use bpf_probe_read_kernel() helper to read particular field based on
>        kctx.
> Using (1) needs uapi change and using (2) makes code more complex since
> direct memory access is not allowed.
>
> There are already a few instances trying to access more information from
> kctx:
>   . trying to access some fields from perf_event kctx ([1]).
>   . trying to access some fields from xdp kctx ([2]).
>
> This patch set tried to allow direct memory access for kctx fields
> by introducing bpf_cast_to_kern_ctx() kfunc.
>
> Martin mentioned a use case like type casting below:
>   #define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
> basically a 'unsigned char *' casted to 'struct skb_shared_info *'. This patch
> set tries to support such a use case as well with bpf_rdonly_cast().
>
> For the patch series, Patch 1 added support for a kfunc available to all
> prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added
> bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests.
>
>   [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@xxxxxxxx/
>   [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@xxxxxxxxx/
>
> Changelog:
>   v3 -> v4:
>     - remove unnecessary bpf_ctx_convert.t error checking
>     - add and use meta.ret_btf_id instead of meta.arg_constant.value for
>       bpf_cast_to_kern_ctx().
>     - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx().
>   v2 -> v3:
>     - rebase on top of bpf-next (for merging conflicts)
>     - add the selftest to s390x deny list
>   rfcv1 -> v2:
>     - break original one kfunc into two.
>     - add missing error checks and error logs.
>     - adapt to the new conventions in
>       https://lore.kernel.org/all/20221118015614.2013203-1-memxor@xxxxxxxxx/
>       for example, with __ign and __k suffix.
>     - added support in fixup_kfunc_call() to replace kfunc calls with a single mov.
>
> Yonghong Song (4):
>   bpf: Add support for kfunc set with common btf_ids
>   bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx
>   bpf: Add a kfunc for generic type cast
>   bpf: Add type cast unit tests

Thanks Yonghong! Ack for the series from me, but looks like Alexei is
quick.

From my side this allows us to pull in the dev info and from that get
the netns, so it fixes a gap where we had to split the logic into a
kprobe + xdp.

If we can get a pointer to the recv queue then with a few reads we get
the hash, vlan, etc. (see the timestamp thread).

And then the last bit is, if we can get a ptr to the net ns list, plus
the rcu patch, we can build the net ns iterator directly in BPF, which
seems stronger than an iterator IMO because we can kick it off on events
anywhere in the kernel. Or, based on an event, kick off some specific
iterator (e.g. walk net_devs in netns X with SR-IOV interfaces).

Ideally we would also wire it up to timers so we can call it every N
seconds without any user space intervention. Eventually, it's nice if
user space can crash, restart, and so on without impacting the logic
in the kernel.

Thanks again.
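
For anyone following along who has not looked at the skb_shinfo() cast
quoted in the cover letter, here is a minimal userspace sketch of the
pattern: a byte pointer computed from the buffer head plus an end offset,
reinterpreted as a struct pointer and then only read. Everything prefixed
with mock_ is a hypothetical, heavily simplified stand-in and not the
kernel's real layout. In a BPF program the plain C cast would not be
accepted as-is; per the cover letter that is the gap bpf_rdonly_cast() is
meant to fill.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct skb_shared_info (assumption: only
 * two illustrative fields, not the real kernel layout). */
struct mock_shared_info {
	uint8_t  nr_frags;
	uint16_t gso_size;
};

/* Hypothetical stand-in for struct sk_buff: a buffer head plus the
 * offset at which the shared info area begins. */
struct mock_skb {
	unsigned char *head;
	unsigned int   end;
};

/* Mirrors the idea of skb_end_pointer(): returns a plain byte pointer. */
static inline unsigned char *mock_skb_end_pointer(const struct mock_skb *skb)
{
	return skb->head + skb->end;
}

/* Same shape as the macro quoted in the cover letter: an
 * 'unsigned char *' cast to a struct pointer. The caller must make sure
 * head + end is suitably aligned for struct mock_shared_info. */
#define mock_skb_shinfo(SKB) \
	((struct mock_shared_info *)(mock_skb_end_pointer(SKB)))

/* Read-only access through the cast pointer. */
static inline unsigned int mock_read_gso_size(const struct mock_skb *skb)
{
	const struct mock_shared_info *shinfo = mock_skb_shinfo(skb);

	return shinfo->gso_size;
}
```

To try it, allocate a buffer with malloc(), point head at it, pick an
aligned end offset, fill in the shared info through mock_skb_shinfo(),
and read it back with mock_read_gso_size().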