(sent again in plain text, sorry for the noise). Hi Martin. On Tue, Nov 29, 2022 at 3:58 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > On 11/28/22 8:05 AM, Eyal Birger wrote: > > This change adds xfrm metadata helpers using the unstable kfunc call > > interface for the TC-BPF hooks. This allows steering traffic towards > > different IPsec connections based on logic implemented in bpf programs. > > > > This object is built based on the availabilty of BTF debug info. > > > > The metadata percpu dsts used on TX take ownership of the original skb > > dsts so that they may be used as part of the xfrm transmittion logic - > > e.g. for MTU calculations. > > A few quick comments and questions: Thanks for your comments! > > > > > Signed-off-by: Eyal Birger <eyal.birger@xxxxxxxxx> > > --- > > include/net/dst_metadata.h | 1 + > > include/net/xfrm.h | 20 ++++++++ > > net/core/dst.c | 4 ++ > > net/xfrm/Makefile | 6 +++ > > net/xfrm/xfrm_interface_bpf.c | 92 ++++++++++++++++++++++++++++++++++ > > Please tag for bpf-next Sure. I wasn't totally sure which tree this belongs to. > > > net/xfrm/xfrm_interface_core.c | 15 ++++++ > > 6 files changed, 138 insertions(+) > > create mode 100644 net/xfrm/xfrm_interface_bpf.c > > > > diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h > > index a454cf4327fe..1b7fae4c6b24 100644 > > --- a/include/net/dst_metadata.h > > +++ b/include/net/dst_metadata.h > > @@ -26,6 +26,7 @@ struct macsec_info { > > struct xfrm_md_info { > > u32 if_id; > > int link; > > + struct dst_entry *dst_orig; > > }; > > > > struct metadata_dst { > > [ ... ] > > > diff --git a/net/core/dst.c b/net/core/dst.c > > index bc9c9be4e080..4c2eb7e56dab 100644 > > --- a/net/core/dst.c > > +++ b/net/core/dst.c > > @@ -315,6 +315,8 @@ void metadata_dst_free(struct metadata_dst *md_dst) > > #ifdef CONFIG_DST_CACHE > > if (md_dst->type == METADATA_IP_TUNNEL) > > dst_cache_destroy(&md_dst->u.tun_info.dst_cache); > > + else if (md_dst->type == METADATA_XFRM) > > + dst_release(md_dst->u.xfrm_info.dst_orig); > > Why only release dst_orig under CONFIG_DST_CACHE? It's a relic from a previous version where I'd used dst cache. Will move out of this ifdef. > > > #endif > > kfree(md_dst); > > } > > @@ -348,6 +350,8 @@ void metadata_dst_free_percpu(struct metadata_dst __percpu *md_dst) > > > > if (one_md_dst->type == METADATA_IP_TUNNEL) > > dst_cache_destroy(&one_md_dst->u.tun_info.dst_cache); > > + else if (one_md_dst->type == METADATA_XFRM) > > + dst_release(one_md_dst->u.xfrm_info.dst_orig); > > Same here. Likewise. > > [ ... ] > > > diff --git a/net/xfrm/xfrm_interface_bpf.c b/net/xfrm/xfrm_interface_bpf.c > > new file mode 100644 > > index 000000000000..d3997ab7cc28 > > --- /dev/null > > +++ b/net/xfrm/xfrm_interface_bpf.c > > @@ -0,0 +1,92 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* Unstable XFRM Helpers for TC-BPF hook > > + * > > + * These are called from SCHED_CLS BPF programs. Note that it is > > + * allowed to break compatibility for these functions since the interface they > > + * are exposed through to BPF programs is explicitly unstable. > > + */ > > + > > +#include <linux/bpf.h> > > +#include <linux/btf_ids.h> > > + > > +#include <net/dst_metadata.h> > > +#include <net/xfrm.h> > > + > > +struct bpf_xfrm_info { > > + u32 if_id; > > + int link; > > +}; > > + > > +static struct metadata_dst __percpu *xfrm_md_dst; > > +__diag_push(); > > +__diag_ignore_all("-Wmissing-prototypes", > > + "Global functions as their definitions will be in xfrm_interface BTF"); > > + > > +__used noinline > > +int bpf_skb_get_xfrm_info(struct __sk_buff *skb_ctx, struct bpf_xfrm_info *to) > > +{ > > + struct sk_buff *skb = (struct sk_buff *)skb_ctx; > > + struct xfrm_md_info *info; > > + > > + memset(to, 0, sizeof(*to)); > > + > > + info = skb_xfrm_md_info(skb); > > + if (!info) > > + return -EINVAL; > > + > > + to->if_id = info->if_id; > > + to->link = info->link; > > + return 0; > > +} > > + > > +__used noinline > > +int bpf_skb_set_xfrm_info(struct __sk_buff *skb_ctx, > > + const struct bpf_xfrm_info *from) > > +{ > > + struct sk_buff *skb = (struct sk_buff *)skb_ctx; > > + struct metadata_dst *md_dst; > > + struct xfrm_md_info *info; > > + > > + if (unlikely(skb_metadata_dst(skb))) > > + return -EINVAL; > > + > > + md_dst = this_cpu_ptr(xfrm_md_dst); > > + > > + info = &md_dst->u.xfrm_info; > > + memset(info, 0, sizeof(*info)); > > + > > + info->if_id = from->if_id; > > + info->link = from->link; > > + info->dst_orig = skb_dst(skb); > However, the dst_orig init is not done under CONFIG_DST_CACHE though... > > Also, is it possible that skb->_skb_refdst has SKB_DST_NOREF set and later below > ... (contd) Nice catch! will force dst is refcounted. > > > + > > + dst_hold((struct dst_entry *)md_dst); > > + skb_dst_set(skb, (struct dst_entry *)md_dst); > > + return 0; > > +} > > + > > +__diag_pop() > > + > > +BTF_SET8_START(xfrm_ifc_kfunc_set) > > +BTF_ID_FLAGS(func, bpf_skb_get_xfrm_info) > > +BTF_ID_FLAGS(func, bpf_skb_set_xfrm_info) > > +BTF_SET8_END(xfrm_ifc_kfunc_set) > > + > > +static const struct btf_kfunc_id_set xfrm_interface_kfunc_set = { > > + .owner = THIS_MODULE, > > + .set = &xfrm_ifc_kfunc_set, > > +}; > > + > > +int __init register_xfrm_interface_bpf(void) > > +{ > > + xfrm_md_dst = metadata_dst_alloc_percpu(0, METADATA_XFRM, > > + GFP_KERNEL); > > + if (!xfrm_md_dst) > > + return -ENOMEM; > > + return register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, > > + &xfrm_interface_kfunc_set); > > Will cleanup_xfrm_interface_bpf() be called during error ? No. Will fix in v2. Thanks! Eyal.