On 26/08/2022 13:18, Eyal Birger wrote: > On Fri, Aug 26, 2022 at 11:05 AM Nicolas Dichtel > <nicolas.dichtel@xxxxxxxxx> wrote: >> >> >> Le 25/08/2022 à 17:46, Eyal Birger a écrit : >>> Allow specifying the xfrm interface if_id and link as part of a route >>> metadata using the lwtunnel infrastructure. >>> >>> This allows for example using a single xfrm interface in collect_md >>> mode as the target of multiple routes each specifying a different if_id. >>> >>> With the appropriate changes to iproute2, considering an xfrm device >>> ipsec1 in collect_md mode one can for example add a route specifying >>> an if_id like so: >>> >>> ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 >>> >>> In which case traffic routed to the device via this route would use >>> if_id in the xfrm interface policy lookup. >>> >>> Or in the context of vrf, one can also specify the "link" property: >>> >>> ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 link_dev eth15 >>> >>> Signed-off-by: Eyal Birger <eyal.birger@xxxxxxxxx> >>> >>> ---- >>> >>> v3: netlink improvements as suggested by Nikolay Aleksandrov and >>> Nicolas Dichtel >>> >>> v2: >>> - move lwt_xfrm_info() helper to dst_metadata.h >>> - add "link" property as suggested by Nicolas Dichtel >>> --- >>> include/net/dst_metadata.h | 11 +++++ >>> include/uapi/linux/lwtunnel.h | 10 +++++ >>> net/core/lwtunnel.c | 1 + >>> net/xfrm/xfrm_interface.c | 85 +++++++++++++++++++++++++++++++++++ >>> 4 files changed, 107 insertions(+) >>> >>> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h >>> index e4b059908cc7..57f75960fa28 100644 >>> --- a/include/net/dst_metadata.h >>> +++ b/include/net/dst_metadata.h >>> @@ -60,13 +60,24 @@ skb_tunnel_info(const struct sk_buff *skb) >>> return NULL; >>> } >>> >>> +static inline struct xfrm_md_info *lwt_xfrm_info(struct lwtunnel_state *lwt) >>> +{ >>> + return (struct xfrm_md_info *)lwt->data; >>> +} >>> + >>> static inline struct xfrm_md_info *skb_xfrm_md_info(const struct sk_buff *skb) >>> { >>> struct metadata_dst *md_dst = skb_metadata_dst(skb); >>> + struct dst_entry *dst; >>> >>> if (md_dst && md_dst->type == METADATA_XFRM) >>> return &md_dst->u.xfrm_info; >>> >>> + dst = skb_dst(skb); >>> + if (dst && dst->lwtstate && >>> + dst->lwtstate->type == LWTUNNEL_ENCAP_XFRM) >>> + return lwt_xfrm_info(dst->lwtstate); >>> + >>> return NULL; >>> } >>> >>> diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h >>> index 2e206919125c..229655ef792f 100644 >>> --- a/include/uapi/linux/lwtunnel.h >>> +++ b/include/uapi/linux/lwtunnel.h >>> @@ -15,6 +15,7 @@ enum lwtunnel_encap_types { >>> LWTUNNEL_ENCAP_SEG6_LOCAL, >>> LWTUNNEL_ENCAP_RPL, >>> LWTUNNEL_ENCAP_IOAM6, >>> + LWTUNNEL_ENCAP_XFRM, >>> __LWTUNNEL_ENCAP_MAX, >>> }; >>> >>> @@ -111,4 +112,13 @@ enum { >>> >>> #define LWT_BPF_MAX_HEADROOM 256 >>> >>> +enum { >>> + LWT_XFRM_UNSPEC, >>> + LWT_XFRM_IF_ID, >>> + LWT_XFRM_LINK, >>> + __LWT_XFRM_MAX, >>> +}; >>> + >>> +#define LWT_XFRM_MAX (__LWT_XFRM_MAX - 1) >>> + >>> #endif /* _UAPI_LWTUNNEL_H_ */ >>> diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c >>> index 9ccd64e8a666..6fac2f0ef074 100644 >>> --- a/net/core/lwtunnel.c >>> +++ b/net/core/lwtunnel.c >>> @@ -50,6 +50,7 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type) >>> return "IOAM6"; >>> case LWTUNNEL_ENCAP_IP6: >>> case LWTUNNEL_ENCAP_IP: >>> + case LWTUNNEL_ENCAP_XFRM: >>> case LWTUNNEL_ENCAP_NONE: >>> case __LWTUNNEL_ENCAP_MAX: >>> /* should not have got here */ >>> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c >>> index e9a355047468..495dee8b0764 100644 >>> --- a/net/xfrm/xfrm_interface.c >>> +++ b/net/xfrm/xfrm_interface.c >>> @@ -60,6 +60,88 @@ struct xfrmi_net { >>> struct xfrm_if __rcu *collect_md_xfrmi; >>> }; >>> >>> +static const struct nla_policy xfrm_lwt_policy[LWT_XFRM_MAX + 1] = { >>> + [LWT_XFRM_IF_ID] = NLA_POLICY_MIN(NLA_U32, 1), >>> + [LWT_XFRM_LINK] = NLA_POLICY_MIN(NLA_S32, 1), >> IMHO, it would be better to keep consistency with IFLA_XFRM_LINK. >> >> $ git grep _LINK.*NLA_U32 net/ drivers/net/ >> drivers/net/gtp.c: [GTPA_LINK] = { .type = NLA_U32, }, >> drivers/net/vxlan/vxlan_core.c: [IFLA_VXLAN_LINK] = { .type = NLA_U32 }, >> ... >> net/core/rtnetlink.c: [IFLA_LINK] = { .type = NLA_U32 }, >> ... >> net/ipv4/ip_gre.c: [IFLA_GRE_LINK] = { .type = NLA_U32 }, >> net/ipv4/ip_vti.c: [IFLA_VTI_LINK] = { .type = NLA_U32 }, >> net/ipv4/ipip.c: [IFLA_IPTUN_LINK] = { .type = NLA_U32 }, >> net/ipv6/ip6_gre.c: [IFLA_GRE_LINK] = { .type = NLA_U32 }, >> net/ipv6/ip6_tunnel.c: [IFLA_IPTUN_LINK] = { .type = NLA_U32 }, >> net/ipv6/ip6_vti.c: [IFLA_VTI_LINK] = { .type = NLA_U32 }, >> net/ipv6/sit.c: [IFLA_IPTUN_LINK] = { .type = NLA_U32 }, >> net/sched/cls_u32.c: [TCA_U32_LINK] = { .type = NLA_U32 }, >> ... >> net/xfrm/xfrm_interface.c: [IFLA_XFRM_LINK] = { .type = NLA_U32 }, >> $ git grep _LINK.*NLA_S32 net/ drivers/net/ >> net/core/rtnetlink.c: [IFLA_LINK_NETNSID] = { .type = NLA_S32 }, >> $ >> >> They all are U32. Adding one S32 would just add confusion. > > Thanks for this input! > > Indeed going over the other references it seems ifindex is treated as U32 > when interfacing with userspace almost everywhere including netlink and > bpf. In the IOCTL interface it seems to be implemented as int, but at > least on my Ubuntu machine the manpage for e.g. if_nametoindex() describes > it as returning unsigned int. > > Therefore I intend to resubmit this as U32. > > Thanks, > Eyal. Ack, good point, note that ifindex is not always a u32 and ifindex itself is usually a signed integer field in structs (e.g. net_device), as well as flowic_oif (flowi_oif). :) rtnetlink.c: [IFLA_NEW_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 1), rtnetlink.c: [NDA_IFINDEX] = NLA_POLICY_MIN(NLA_S32, 1), Just using the old U32 code and making link a u32 should be ok. Cheers, Nik