On 04/10/2024 13:13, Daniel Borkmann wrote:
> Jordan reported that when running Cilium with netkit in per-endpoint-routes
> mode, network policy misclassifies traffic. In this direct routing mode
> of Cilium which is used in case of GKE/EKS/AKS, the Pod's BPF program to
> enforce policy sits on the netkit primary device's egress side.
>
> The issue here is that in case of netkit's netkit_prep_forward(), it will
> clear meta data such as skb->mark and skb->priority before executing the
> BPF program. Thus, identity data stored in there from earlier BPF programs
> (e.g. from tcx ingress on the physical device) gets cleared instead of
> being made available for the primary's program to process. While for traffic
> egressing the Pod via the peer device this might be desired, this is
> different for the primary one where compared to tcx egress on the host
> veth this information would be available.
>
> To address this, add a new parameter for the device orchestration to
> allow control of skb->mark and skb->priority scrubbing, to make the two
> accessible from BPF (and eventually leave it up to the program to scrub).
> By default, the current behavior is retained. For netkit peer this also
> enables the use case where applications could cooperate/signal intent to
> the BPF program.
>
> Note that struct netkit has a 4 byte hole between policy and bundle which
> is used here, in other words, struct netkit's first cacheline content used
> in fast-path does not get moved around.
>
> Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device")
> Reported-by: Jordan Rife <jrife@xxxxxxxxxx>
> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>
> Link: https://github.com/cilium/cilium/issues/34042
> ---
> v1 -> v2:
>   - Use NLA_POLICY_MAX (Jakub)
>   - Document scrub behavior in if_link.h uapi header (Jakub)
>
>  drivers/net/netkit.c         | 68 +++++++++++++++++++++++++++++-------
>  include/uapi/linux/if_link.h | 15 ++++++++
>  2 files changed, 70 insertions(+), 13 deletions(-)
>

Acked-by: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>
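
For readers skimming the thread, the scrub control described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: struct pkt_meta stands in for the two sk_buff fields, and enum scrub_mode, prep_forward() and identity_visible() are hypothetical names invented for illustration rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sk_buff fields discussed in the patch. */
struct pkt_meta {
	uint32_t mark;     /* e.g. identity set by an earlier BPF program */
	uint32_t priority;
};

/* Hypothetical per-device setting; the names are illustrative only. */
enum scrub_mode {
	SCRUB_NONE,    /* keep mark/priority visible to the BPF program */
	SCRUB_DEFAULT, /* retained default: clear both before the program runs */
};

/* Models the forwarding prep step: scrub only when the device asks for it. */
void prep_forward(struct pkt_meta *m, enum scrub_mode mode)
{
	if (mode == SCRUB_DEFAULT) {
		m->mark = 0;
		m->priority = 0;
	}
}

/* True if a program running after prep would still see a non-zero mark. */
bool identity_visible(struct pkt_meta m, enum scrub_mode mode)
{
	prep_forward(&m, mode);
	return m.mark != 0;
}
```

In this model, a packet marked on ingress keeps its mark only when the device is configured with SCRUB_NONE, which mirrors the primary-device case from the report. With SCRUB_DEFAULT the mark is zeroed before the program sees it, matching the behavior the commit message says is retained by default.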