On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote: > The bpf_fib_lookup() helper performs a neighbour lookup for the destination > IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation > that the BPF program will pass the packet up the stack in this case. > However, with the addition of bpf_redirect_neigh() that can be used instead > to perform the neighbour lookup, at the cost of a bit of duplicated work. > > For that we still need the target ifindex, and since bpf_fib_lookup() > already has that at the time it performs the neighbour lookup, there is > really no reason why it can't just return it in any case. So let's just > always return the ifindex, and also add a flag that lets the caller turn > off the neighbour lookup entirely in bpf_fib_lookup(). seems really odd to do the fib lookup only to skip the neighbor lookup and defer to a second helper to do a second fib lookup and send out. The better back-to-back calls is to return the ifindex and gateway on successful fib lookup regardless of valid neighbor. If the call to bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup and just redirect to the given nexthop address + ifindex. ie., bpf_redirect_neigh only does neighbor handling in this case. > > v2: > - Add flag (Daniel) > - Remove misleading code example from commit message (David) > > Cc: David Ahern <dsahern@xxxxxxxxx> > Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> > --- > include/uapi/linux/bpf.h | 10 ++++++---- > net/core/filter.c | 15 ++++++++++++--- > tools/include/uapi/linux/bpf.h | 10 ++++++---- > 3 files changed, 24 insertions(+), 11 deletions(-) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index d83561e8cd2c..9c7c10ce7ace 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -4813,12 +4813,14 @@ struct bpf_raw_tracepoint_args { > __u64 args[0]; > }; > > -/* DIRECT: Skip the FIB rules and go to FIB table associated with device > - * OUTPUT: Do lookup from egress perspective; default is ingress > +/* DIRECT: Skip the FIB rules and go to FIB table associated with device > + * OUTPUT: Do lookup from egress perspective; default is ingress > + * SKIP_NEIGH: Skip neighbour lookup and return BPF_FIB_LKUP_RET_NO_NEIGH on success > */ > enum { > - BPF_FIB_LOOKUP_DIRECT = (1U << 0), > - BPF_FIB_LOOKUP_OUTPUT = (1U << 1), > + BPF_FIB_LOOKUP_DIRECT = (1U << 0), > + BPF_FIB_LOOKUP_OUTPUT = (1U << 1), > + BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), > }; > > enum { > diff --git a/net/core/filter.c b/net/core/filter.c > index 05df73780dd3..1038337bc06c 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -5192,7 +5192,6 @@ static int bpf_fib_set_fwd_params(struct bpf_fib_lookup *params, > memcpy(params->smac, dev->dev_addr, ETH_ALEN); > params->h_vlan_TCI = 0; > params->h_vlan_proto = 0; > - params->ifindex = dev->ifindex; > > return 0; > } > @@ -5289,6 +5288,10 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params, > dev = nhc->nhc_dev; > > params->rt_metric = res.fi->fib_priority; > + params->ifindex = dev->ifindex; > + > + if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) > + return BPF_FIB_LKUP_RET_NO_NEIGH; > > /* xdp and cls_bpf programs are run in RCU-bh so > * rcu_read_lock_bh is not needed here > @@ -5414,6 +5417,10 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params, > > dev = res.nh->fib_nh_dev; > params->rt_metric = res.f6i->fib6_metric; > + params->ifindex = dev->ifindex; > + > + if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH) > + return BPF_FIB_LKUP_RET_NO_NEIGH; > > /* xdp and cls_bpf programs are run in RCU-bh so rcu_read_lock_bh is > * not needed here. > @@ -5432,7 +5439,8 @@ BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx, > if (plen < sizeof(*params)) > return -EINVAL; > > - if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT)) > + if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | > + BPF_FIB_LOOKUP_SKIP_NEIGH)) > return -EINVAL; > > switch (params->family) { > @@ -5469,7 +5477,8 @@ BPF_CALL_4(bpf_skb_fib_lookup, struct sk_buff *, skb, > if (plen < sizeof(*params)) > return -EINVAL; > > - if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT)) > + if (flags & ~(BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | > + BPF_FIB_LOOKUP_SKIP_NEIGH)) > return -EINVAL; > > switch (params->family) { > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index d83561e8cd2c..9c7c10ce7ace 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -4813,12 +4813,14 @@ struct bpf_raw_tracepoint_args { > __u64 args[0]; > }; > > -/* DIRECT: Skip the FIB rules and go to FIB table associated with device > - * OUTPUT: Do lookup from egress perspective; default is ingress > +/* DIRECT: Skip the FIB rules and go to FIB table associated with device > + * OUTPUT: Do lookup from egress perspective; default is ingress > + * SKIP_NEIGH: Skip neighbour lookup and return BPF_FIB_LKUP_RET_NO_NEIGH on success > */ > enum { > - BPF_FIB_LOOKUP_DIRECT = (1U << 0), > - BPF_FIB_LOOKUP_OUTPUT = (1U << 1), > + BPF_FIB_LOOKUP_DIRECT = (1U << 0), > + BPF_FIB_LOOKUP_OUTPUT = (1U << 1), > + BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2), > }; > > enum { >