On Thu, Mar 26, 2020 at 09:25:52PM -0700, Joe Stringer wrote:
> Add support for TPROXY via a new bpf helper, bpf_sk_assign().
>
> This helper requires the BPF program to discover the socket via a call
> to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
> helper takes its own reference to the socket in addition to any existing
> reference that may or may not currently be obtained for the duration of
> BPF processing. For the destination socket to receive the traffic, the
> traffic must be routed towards that socket via local route. The
> simplest example route is below, but in practice you may want to route
> traffic more narrowly (eg by CIDR):
>
>   $ ip route add local default dev lo
>
> This patch avoids trying to introduce an extra bit into the skb->sk, as
> that would require more invasive changes to all code interacting with
> the socket to ensure that the bit is handled correctly, such as all
> error-handling cases along the path from the helper in BPF through to
> the orphan path in the input. Instead, we opt to use the destructor
> variable to switch on the prefetch of the socket.
>
> Signed-off-by: Joe Stringer <joe@xxxxxxxxxxx>
> ---
> v3: Check skb_sk_is_prefetched() in TC level redirect check
> v2: Use skb->destructor to determine socket prefetch usage instead of
>       introducing a new metadata_dst
>     Restrict socket assign to same netns as TC device
>     Restrict assigning reuseport sockets
>     Adjust commit wording
> v1: Initial version
> ---
>  include/net/sock.h             |  7 +++++++
>  include/uapi/linux/bpf.h       | 25 ++++++++++++++++++++++++-
>  net/core/filter.c              | 31 +++++++++++++++++++++++++++++++
>  net/core/sock.c                |  9 +++++++++
>  net/ipv4/ip_input.c            |  3 ++-
>  net/ipv6/ip6_input.c           |  3 ++-
>  net/sched/act_bpf.c            |  3 +++
>  tools/include/uapi/linux/bpf.h | 25 ++++++++++++++++++++++++-
>  8 files changed, 102 insertions(+), 4 deletions(-)
>

[ ... ]

> diff --git a/net/core/sock.c b/net/core/sock.c
> index 0fc8937a7ff4..cfaf60267360 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2071,6 +2071,15 @@ void sock_efree(struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(sock_efree);
>
> +/* Buffer destructor for prefetch/receive path where reference count may
> + * not be held, e.g. for listen sockets.
> + */
> +void sock_pfree(struct sk_buff *skb)
> +{
> +	sock_edemux(skb);
Nit: maybe call sock_gen_put() directly here.

> +}
> +EXPORT_SYMBOL(sock_pfree);
> +
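
To make the nit concrete, here is a rough userspace sketch of the shape I have in mind. It assumes sock_edemux() is only a thin wrapper around sock_gen_put(skb->sk). The struct definitions, the sock_gen_put() stub and the sock_pfree_demo() helper below are hypothetical stand-ins so the snippet builds outside the kernel; they are not the real kernel definitions.

```c
/* Userspace stand-ins for illustration only; these are NOT the kernel's
 * struct sock / struct sk_buff definitions.
 */
struct sock { int refcnt; };
struct sk_buff { struct sock *sk; };

/* Stub that only models "drop one reference". The real sock_gen_put()
 * also handles freeing the socket, which is left out here.
 */
static void sock_gen_put(struct sock *sk)
{
	sk->refcnt--;
}

/* Destructor as suggested in the nit: skip the sock_edemux() wrapper
 * and drop the reference on skb->sk directly.
 */
static void sock_pfree(struct sk_buff *skb)
{
	sock_gen_put(skb->sk);
}

/* Hypothetical helper: run the destructor once on a socket holding
 * initial_refs references and return how many are left.
 */
static int sock_pfree_demo(int initial_refs)
{
	struct sock sk = { .refcnt = initial_refs };
	struct sk_buff skb = { .sk = &sk };

	sock_pfree(&skb);
	return sk.refcnt;
}
```

Behaviour should be the same either way if the wrapper assumption holds; the direct call just saves one level of indirection when reading the destructor.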