On Mon, Sep 19, 2016 at 09:19:10PM +0200, Pablo Neira Ayuso wrote:
> On Mon, Sep 19, 2016 at 06:44:00PM +0200, Daniel Mack wrote:
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index 6001e78..5dc90aa 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -39,6 +39,7 @@
> >  #include <linux/module.h>
> >  #include <linux/slab.h>
> >
> > +#include <linux/bpf-cgroup.h>
> >  #include <linux/netfilter.h>
> >  #include <linux/netfilter_ipv6.h>
> >
> > @@ -143,6 +144,7 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >  {
> >  	struct net_device *dev = skb_dst(skb)->dev;
> >  	struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
> > +	int ret;
> >
> >  	if (unlikely(idev->cnf.disable_ipv6)) {
> >  		IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
> > @@ -150,6 +152,12 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >  		return 0;
> >  	}
> >
> > +	ret = cgroup_bpf_run_filter(sk, skb, BPF_CGROUP_INET_EGRESS);
> > +	if (ret) {
> > +		kfree_skb(skb);
> > +		return ret;
> > +	}
>
> 1) If your goal is to filter packets, why so late? The sooner you
> enforce your policy, the less cycles you waste.
>
> Actually, did you look at Google's approach to this problem? They
> want to control this at socket level, so you restrict what the process
> can actually bind. That is enforcing the policy way before you even
> send packets. On top of that, what they submitted is infrastructured
> so any process with CAP_NET_ADMIN can access that policy that is being
> applied and fetch a readable policy through kernel interface.
>
> 2) This will turn the stack into a nightmare to debug I predict. If
> any process with CAP_NET_ADMIN can potentially attach bpf blobs
> via these hooks, we will have to include in the network stack

a process without CAP_NET_ADMIN can attach bpf blobs to system calls
via seccomp. bpf is already used for security and policing.
> traveling documentation something like: "Probably you have to check
> that your orchestrator is not dropping your packets for some
> reason". So I wonder how users will debug this and how the policy that
> your orchestrator applies will be exposed to userspace.

as far as bpf debuggability/visibility goes, there are various efforts
under way:

for kernel side:
- ksym for jit-ed programs
- hash sum for prog code
- compact type information for maps and various pretty printers
- data flow analysis of the programs

for user space:
- from bpf asm reconstruct the program in the high level language
  (there is p4 to bpf, this effort is about bpf to p4)