On Mon, Sep 19, 2016 at 09:30:02PM +0200, Daniel Mack wrote:
> On 09/19/2016 09:19 PM, Pablo Neira Ayuso wrote:
> > On Mon, Sep 19, 2016 at 06:44:00PM +0200, Daniel Mack wrote:
> >> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> >> index 6001e78..5dc90aa 100644
> >> --- a/net/ipv6/ip6_output.c
> >> +++ b/net/ipv6/ip6_output.c
> >> @@ -39,6 +39,7 @@
> >>  #include <linux/module.h>
> >>  #include <linux/slab.h>
> >>
> >> +#include <linux/bpf-cgroup.h>
> >>  #include <linux/netfilter.h>
> >>  #include <linux/netfilter_ipv6.h>
> >>
> >> @@ -143,6 +144,7 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >>  {
> >>      struct net_device *dev = skb_dst(skb)->dev;
> >>      struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
> >> +    int ret;
> >>
> >>      if (unlikely(idev->cnf.disable_ipv6)) {
> >>          IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
> >> @@ -150,6 +152,12 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> >>          return 0;
> >>      }
> >>
> >> +    ret = cgroup_bpf_run_filter(sk, skb, BPF_CGROUP_INET_EGRESS);
> >> +    if (ret) {
> >> +        kfree_skb(skb);
> >> +        return ret;
> >> +    }
> >
> > 1) If your goal is to filter packets, why so late? The sooner you
> > enforce your policy, the less cycles you waste.
> >
> > Actually, did you look at Google's approach to this problem? They
> > want to control this at socket level, so you restrict what the process
> > can actually bind. That is enforcing the policy way before you even
> > send packets. On top of that, what they submitted is infrastructured
> > so any process with CAP_NET_ADMIN can access that policy that is being
> > applied and fetch a readable policy through kernel interface.
>
> Yes, I've seen what they propose, but I want this approach to support
> accounting, and so the code has to look at each and every packet in
> order to count bytes and packets. Do you know of any better place to put
> the hook then?

Accounting is part of the use case that fits into the "network
introspection" idea that has been mentioned here, so you can achieve
this by adding a hook that returns no verdict; that makes it similar
to the tracing infrastructure.

> That said, I can well imagine more hooks types that also operate at port
> bind time. That would be easy to add on top.

Filtering packets with cgroups is braindead. You have the means to
ensure that processes send no packets by restricting port binding, so
there is no reason to do this any later for locally generated traffic.
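
To make the "hook that returns no verdict" idea concrete, here is a
rough user-space sketch in C. It is not kernel code and not what either
patch set implements: struct pkt, struct acct_stats, acct_hook(),
filter_hook() and output_path() are all hypothetical names made up for
illustration. Only the "nonzero return means drop" convention is taken
from the quoted ip6_output() hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet; NOT the kernel's struct sk_buff. */
struct pkt {
    size_t   len;    /* payload length in bytes */
    uint16_t dport;  /* destination port */
};

/* Hypothetical per-cgroup counters. */
struct acct_stats {
    uint64_t packets;
    uint64_t bytes;
};

/*
 * Introspection-style hook: it observes the packet and updates the
 * counters. It returns void, so it cannot drop or reject anything.
 */
static void acct_hook(struct acct_stats *stats, const struct pkt *p)
{
    assert(stats != NULL && p != NULL);
    stats->packets++;
    stats->bytes += p->len;
}

/*
 * Filter-style hook, for contrast: it returns a verdict, where 0 means
 * "pass" and a negative errno means "drop", as in the quoted hunk.
 */
static int filter_hook(const struct pkt *p, uint16_t blocked_port)
{
    assert(p != NULL);
    return p->dport == blocked_port ? -EPERM : 0;
}

/*
 * Hypothetical output path using only the accounting hook: every
 * packet is counted and every packet is delivered.
 */
static int output_path(struct acct_stats *stats, const struct pkt *p)
{
    acct_hook(stats, p);
    return 0;
}
```

With only acct_hook() on the output path, the counters advance for
every packet and delivery is never affected. Any drop decision would
have to be made elsewhere, for example at bind time.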