Re: Override default socket policy per cgroup

Martin KaFai Lau <kafai@xxxxxx> · Wed, 9 Feb 2022 15:40:33 -0800



On Wed, Feb 09, 2022 at 01:51:38PM -0800, sdf@xxxxxxxxxx wrote:
> On 02/09, Martin KaFai Lau wrote:
> > On Wed, Feb 09, 2022 at 09:03:45AM -0800, sdf@xxxxxxxxxx wrote:
> > > Let's say I want to set some default sk_priority for all sockets in a
> > > specific cgroup. I can do it right now using cgroup/sock_create, but it
> > > applies only to AF_INET{,6} sockets. I'd like to do the same for raw
> > > (AF_PACKET) sockets and cgroup/sock_create doesn't trigger for them :-(
> > Other than AF_PACKET and INET[6], do you have use cases for other
> > families?
> 
> No, I only need AF_PACKET for now. But I feel like we should create
> a more extensible hook point this time (if we go this route).
> 
> > > (1) My naive approach would be to add another cgroup/sock_post_create
> > > which runs late from __sock_create and triggers on everything.
> > >
> > > (2) Another approach might be to move BPF_CGROUP_RUN_PROG_INET_SOCK and
> > > make it work with AF_PACKET. This might not be 100% backwards compatible
> > > but I'd assume that most users should look at the socket family before
> > > doing anything. (in this case it feels like we can extend
> > > sock_bind/release for af_packets as well, just for accounting purposes,
> > > without any way to override the target ifindex).
> > If adding a hook at __sock_create, I think having a new
> > CGROUP_POST_SOCK_CREATE may be better instead of messing with the
> > current inet assumption in CGROUP_'INET'_SOCK_CREATE.  Running all
> > CGROUP_*_SOCK_CREATE at __sock_create could be a nice cleanup such that
> > a few lines can be removed from inet[6]_create but an extra family check
> > will be needed.
> 
> SG. Hopefully I can at least reuse the existing progtype and just introduce
> a new hook point in __sock_create.
> 
> > The bpf prog has both bpf_sock->family and bpf_sock->protocol field to
> > check with, so it should be able to decide the sk type if it is run
> > at __sock_create.  All bpf_sock fields should make sense or at least 0
> > to all families (?), please check.
> 
> Yeah, that's what I think as well, existing bpf_sock should work
> as is (it might show empty ip/port for af_packet), but I'll verify
> that.
> 
> > For af_packet bind, the ip[46]/port probably won't be useful?  What
> > will the bpf prog need?
> 
> For AF_PACKET bind we would need new ifindex and new protocol. I was
> thinking maybe new bpf_packet_sock type+helper to convert from bpf_sock
> is the way to go here.
Right, should follow the existing bpf_skc_to_*() and
RET_PTR_TO_BTF_ID_OR_NULL pattern to return a 'struct packet_sock *'.
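
To illustrate the shape of that pattern, here is a minimal userspace model,
not kernel code. The struct and function names (model_sock,
model_packet_sock, model_skc_to_packet_sock) are hypothetical stand-ins, and
the structs are heavily simplified. The point is only that the cast returns
NULL unless the family matches, which is what an *_OR_NULL return type would
force the bpf prog to check before dereferencing:

```c
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not the kernel's
 * struct sock / struct packet_sock. */
#define MODEL_AF_INET   2
#define MODEL_AF_PACKET 17

struct model_sock {
	unsigned short family;
};

struct model_packet_sock {
	struct model_sock sk;	/* must stay first so the cast below is valid */
	int ifindex;		/* bound interface, 0 if unbound */
	unsigned short num;	/* bound protocol, 0 if none */
};

/* Hypothetical analogue of a bpf_skc_to_*() style helper:
 * returns the wider type only when the family matches, else NULL. */
static struct model_packet_sock *
model_skc_to_packet_sock(struct model_sock *sk)
{
	if (!sk || sk->family != MODEL_AF_PACKET)
		return NULL;
	return (struct model_packet_sock *)sk;
}
```

A caller would test the result for NULL and only then read ifindex/num.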

> For AF_PACKET bind we actually have another use-case where I think
> generic bind hook might be helpful. I have a working prototype with
> fmod_ret, but feels like per-cgroup hook is better (lets me access
> cgroup local storage):
> We'd like to have a cgroup-enforced TX-only form of raw socket (grant
> CAP_NET_RAW+restrict RX path). For AF_INET{,6} it means allow only
> socket(AF_INET{,6}, SOCK_RAW, IPPROTO_RAW); that's easily enforceable with
> the current hooks. For AF_PACKET it means allow only
> socket(AF_PACKET, SOCK_RAW, 0 == ETH_P_NONE) and prohibit bind to protocol
> != 0.
Meaning a generic hook for bind also?
hmm... yeah, instead of adding a new one for AF_PACKET, adding a generic
one may be more useful.
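
For reference, the TX-only policy quoted above boils down to something like
the following. This is a plain C sketch of the decision logic only, not a
bpf prog and not tied to any existing hook. The function names and the TXO_*
constants are made up for illustration (the values mirror the usual Linux
ones), and leaving non-raw inet sockets and other families alone is my
assumption, not something stated in the thread:

```c
#include <stdbool.h>

/* Local copies of the usual Linux values, so the sketch builds anywhere. */
#define TXO_AF_INET     2
#define TXO_AF_INET6    10
#define TXO_AF_PACKET   17
#define TXO_SOCK_RAW    3
#define TXO_IPPROTO_RAW 255

/* Create-time check: would socket(family, type, protocol) be allowed? */
static bool tx_only_allow_create(int family, int type, int protocol)
{
	switch (family) {
	case TXO_AF_INET:
	case TXO_AF_INET6:
		/* Assumption: non-raw inet sockets are out of scope. */
		if (type != TXO_SOCK_RAW)
			return true;
		return protocol == TXO_IPPROTO_RAW;
	case TXO_AF_PACKET:
		/* Only socket(AF_PACKET, SOCK_RAW, 0) is allowed. */
		return type == TXO_SOCK_RAW && protocol == 0;
	default:
		/* Assumption: other families are not restricted here. */
		return true;
	}
}

/* Bind-time check for AF_PACKET: prohibit bind to protocol != 0. */
static bool tx_only_allow_packet_bind(int protocol)
{
	return protocol == 0;
}
```

The bind-time half is the part that has no per-cgroup hook today, which is
where a generic bind hook would come in.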
Just noticed there are INET4_POST_BIND and INET6_POST_BIND
instead of one INET_POST_BIND.  It may be worth checking if that split was
due to some limitation in the sock.  From a quick look it seems fine: the
addrs in the sock do not overlap in a union.