On Fri, Apr 2, 2021 at 8:27 AM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > This would be fine, because it's not a fast path or anything, but right now we > return the id using the netlink response, otherwise for query we have to open > the socket, prepare the msg, send and recv again. So it's a minor optimization. > > However, there's one other problem. In an earlier version of this series, I > didn't keep the id/index out parameters (to act as handle to the newly attached > filter/action). This lead to problems on query. Suppose a user doesn't properly > fill the opts during query (e.g. in case of filters). This means the netlink > dump includes all filters matching filled in attributes. If the prog_id for all > of these is same (e.g. all have same bpf classifier prog attached to them), it > becomes impossible to determine which one is the filter user asked for. It is > not possible to enforce filling in all kinds of attributes since some can be > left out and assigned by default in the kernel (priority, chain_index etc.). So > returning the newly created filter's id turned out to be the best option. This > is also used to stash filter related information in bpf_link to properly release > it later. > > The same problem happens with actions, where we look up using the prog_id, we > multiple actions with different index can match on same prog_id. It is not > possible to determine which index corresponds to last loaded action. > > So unless there's a better idea on how to deal with this, a query API won't work > for the case where same bpf prog is attached more than once. Returning the > id/index during attach seemed better than all other options we considered. All of these things are messy because of tc legacy. bpf tried to follow tc style with cls and act distinction and it didn't quite work. cls with direct-action is the only thing that became mainstream while tc style attach wasn't really addressed. There were several incidents where tc had tens of thousands of progs attached because of this attach/query/index weirdness described above. I think the only way to address this properly is to introduce bpf_link style of attaching to tc. Such bpf_link would support ingress/egress only. direction-action will be implied. There won't be any index and query will be obvious. So I would like to propose to take this patch set a step further from what Daniel said: int bpf_tc_attach(prog_fd, ifindex, {INGRESS,EGRESS}): and make this proposed api to return FD. To detach from tc ingress/egress just close(fd). The user processes will not conflict with each other and will not accidently detach bpf program that was attached by another user process. Such api will address the existing tc query/attach/detach race race conditions. And libbpf side of support for this api will be trivial. Single bpf link_create command with ifindex and ingress|egress arguments. wdyt?