On 4/19/21 2:18 PM, Kumar Kartikeya Dwivedi wrote:
This adds functions that wrap the netlink API used for adding, manipulating, and removing traffic control filters. These functions operate directly on the loaded prog's fd, and return a handle to the filter using an out parameter named id. The basic featureset is covered to allow for attaching, manipulation of properties, and removal of filters. Some additional features like TCA_BPF_POLICE and TCA_RATE for tc_cls have been omitted. These can added on top later by extending the bpf_tc_cls_opts struct. Support for binding actions directly to a classifier by passing them in during filter creation has also been omitted for now. These actions have an auto clean up property because their lifetime is bound to the filter they are attached to. This can be added later, but was omitted for now as direct action mode is a better alternative to it, which is enabled by default. An API summary: bpf_tc_act_{attach, change, replace} may be used to attach, change, and
typo on bpf_tc_act_{...} ? ^^^
replace SCHED_CLS bpf classifier. The protocol field can be set as 0, in which case it is subsitituted as ETH_P_ALL by default.
Do you have an actual user that needs anything other than ETH_P_ALL? Why is it even needed? Why not stick to just ETH_P_ALL?
The behavior of the three functions is as follows: attach = create filter if it does not exist, fail otherwise change = change properties of the classifier of existing filter replace = create filter, and replace any existing filter
This touches on tc oddities quite a bit. Why do we need to expose them? Can't we simplify/abstract this e.g. i) create or update instance, ii) delete instance, iii) get instance ? What concrete use case do you have that you need those three above?
bpf_tc_cls_detach may be used to detach existing SCHED_CLS filter. The bpf_tc_cls_attach_id object filled in during attach, change, or replace must be passed in to the detach functions for them to remove the filter and its attached classififer correctly. bpf_tc_cls_get_info is a helper that can be used to obtain attributes for the filter and classififer. The opts structure may be used to choose the granularity of search, such that info for a specific filter corresponding to the same loaded bpf program can be obtained. By default, the first match is returned to the user. Examples: struct bpf_tc_cls_attach_id id = {}; struct bpf_object *obj; struct bpf_program *p; int fd, r; obj = bpf_object_open("foo.o"); if (IS_ERR_OR_NULL(obj)) return PTR_ERR(obj); p = bpf_object__find_program_by_title(obj, "classifier"); if (IS_ERR_OR_NULL(p)) return PTR_ERR(p); if (bpf_object__load(obj) < 0) return -1; fd = bpf_program__fd(p); r = bpf_tc_cls_attach(fd, if_nametoindex("lo"), BPF_TC_CLSACT_INGRESS, NULL, &id); if (r < 0) return r; ... which is roughly equivalent to (after clsact qdisc setup): # tc filter add dev lo ingress bpf obj foo.o sec classifier da ... as direct action mode is always enabled. If a user wishes to modify existing options on an attached classifier, bpf_tc_cls_change API may be used. Only parameters class_id can be modified, the rest are filled in to identify the correct filter. protocol can be left out if it was not chosen explicitly (defaulting to ETH_P_ALL). Example: /* Optional parameters necessary to select the right filter */ DECLARE_LIBBPF_OPTS(bpf_tc_cls_opts, opts, .handle = id.handle, .priority = id.priority, .chain_index = id.chain_index)
Why do we need chain_index as part of the basic API?
opts.class_id = TC_H_MAKE(1UL << 16, 12); r = bpf_tc_cls_change(fd, if_nametoindex("lo"), BPF_TC_CLSACT_INGRESS, &opts, &id);
Also, I'm not sure whether the prefix should even be named bpf_tc_cls_*() tbh, yes, despite being "low level" api. I think in the context of bpf we should stop regarding this as 'classifier' and 'action' objects since it's really just a single entity and not separate ones. It's weird enough to explain this concept to new users and if a libbpf based api could cleanly abstract it, I would be all for it. I don't think we need to map 1:1 the old tc legacy even in the low level api, tbh, as it feels backwards. I think the 'handle' & 'priority' bits are okay, but I would remove the others.
if (r < 0) return r; struct bpf_tc_cls_info info = {}; r = bpf_tc_cls_get_info(fd, if_nametoindex("lo"), BPF_TC_CLSACT_INGRESS, &opts, &info); if (r < 0) return r; assert(info.class_id == TC_H_MAKE(1UL << 16, 12)); This would be roughly equivalent to doing: # tc filter change dev lo egress prio <p> handle <h> bpf obj foo.o sec \ classifier classid 1:12
Why even bother to support !da mode, what are you trying to solve with it? I don't think official libbpf api should support something that doesn't scale.
... except a new bpf program will be loaded and replace existing one. Reviewed-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx>