On 01/17, Amery Hung wrote: > Hi, > > I am continuing the work of ebpf-based Qdisc based on Cong’s previous > RFC. The followings are some use cases of eBPF Qdisc: > > 1. Allow customizing Qdiscs in an easier way. So that people don't > have to write a complete Qdisc kernel module just to experiment > some new queuing theory. > > 2. Solve EDT's problem. EDT calcuates the "tokens" in clsact which > is before enqueue, it is impossible to adjust those "tokens" after > packets get dropped in enqueue. With eBPF Qdisc, it is easy to > be solved with a shared map between clsact and sch_bpf. > > 3. Replace qevents, as now the user gains much more control over the > skb and queues. > > 4. Provide a new way to reuse TC filters. Currently TC relies on filter > chain and block to reuse the TC filters, but they are too complicated > to understand. With eBPF helper bpf_skb_tc_classify(), we can invoke > TC filters on _any_ Qdisc (even on a different netdev) to do the > classification. > > 5. Potentially pave a way for ingress to queue packets, although > current implementation is still only for egress. > > I’ve combed through previous comments and appreciated the feedbacks. > Some major changes in this RFC is the use of kptr to skb to maintain > the validility of skb during its lifetime in the Qdisc, dropping rbtree > maps, and the inclusion of two examples. > > Some questions for discussion: > > 1. We now pass a trusted kptr of sk_buff to the program instead of > __sk_buff. This makes most helpers using __sk_buff incompatible > with eBPF qdisc. An alternative is to still use __sk_buff in the > context and use bpf_cast_to_kern_ctx() to acquire the kptr. However, > this can only be applied to enqueue program, since in dequeue program > skbs do not come from ctx but kptrs exchanged out of maps (i.e., there > is no __sk_buff). Any suggestion for making skb kptr and helper > functions compatible? > > 2. The current patchset uses netlink. Do we also want to use bpf_link > for attachment? [..] > 3. People have suggested struct_ops. We chose not to use struct_ops since > users might want to create multiple bpf qdiscs with different > implementations. Current struct_ops attachment model does not seem > to support replacing only functions of a specific instance of a module, > but I might be wrong. I still feel like it deserves at leasta try. Maybe we can find some potential path where struct_ops can allow different implementations (Martin probably has some ideas about that). I looked at the bpf qdisc itself and it doesn't really have anything complicated (besides trying to play nicely with other tc classes/actions, but I'm not sure how relevant that is). With struct_ops you can also get your (2) addressed.