Re: [PATCH bpf-next v2 00/14] bpf qdisc

On 12/20/24 11:55 AM, Amery Hung wrote:
The implementation of bpf_fq is fairly complex and differs slightly from
fq, so later we only compare the two fifo qdiscs. bpf_fq implements the
same fair queueing algorithm as fq, but without flow hash collision
avoidance and garbage collection of inactive flows. bpf_fifo uses a single

For hash collision, I think you meant >1 tcp_socks having the same hash in patch 14? This probably could be detected by adding the sk pointer value to the bpf-map key? Not asking to change patch 14, though.

For garbage collection, I think patch 14 has it but yes it is iterating the bpf map, so not as quick as doing gc while searching for the sk in the rbtree. I think the only missing piece is being able to iterate the bpf_rb_root, i.e. able to directly search left and right of a bpf_rb_node.
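To illustrate the missing piece: a sketch of what an in-tree lookup (with gc piggy-backed on it) could look like if the rbtree could be walked directly. bpf_rbtree_root()/bpf_rbtree_left()/bpf_rbtree_right() are hypothetical here, i.e. exactly the kfuncs this mail says do not exist yet; only bpf_rbtree_first()/bpf_rbtree_add()/bpf_rbtree_remove() are available today. Struct and field names are illustrative.

```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_experimental.h>

struct flow {
	struct bpf_rb_node node;
	__u64 sk;		/* key: socket pointer value */
	__u64 last_seen;	/* gc could check this while descending */
};

private(FLOWS) struct bpf_spin_lock flows_lock;
private(FLOWS) struct bpf_rb_root flows __contains(flow, node);

/* Binary search by key, as an alternative to a separate gc pass that
 * iterates a whole bpf map. */
static struct flow *lookup_flow(__u64 sk)
{
	struct bpf_rb_node *cur;
	struct flow *found = NULL;

	bpf_spin_lock(&flows_lock);
	cur = bpf_rbtree_root(&flows);		/* hypothetical */
	while (cur) {
		struct flow *f = container_of(cur, struct flow, node);

		if (sk == f->sk) {
			found = f;
			break;
		}
		/* hypothetical: descend by key order; stale flows seen
		 * along the way could be collected here */
		cur = sk < f->sk ? bpf_rbtree_left(&flows, cur)
				 : bpf_rbtree_right(&flows, cur);
	}
	bpf_spin_unlock(&flows_lock);
	return found;
}
```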

bpf_list as a queue instead of three queues for different priorities in
pfifo_fast. The time complexity of the two fifos, however, should be
similar since the queue selection time is negligible.
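A single-queue fifo of this shape can be sketched with the existing list kfuncs (bpf_obj_new, bpf_list_push_back, bpf_list_pop_front, bpf_obj_drop); the node struct and field names below are illustrative, not the selftest source:

```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_experimental.h>

struct fifo_node {
	struct bpf_list_node node;
	__u64 cookie;	/* stand-in for per-packet state */
};

private(FIFO) struct bpf_spin_lock fifo_lock;
private(FIFO) struct bpf_list_head fifo __contains(fifo_node, node);

/* Enqueue: allocate a node and append it. No priority-band selection
 * is needed, unlike pfifo_fast's three queues. */
static int fifo_enqueue(__u64 cookie)
{
	struct fifo_node *n = bpf_obj_new(typeof(*n));

	if (!n)
		return -1;
	n->cookie = cookie;
	bpf_spin_lock(&fifo_lock);
	bpf_list_push_back(&fifo, &n->node);
	bpf_spin_unlock(&fifo_lock);
	return 0;
}

/* Dequeue: pop the oldest node, read it, and free it. */
static __u64 fifo_dequeue(void)
{
	struct bpf_list_node *ln;
	struct fifo_node *n;
	__u64 cookie;

	bpf_spin_lock(&fifo_lock);
	ln = bpf_list_pop_front(&fifo);
	bpf_spin_unlock(&fifo_lock);
	if (!ln)
		return 0;
	n = container_of(ln, struct fifo_node, node);
	cookie = n->cookie;
	bpf_obj_drop(n);
	return cookie;
}
```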

Test setup:

     client -> qdisc ------------->  server
     ~~~~~~~~~~~~~~~                 ~~~~~~
     nested VM1 @ DC1               VM2 @ DC2

Throughput: iperf3 -t 600, 5 times

       Qdisc        Average (GBits/sec)
     ----------     -------------------
     pfifo_fast       12.52 ± 0.26
     bpf_fifo         11.72 ± 0.32
     fq               10.24 ± 0.13
     bpf_fq           11.92 ± 0.64

Latency: sockperf pp --tcp -t 600, 5 times

       Qdisc        Average (usec)
     ----------     --------------
     pfifo_fast      244.58 ± 7.93
     bpf_fifo        244.92 ± 15.22
     fq              234.30 ± 19.25
     bpf_fq          221.34 ± 10.76

Looking at the two fifo qdiscs, the 6.4% drop in throughput in the bpf
implementation is consistent with previous observations (the v8 throughput
test on a loopback device). This should be mitigable in the future by
supporting adding skbs to a bpf_list or bpf_rbtree directly.

* Clean up skb in bpf qdisc during reset *

The current implementation relies on bpf qdisc implementors to correctly
release skbs in queues (bpf graphs or maps) in .reset, which might not be
a safe thing to do. The solution, as Martin has suggested, would be
supporting private data in struct_ops. This can also help simplify the
implementation of qdiscs that work with mq. For example, the qdiscs in the
selftests mostly use global data, so even if users add multiple qdisc
instances under mq, the instances would still share the same queue.

Although not as nice as priv_data, I think an mq setup with a dedicated queue per instance can be done with bpf map-in-map.
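A sketch of that map-in-map setup (map names, sizes, and the value type are illustrative): an outer array keyed by queue index, with each slot holding an inner map for that mq child's own state, so instances would not share one global queue.

```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

/* Inner map template: per-instance queue state. */
struct inner_queue_state {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, __u64);	/* placeholder for per-queue state */
};

/* Outer map: one inner-map slot per tx queue. */
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
	__uint(max_entries, 16);
	__type(key, __u32);
	__array(values, struct inner_queue_state);
} per_queue SEC(".maps");

/* Each qdisc instance looks up its own inner map by queue index
 * instead of touching shared global data. */
static void *get_queue_state(__u32 qidx)
{
	return bpf_map_lookup_elem(&per_queue, &qidx);
}
```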

For the cleanup part, it is similar to how a bpf kptr is cleaned up: either the bpf program frees it, or the bpf infra will eventually clean it up during bpf map destruction.

For priv_data, I think it could be a useful addition to bpf_struct_ops, meaning it should also work for struct_ops other than Qdisc_ops. Then all destruction and freeing could be done more automatically and seamlessly.

imo, the above improvements can be iterated later on top of the core pieces of this set.




