On 16/06/20 18:07, David Ahern wrote:
> On 6/16/20 10:00 AM, Jesper Dangaard Brouer wrote:
>> On Wed, 10 Jun 2020 23:09:34 +0200
>> Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>>> Federico Parola <fede.parola@xxxxxxxxxx> writes:
>>>> On 06/06/20 01:34, David Ahern wrote:
>>>>> On 6/4/20 7:30 AM, Federico Parola wrote:
>>>>>> Hello everybody,
>>>>>> I'm implementing a token bucket algorithm to rate-limit traffic, and I need the timestamp of packets to update the bucket. To get this information I'm using the bpf_ktime_get_ns() helper, but I've discovered it has a non-negligible impact on performance. I've seen there is work in progress to make hardware timestamps available to XDP programs, but I don't know if this feature is already available. Is there a faster way to retrieve this information?
>>>>>> Thanks for your attention.
>>>>> bpf_ktime_get_ns should be fairly light. What kind of performance loss are you seeing with it?
>>>> I've run some tests on a program forwarding packets between two interfaces and applying rate limiting: using bpf_ktime_get_ns() I can process up to 3.84 Mpps; if I replace the helper with a lookup on a map containing the current timestamp, updated in user space, I go up to 4.48 Mpps.
>> ((1/3.84*1000)-(1/4.48*1000) = 37.20 ns overhead)
> I had the same math yesterday and did some tests as well. I am really surprised the timestamp is that high.
Do your tests show a similar overhead?
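
The 37.20 ns figure quoted above comes from converting each packet rate into a per-packet time budget and taking the difference. A minimal C sketch of that arithmetic follows; the function names are illustrative and do not come from the thread.

```c
#include <assert.h>

/* Time available per packet, in nanoseconds, at a rate given in Mpps.
 * 1e9 ns per second divided by (mpps * 1e6) packets per second. */
double ns_per_packet(double mpps)
{
    assert(mpps > 0.0);
    return 1000.0 / mpps;
}

/* Extra per-packet cost implied by dropping from fast_mpps to slow_mpps. */
double overhead_ns(double slow_mpps, double fast_mpps)
{
    return ns_per_packet(slow_mpps) - ns_per_packet(fast_mpps);
}
```

With the rates reported in the thread, `overhead_ns(3.84, 4.48)` is the difference between roughly 260.4 ns and 223.2 ns per packet.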
>> I was about to suggest doing something close to this. That is, only call bpf_ktime_get_ns() once per NAPI poll cycle and store the timestamp in a map, if you don't need super-high per-packet precision. You can even use a per-CPU map to store the info (to avoid cross-CPU cache traffic), because softirq will keep RX processing pinned to a CPU.
>>
>> It sounds like you update the timestamp from userspace, is that true? (Quote: "current timestamp updated in user space")
>>
>> I would suggest that you can leverage the softirq tracepoints (use SEC("raw_tracepoint/") for low overhead). E.g. irq:softirq_entry (see where the kernel calls trace_softirq_entry) to update the map once per NAPI/net_rx_action. I have a bpftrace-based tool [1] that measures network-softirq latency, e.g. the time it takes from "softirq_raise" until it is run at "softirq_entry". You can leverage ideas from that script, like 'vec == 3' is NET_RX_SOFTIRQ to limit this to networking.
>>
>> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/latency/softirq_net_latency.bt
>
> I have code that measures the overhead of net_rx_action:
> https://github.com/dsahern/bpf-progs/blob/master/ksrc/net_rx_action.c
> This use case would just need the enter probe.
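
The caching idea suggested above can be modelled in plain user-space C to make the control flow concrete. This is only a sketch under stated assumptions: the array stands in for a per-CPU BPF map, `on_softirq_entry()` stands in for a program attached to the softirq_entry tracepoint, and `MAX_CPUS` and all function names are hypothetical. It is not a loadable BPF program.

```c
#include <assert.h>
#include <stdint.h>

#define NET_RX_SOFTIRQ 3   /* softirq vector for network receive ('vec == 3') */
#define MAX_CPUS 64        /* hypothetical bound, only for this model */

/* Stand-in for a per-CPU map with a single slot: one timestamp per CPU. */
uint64_t cached_ts[MAX_CPUS];

/* Models the hook on softirq_entry: refresh the timestamp once per
 * network RX softirq run and ignore every other softirq vector. */
void on_softirq_entry(unsigned int cpu, unsigned int vec, uint64_t now_ns)
{
    assert(cpu < MAX_CPUS);
    if (vec != NET_RX_SOFTIRQ)
        return;
    cached_ts[cpu] = now_ns;
}

/* Models the per-packet path: read the cached value instead of the clock. */
uint64_t packet_timestamp(unsigned int cpu)
{
    assert(cpu < MAX_CPUS);
    return cached_ts[cpu];
}
```

Because RX processing stays on one CPU, each CPU only ever reads the slot it wrote, so the per-packet path needs no locking. Every packet handled in the same poll cycle sees the same timestamp.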
Thanks for your suggestion. Currently I have a thread in user space that updates a PERCPU_ARRAY map with the current timestamp every millisecond, and the precision seems to be good enough.
I'll check your solution as well.
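
To illustrate why a millisecond-granular timestamp can be enough for rate limiting, here is a minimal token bucket sketch in plain C that takes the timestamp as an argument, so it works the same with a cached or a freshly read clock. The struct layout, the names, and the choice to credit at most one second of idle time are assumptions made for this example, not details from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct token_bucket {
    uint64_t rate_pps;  /* tokens (packets) added per second */
    uint64_t burst;     /* maximum number of stored tokens */
    uint64_t tokens;    /* tokens currently available */
    uint64_t last_ns;   /* time up to which tokens have been credited */
};

/* Returns true if a packet seen at now_ns may pass, false if it must drop. */
bool tb_allow(struct token_bucket *tb, uint64_t now_ns)
{
    assert(tb->rate_pps > 0);
    if (now_ns > tb->last_ns) {
        uint64_t elapsed = now_ns - tb->last_ns;
        /* Assumption: credit at most one second of idle time, which also
         * keeps elapsed * rate_pps well inside 64 bits. */
        if (elapsed > NSEC_PER_SEC) {
            tb->last_ns = now_ns - NSEC_PER_SEC;
            elapsed = NSEC_PER_SEC;
        }
        uint64_t add = elapsed * tb->rate_pps / NSEC_PER_SEC;
        if (add > 0) {
            /* Advance only by the time the whole tokens account for,
             * so fractional tokens carry over to the next call. */
            tb->last_ns += add * NSEC_PER_SEC / tb->rate_pps;
            tb->tokens += add;
            if (tb->tokens > tb->burst)
                tb->tokens = tb->burst;
        }
    }
    if (tb->tokens == 0)
        return false;
    tb->tokens--;
    return true;
}
```

With a timestamp that only advances every millisecond, tokens arrive in batches of rate × 1 ms, so `burst` has to be at least that large or part of each batch is discarded. At 4.48 Mpps that is 4480 packets per update.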
>>> Can you share more details on the platform you're running this on? I.e., CPU and chipset details, network driver, etc.
>> Yes, please. I plan to work on an XDP feature for extracting hardware offload info from the driver's descriptor, like timestamps, VLAN, RSS hash, checksum, etc. If you tell me what NIC driver you are using, I could make sure to include that in the supported drivers.
I ran the test on an Intel Xeon Gold 5120 @ 2.60GHz on a single core, using a dual-port 40 GbE Intel XL710 NIC (i40e driver), forwarding 64-byte frames between the ports.
Thanks for your help.
Federico