Re: bpf_timer memory utilization

On Mon, Mar 20, 2023 at 10:16 AM Chris Lai <chrlai@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> In my setup, both (LRU and HASH) are preallocated.
> Kernel version: Linux version 5.17.12-300.fc36.x86_64
> I am doing a load test via a load generator (Spirent) to a DUT appliance.

The kernel is a bit old.
Can you try to repro on the latest kernel?
Not saying that it won't be fixed in the old kernel,
but it will help a lot if it's still there in the latest.

> Code snippet
>
> #define MAXIMUM_CONNECTIONS 3000000
> #define CALL_BACK_TIME 60000000000
>
> struct ip_flow_tuple {
> ...
> };
>
> struct ip_flow_entry {
> ...
> struct bpf_timer timer;
> };
>
> // HASH
> struct {
> __uint(type, BPF_MAP_TYPE_HASH);
> __uint(max_entries, MAXIMUM_CONNECTIONS);
> __type(key, struct ip_flow_tuple);
> __type(value, struct ip_flow_entry);
> } flow_table __attribute__((section(".maps"), used));
>
> // LRU
> struct {
> __uint(type, BPF_MAP_TYPE_LRU_HASH);
> __uint(max_entries, MAXIMUM_CONNECTIONS);
> __type(key, struct ip_flow_tuple);
> __type(value, struct ip_flow_entry);
> } flow_table __attribute__((section(".maps"), used));

Since it's a preallocated hash map, it behaves
exactly the same way as LRU from timer destruction pov.

Could you create a selftest out of your program?
It doesn't have to be XDP or run real traffic.
The test can randomly populate a map in a loop.
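
If it's easiest to keep your prog as-is, you can drive the existing XDP
prog with BPF_PROG_TEST_RUN and synthetic packets. Something along these
lines for the user-space driver (rough sketch, untested; the object file
name and packet contents are placeholders - fill in a valid eth/ipv4
header for whatever your parser expects):

#include <string.h>
#include <bpf/libbpf.h>
#include <bpf/bpf.h>

int main(void)
{
	struct bpf_object *obj = bpf_object__open_file("flow_timer.bpf.o", NULL);
	struct bpf_program *prog;
	unsigned char pkt[64] = {};	/* dummy eth + ipv4 + udp frame */
	unsigned int i;
	int prog_fd;

	if (!obj || bpf_object__load(obj))
		return 1;
	prog = bpf_object__find_program_by_name(obj, "testMapTimer");
	if (!prog)
		return 1;
	prog_fd = bpf_program__fd(prog);

	for (i = 0; i < 16 * 1024 * 1024; i++) {
		LIBBPF_OPTS(bpf_test_run_opts, opts,
			.data_in = pkt,
			.data_size_in = sizeof(pkt),
			.repeat = 1,
		);

		/* vary the IPv4 saddr (offset 14 + 12) so every run
		 * creates a new flow and arms a new timer
		 */
		memcpy(&pkt[14 + 12], &i, sizeof(i));
		if (bpf_prog_test_run_opts(prog_fd, &opts))
			return 1;
	}
	return 0;
}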

> SEC("xdp")
> int testMapTimer(struct xdp_md *ctx) {
> ...
> struct ip_flow_tuple in_ip_flow_tuple = {
>    ...
> };
>
> struct ip_flow_entry *in_ip_flow_entry =
> bpf_map_lookup_elem(&flow_table, &in_ip_flow_tuple);
> if (in_ip_flow_entry == NULL) {
>     struct ip_flow_entry in_ip_flow_entry_new = {};
>     bpf_map_update_elem(&flow_table, &in_ip_flow_tuple,
> &in_ip_flow_entry_new, BPF_ANY);
>     struct ip_flow_entry *flow_entry_value =
> bpf_map_lookup_elem(&flow_table, &in_ip_flow_tuple);
>
>     if (flow_entry_value) {
>         bpf_timer_init(&flow_entry_value->timer, &flow_table, 0);
>         bpf_timer_set_callback(&flow_entry_value->timer, myTimerCallback);
>         bpf_timer_start(&flow_entry_value->timer, (__u64)CALL_BACK_TIME, 0);
>     }

Please check return values.
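
Roughly like this (just a sketch; XDP_PASS stands in for whatever your
elided return path actually does):

    struct ip_flow_entry in_ip_flow_entry_new = {};
    struct ip_flow_entry *flow_entry_value;
    long err;

    err = bpf_map_update_elem(&flow_table, &in_ip_flow_tuple,
                              &in_ip_flow_entry_new, BPF_ANY);
    if (err) {
        /* e.g. bump an error counter; -E2BIG here means the map is full */
        return XDP_PASS;
    }

    flow_entry_value = bpf_map_lookup_elem(&flow_table, &in_ip_flow_tuple);
    if (!flow_entry_value)
        return XDP_PASS;

    err = bpf_timer_init(&flow_entry_value->timer, &flow_table, 0);
    if (err) {
        /* -EBUSY means another CPU already initialized this timer */
        return XDP_PASS;
    }
    if (bpf_timer_set_callback(&flow_entry_value->timer, myTimerCallback))
        return XDP_PASS;
    if (bpf_timer_start(&flow_entry_value->timer, CALL_BACK_TIME, 0))
        return XDP_PASS;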

> }
> ...
>
> }
>
> On Fri, Mar 17, 2023 at 6:41 PM Hou Tao <houtao1@xxxxxxxxxx> wrote:
> >
> >
> >
> > On 3/18/2023 12:40 AM, Chris Lai wrote:
> > > Might this be a bug when using bpf_timer on a hashmap?
> > > With the same setup using bpf_timer but with an LRU hashmap, the memory
> > > usage is much better; see the following:
> > >
> > > With the LRU hashmap:
> > > 16M capacity, 1 minute bpf_timer callback/cleanup (pre-allocation ~5G),
> > > memory usage peaked at ~7G (flat, does not fluctuate - unlike the hashmap)
> > > 32M capacity, 1 minute bpf_timer callback/cleanup (pre-allocation ~8G),
> > > memory usage peaked at ~12G (flat, does not fluctuate - unlike the hashmap)

sizeof(struct bpf_hrtimer) == 96,
so 16M timers is ~1.5G.
Yet memory for LRU is 7G and for hash is 34G... it all sounds odd.
Since, according to your code snippets, both maps are preallocated,
memory consumption should be the same.
Even if all 16M timers are active, that is still only ~1.5G, which cannot
explain the difference between 7G and 34G.
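(For reference: 16 * 2^20 timers * 96 bytes = 1,610,612,736 bytes, i.e.
~1.5 GiB, assuming 16M means 2^24 entries.)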

My current theory is that something is wrong with the htab extra_elems logic.

Do you have another part of the code where you do:
bpf_map_update_elem(&flow_table, &in_ip_flow_tuple, ...);
to update an _existing_ element in place?

Do you know that bpf_map_lookup_elem() returns a pointer to the value,
and any changes made through that pointer are written in place?
There is no need to call update_elem() after you have changed the value.
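
I.e. something like this is enough (just a sketch; the field names are
made up, since ip_flow_entry was elided in your snippet):

    struct ip_flow_entry *val;

    val = bpf_map_lookup_elem(&flow_table, &in_ip_flow_tuple);
    if (val) {
        /* Writes through the returned pointer land directly in the
         * map element; no bpf_map_update_elem() needed afterwards.
         */
        val->pkt_count += 1;                    /* hypothetical field */
        val->last_seen_ns = bpf_ktime_get_ns(); /* hypothetical field */
    }

On a preallocated htab, an update_elem() of an existing key goes through
the extra_elems path anyway (grab a spare element, copy the value in,
release the old one), which is exactly the code I'm suspecting above.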



