On Sat, Feb 4, 2023 at 10:01 AM John Fastabend <john.fastabend@xxxxxxxxx> wrote:
>
> Yafang Shao wrote:
> > Get htab memory usage from the htab pointers we have allocated. Some
> > small pointers are ignored as their sizes are quite small compared
> > with the total size.
> >
> > The results are as follows,
> > - before this change
> > 1: hash  name count_map  flags 0x0        <<<< prealloc
> >         key 16B  value 24B  max_entries 1048576  memlock 41943040B
> > 2: hash  name count_map  flags 0x1        <<<< non prealloc, fully set
> >         key 16B  value 24B  max_entries 1048576  memlock 41943040B
> > 3: hash  name count_map  flags 0x1        <<<< non prealloc, non set
> >         key 16B  value 24B  max_entries 1048576  memlock 41943040B
> >
> > The memlock is always a fixed number whether the map is preallocated
> > or not, and regardless of how many elements have been allocated.
> >
> > - after this change
> > 1: hash  name count_map  flags 0x0        <<<< prealloc
> >         key 16B  value 24B  max_entries 1048576  memlock 109064464B
> > 2: hash  name count_map  flags 0x1        <<<< non prealloc, fully set
> >         key 16B  value 24B  max_entries 1048576  memlock 117464320B
> > 3: hash  name count_map  flags 0x1        <<<< non prealloc, non set
> >         key 16B  value 24B  max_entries 1048576  memlock 16797952B
> >
> > The memlock now reflects what the hashtab has actually allocated.
> >
> > At worst, the difference can be 10x, for example,
> > - before this change
> > 4: hash  name count_map  flags 0x0
> >         key 4B  value 4B  max_entries 1048576  memlock 8388608B
> >
> > - after this change
> > 4: hash  name count_map  flags 0x0
> >         key 4B  value 4B  max_entries 1048576  memlock 83898640B
>
> This walks the entire map and buckets to get the size? Inside a
> rcu critical section as well :/ it seems.

No, it doesn't walk the entire map and buckets; it just gets one random
element. So it isn't a problem to do it inside an RCU critical section.

> What am I missing, if you know how many elements are added (which
> you can track on map updates) how come we can't just calculate the
> memory size directly from this?

It is less accurate and harder to understand. Take the non-preallocated
percpu hashtab for example; the size can be calculated as follows (a
small self-contained sketch with assumed example values is appended at
the end of this mail),

    key_size = round_up(htab->map.key_size, 8);
    value_size = round_up(htab->map.value_size, 8);
    pcpu_meta_size = sizeof(struct llist_node) + sizeof(void *);
    usage = (value_size * num_possible_cpus() +
             pcpu_meta_size + key_size) * max_entries;

That is quite unfriendly to newcomers, and it may be error-prone.

Furthermore, it is less accurate, because the memory actually handed out
by the underlying MM subsystem does not exactly match such a formula.

Now we can get a more accurate usage with little overhead. Why not do it?

-- 
Regards
Yafang
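
For illustration only, below is a minimal userspace sketch of the manual
estimate quoted above. The ROUND_UP macro, the map parameters (key 16B,
value 24B, 1048576 entries) and the CPU count are assumptions made up
for the example; this is not the kernel implementation.

    /* Sketch of the manual per-cpu hashtab size estimate. */
    #include <stdio.h>
    #include <stdint.h>

    /* round up to a power-of-two alignment, as the kernel macro does */
    #define ROUND_UP(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

    int main(void)
    {
            /* assumed example parameters */
            uint64_t key_size = ROUND_UP(16, 8);    /* key 16B   */
            uint64_t value_size = ROUND_UP(24, 8);  /* value 24B */
            uint64_t pcpu_meta_size = 8 + 8; /* llist_node + void * on 64-bit */
            uint64_t max_entries = 1048576;
            uint64_t possible_cpus = 8;      /* assumed CPU count */

            uint64_t usage = (value_size * possible_cpus +
                              pcpu_meta_size + key_size) * max_entries;

            /* (24 * 8 + 16 + 16) * 1048576 = 234881024 bytes */
            printf("estimated usage: %llu bytes\n",
                   (unsigned long long)usage);
            return 0;
    }

Even this only gives a theoretical figure; the memory actually charged
depends on how the allocator rounds and tracks each allocation, which is
why reading the usage from the allocated objects themselves is more
accurate.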