On Sun, Nov 13, 2022 at 1:59 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
[...]
> >
> > There will be some memory waste in such cases. But it will get better with:
> > 1) With 4/5 and 5/5, BPF programs will share this 2MB page with kernel .text
> > section (_stext to _etext);
> > 2) modules, ftrace, kprobe will also share this 2MB page;
>
> Unless I'm missing something, what will be shared is the virtual space, the
> actual physical pages will be still allocated the same way as any vmalloc()
> allocation.

What do you mean by shared virtual space, but the actual physical pages
are still the same? This is a 2MB page shared by BPF programs, modules,
etc., so it is 2MB virtual address space, and it is also 1x 2MB physical
huge page. For example, we will allocate one 2MB page, and put 1MB of
module text, 512kB of BPF programs, some ftrace trampolines in this page.

>
> > 3) There are bigger BPF programs in many use cases.
>
> With statistics you provided above one will need hundreds if not thousands
> of BPF programs to fill a 2M page. I didn't do the math, but it seems that
> to see memory savings there should be several hundreds of BPF programs.

powerpc is trying to use bpf_prog_pack [1]. IIUC, execmem_alloc() should
allocate 512kB pages for powerpc.

This is already yielding memory savings: on a random system in our fleet
(x86_64), BPF programs use 140x 4kB pages (or 560kB) without
execmem_alloc(). They will fit in 200kB with execmem_alloc(), and we can
use the other 300kB+ for modules, ftrace, etc.

OTOH, 512kB or even 2MB is really small for module systems, but iTLB is
always a limited resource.

Thanks,
Song

[1] https://lore.kernel.org/bpf/20221110184303.393179-1-hbathini@xxxxxxxxxxxxx/
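
For anyone who wants to check the arithmetic behind the numbers above, here is a rough sketch. It only redoes the sums quoted in this mail and does not model what execmem_alloc() actually does. The 200kB packed size is the approximate figure from the mail. Treating the shared region as a single 512kB page is an assumption taken from the "other 300kB+" remark, and the 2MB case is added only for comparison. All variable and function names are made up for illustration.

```python
KB = 1024

def pages_needed(size_bytes, page_size):
    """Number of pages of page_size needed to hold size_bytes (rounded up)."""
    return -(-size_bytes // page_size)

# Figures quoted in the mail for one x86_64 system.
small_page = 4 * KB
bpf_pages = 140                       # 4kB pages used without execmem_alloc()
unpacked = bpf_pages * small_page     # total footprint, one program per page set
packed = 200 * KB                     # approximate packed size, from the mail

# Assumption: the packed programs live in one shared huge page.
for huge_page in (512 * KB, 2 * 1024 * KB):
    left_over = huge_page - packed    # room left for modules, ftrace, etc.
    print(f"{huge_page // KB}kB page: BPF uses {packed // KB}kB, "
          f"{left_over // KB}kB left to share")

print(f"unpacked: {unpacked // KB}kB, saved by packing: "
      f"{(unpacked - packed) // KB}kB")
print(f"4kB pages needed for the packed size alone: "
      f"{pages_needed(packed, small_page)}")
```

The point of the comparison is that the saving comes from packing many small programs into shared pages instead of rounding each one up to whole 4kB pages. The huge page itself only pays off once other users (modules, ftrace, kprobes) fill the remainder.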