On Tue, Nov 12, 2019 at 11:17 AM Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx> wrote: > > On Mon, 11 Nov 2019 18:06:42 -0800, Andrii Nakryiko wrote: > > So let's say if sizeof(struct bpf_array) is 300, then I'd have to either: > > > > - somehow make sure that I allocate 4k (for data) + 300 (for struct > > bpf_array) in such a way that those 4k of data are 4k-aligned. Is > > there any way to do that? > > - assuming there isn't, then another way would be to allocate entire > > 4k page for struct bpf_array itself, but put it at the end of that > > page, so that 4k of data is 4k-aligned. While wasteful, the bigger > > problem is that pointer to bpf_array is not a pointer to allocated > > memory anymore, so we'd need to remember that and adjust address > > before calling vfree(). > > > > Were you suggesting #2 as a solution? Or am I missing some other way to do this? > > I am suggesting #2, that's the way to do it in the kernel. So I'm concerned about this approach, because it feels like a bunch of unnecessarily wasted memory. While there is no way around doing round_up(PAGE_SIZE) for data itself, it certainly is not necessary to waste almost entire page for struct bpf_array. And given this is going to be used for BPF maps backing global variables, there most probably will be at least 3 (.data, .bss, .rodata) per each program, and could be more. Also, while on x86_64 page is 4k, on other architectures it can be up to 64KB, so this seems wasteful. What's your concern exactly with the way it's implemented in this patch? > > You could make the assumption that if you're allocating memory aligned > to PAGE_SIZE, the address for vfree() is: > > addr = map; > if (map->flags & MMAPABLE) > addr = round_down(addr, PAGE_SIZE); > vfree(addr); > > Just make a note of the fact that we depend on vmalloc()s alignment in > bpf_map_area_alloc(). will add comment for that