On Tue, Nov 12, 2019 at 2:38 PM Jakub Kicinski <jakub.kicinski@xxxxxxxxxxxxx> wrote: > > On Tue, 12 Nov 2019 14:03:50 -0800, Andrii Nakryiko wrote: > > On Tue, Nov 12, 2019 at 11:17 AM Jakub Kicinski wrote: > > > On Mon, 11 Nov 2019 18:06:42 -0800, Andrii Nakryiko wrote: > > > > So let's say if sizeof(struct bpf_array) is 300, then I'd have to either: > > > > > > > > - somehow make sure that I allocate 4k (for data) + 300 (for struct > > > > bpf_array) in such a way that those 4k of data are 4k-aligned. Is > > > > there any way to do that? > > > > - assuming there isn't, then another way would be to allocate entire > > > > 4k page for struct bpf_array itself, but put it at the end of that > > > > page, so that 4k of data is 4k-aligned. While wasteful, the bigger > > > > problem is that pointer to bpf_array is not a pointer to allocated > > > > memory anymore, so we'd need to remember that and adjust address > > > > before calling vfree(). > > > > > > > > Were you suggesting #2 as a solution? Or am I missing some other way to do this? > > > > > > I am suggesting #2, that's the way to do it in the kernel. > > > > So I'm concerned about this approach, because it feels like a bunch of > > unnecessarily wasted memory. While there is no way around doing > > round_up(PAGE_SIZE) for data itself, it certainly is not necessary to > > waste almost entire page for struct bpf_array. And given this is going > > to be used for BPF maps backing global variables, there most probably > > will be at least 3 (.data, .bss, .rodata) per each program, and could > > be more. Also, while on x86_64 page is 4k, on other architectures it > > can be up to 64KB, so this seems wasteful. > > With the extra mutex and int you grew struct bpf_map from 192B to 256B, > that's for every map on the system, unconditionally, and array map has > an extra pointer even if it doesn't need it. > > Increasing "wasted" space when an opt-in feature is selected doesn't > seem all that terrible to me, especially that the overhead of aligning > up map size to page size is already necessary. Well, I've been talking about one more extra page for struct bpf_array itself, on top of what we already potentially waste for mmap()'ing array data. But I went ahead and posted v3 with layout we discussed here, aligning array->value on page boundary. Let's see if you like it better. > > > What's your concern exactly with the way it's implemented in this patch? > > Judging by other threads we seem to care about performance of BPF > (rightly so). Doing an extra pointer deref for every static data access > seems like an obvious waste. > > But then again, it's just an obvious suggestion, take it or leave it..