On Fri, May 13, 2022 at 12:28 PM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote: > > On 5/13/22 6:39 PM, Alexei Starovoitov wrote: > > On Fri, May 13, 2022 at 03:12:06PM +0200, Daniel Borkmann wrote: > >> > >> Thinking more about it, is there even any value for BPF_FUNC_dynptr_* for > >> fully unpriv BPF if these are rejected anyway by the spectre mitigations > >> from verifier? > > ... > >> So either for alloc, we always built-in __GFP_ZERO or bpf_dynptr_alloc() > >> helper usage should go under perfmon_capable() where it's allowed to read > >> kernel mem. > > > > dynptr should probably by cap_bpf and cap_perfmon for now. > > Otherwise we will start adding cap_perfmon checks in run-time to helpers > > which is not easy to do. Some sort of prog or user context would need > > to be passed as hidden arg into helper. That's too much hassle just > > to enable dynptr for cap_bpf only. > > > > Similar problem with gfp_account... remembering memcg and passing all > > the way to bpf_dynptr_alloc helper is not easy. And it's not clear > > which memcg to use. The one where task was that loaded that bpf prog? > > That task could have been gone and cgroup is in dying stage. > > bpf prog is executing some context and allocating memory for itself. > > Like kernel allocates memory for its needs. It doesn't feel right to > > charge prog's memcg in that case. It probably should be an explicit choice > > by bpf program author. Maybe in the future we can introduce a fake map > > for such accounting needs and bpf prog could pass a map pointer to > > bpf_dynptr_alloc. When such fake and empty map is created the memcg > > would be recorded the same way we do for existing normal maps. > > Then the helper will look like: > > bpf_dynptr_alloc(struct bpf_map *map, u32 size, u64 flags, struct bpf_dynptr *ptr) > > { > > set_active_memcg(map->memcg); > > kmalloc into dynptr; > > } > > > > Should we do this change now and allow NULL to be passed as a map ? > > Hm, this looks a bit too much like a hack, I wouldn't do that, fwiw. > > > This way the bpf prog will have a choice whether to account into memcg or not. > > Maybe it's all overkill and none of this needed? > > > > On the other side maybe map should be a mandatory argument and dynptr_alloc > > can do its own memory accounting for stats ? atomic inc and dec is probably > > an acceptable overhead? bpftool will print the dynptr allocation stats. > > All sounds nice and extra visibility is great, but the kernel code that > > allocates for the kernel doesn't use memcg. bpf progs semantically are part of > > the kernel whereas memcg is a mechanism to restrict memory that kernel > > allocated on behalf of user tasks. We abused memcg for bpf progs/maps > > to have a limit. Not clear whether we should continue doing so for dynpr_alloc > > and in the future for kptr_alloc. gfp_account adds overhead too. It's not free. > > Thoughts? > > Great question, I think the memcg is useful, just that the ownership for bpf > progs/maps has been relying on current whereas current is not a real 'owner', > just the entity which did the loading. > > Maybe we need some sort of memcg object for bpf where we can "bind" the prog > and map to it at load time, which is then different from current and can be > flexibly set, e.g. fd = open(/sys/fs/cgroup/memory/<foo>) and pass that fd to > BPF_PROG_LOAD and BPF_MAP_CREATE via bpf_attr (otherwise, if not set, then > no accounting)? > I think it would be great to have memory accounting for BPF program as a separate entity from current. BPF program is sort of like a special process w.r.t. memory that it owns. Good thing is that with bpf_run_ctx (once wired for all program types) such "ambient" entities can be easily accessed from helpers to do accounting without any verifier magic involved. > Thanks, > Daniel