On Thu, Aug 10, 2023 at 12:41:01PM -0700, Martin KaFai Lau wrote:
> > > > > First, I'm a bit concerned about implicit restrictions we apply to bpf programs
> > > > > which will be executed potentially thousands times under a very heavy memory
> > > > > pressure. We will need to make sure that they don't allocate (much) memory, don't
> > > > > take any locks which might deadlock with other memory allocations etc.
> > > > > It will potentially require hard restrictions on what these programs can and can't
> > > > > do and this is something that the bpf community will have to maintain long-term.
> > > >
> > > > Right, BPF callbacks operating under OOM situations will be really
> > > > constrained but this is more or less by definition. Isn't it?
> > >
> > > What do you mean?
> >
> > Callbacks cannot depend on any direct or indirect memory allocations.
> > Dependencies on any sleeping locks (again directly or indirectly) is not
> > allowed just to name the most important ones.
> >
> > > In general, the bpf community is trying to make it as generic as possible and
> > > adding new and new features. Bpf programs are not as constrained as they were
> > > when it's all started.
>
> bpf supports different running context. For example, only non-sleepable bpf
> prog is allowed to run at the NIC driver. A sleepable bpf prog is only
> allowed to run at some bpf_lsm hooks that is known to be safe to call
> blocking bpf-helper/kfunc. From the bpf side, it ensures a non-sleepable bpf
> prog cannot do things that may block.

Yeah, you're right: non-sleepable bpf should be ok here.

>
> fwiw, Dave has recently proposed something for iterating the task vma
> (https://lore.kernel.org/bpf/20230810183513.684836-4-davemarchevsky@xxxxxx/).
> Potentially, a similar iterator can be created for a bpf program to iterate
> cgroups and tasks.

Yes, it looks like a much better approach rather than adding a hook into the existing
iteration over all tasks.

Thanks!