On Fri, Aug 19, 2022 at 6:20 AM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, Aug 18, 2022 at 02:31:06PM +0000, Yafang Shao wrote:
> > After switching to memcg-based bpf memory accounting to limit the bpf
> > memory, some unexpected issues jumped out at us.
> > 1. The memory usage is not consistent between the first generation and
> > new generations.
> > 2. After the first generation is destroyed, the bpf memory can't be
> > limited if the bpf maps are not preallocated, because they will be
> > reparented.
> >
> > This patchset tries to resolve these issues by introducing an
> > independent memcg to limit the bpf memory.
>
> memcg folks would have better informed opinions but from generic cgroup pov
> I don't think this is a good direction to take. This isn't a problem limited
> to bpf progs and it doesn't make whole lot of sense to solve this for bpf.
>

This change is bpf-specific. It doesn't refactor a whole lot of things.

> We have the exact same problem for any resources which span multiple
> instances of a service including page cache, tmpfs instances and any other
> thing which can persist longer than process life time. My current opinion is
> that this is best solved by introducing an extra cgroup layer to represent
> the persistent entity and put the per-instance cgroup under it.
>

It is not practical on k8s. Before the persistent entity is introduced,
the cgroup dir is stateless; after it, the cgroup dir is stateful.
Please don't keep turning a blind eye to k8s.

> It does require reorganizing how things are organized from userspace POV but
> the end result is really desirable. We get entities accurately representing
> what needs to be tracked and control over the granularity of accounting and
> control (e.g. folks who don't care about telling apart the current
> instance's usage can simply not enable controllers at the persistent entity
> level).
>

Please also think about why k8s refuses to use cgroup2.

> We surely can discuss other approaches but my current intuition is that it'd
> be really difficult to come up with a better solution than layering to
> introduce persistent service entities.
>
> So, please consider the approach nacked for the time being.
>

It doesn't make sense to nack it. I will explain why in my reply to
your other email.

--
Regards
Yafang