On Tue, Jul 19, 2022 at 12:38 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> On Tue, Jul 19, 2022 at 12:30:17PM -0700, Yosry Ahmed wrote:
> > Is there a reason why these resources cannot be moved across cgroups
> > dynamically? The only scenario I imagine is if you already have tmpfs
> > mounted and files charged to different cgroups, but once you attribute
> > tmpfs to one cgroup.charge_for.tmpfs (or sticky,..), I assume that we
> > can dynamically move the resources, right?
> >
> > In fact, is there a reason why we can't move the tmpfs charges in that
> > scenario as well? When we move processes we loop their pages tables
> > and move pages and their stats, is there a reason why we wouldn't be
> > able to do this with tmpfs mounts or bpf maps as well?
>
> Nothing is impossible but nothing is free as well. Moving charges around
> traditionally caused a lot of headaches in the past and never became
> reliable. There are inherent trade-offs here. You can make things more
> dynamic usually by making hot paths more expensive or doing some
> synchronization dancing which tends to be pretty hairy. People generally
> don't wanna make hot paths slower, so we tend to end up with something
> twisted which unfortunately turns out to be a headache in the long term.
>
> In general, I'd rather keep resource associations as static as possible.
> It's okay if we do something neat inside the kernel but if we create
> userspace expectation that resources can be moved around dynamically, we'll
> be stuck with that for a long time likely forfeiting future simplification /
> optimization opportunities.
>
> So, that's gonna be a fairly strong nack from my end.
>

Hmm, sorry I might be missing something but I don't think we have the
same thing in mind?
My understanding is that the sysadmin can do something like this, which
is relatively inexpensive to implement in the kernel:

    mount -t tmpfs tmpfs /mnt/mymountpoint
    echo "/mnt/mymountpoint" > /path/to/cgroup/cgroup.charge_for.tmpfs

At that point all tmpfs charges for this tmpfs are directed to
/path/to/cgroup/memory.current. Then the sysadmin can do something
like:

    echo "/mnt/mymountpoint" > /path/to/cgroup2/cgroup.charge_for.tmpfs

At that point all _future_ charges of that tmpfs will go to
cgroup2/memory.current. All existing charges remain at
cgroup/memory.current and get uncharged from there. Per my
understanding there is no need to move all the _existing_ charges from
cgroup/memory.current to cgroup2/memory.current.

Sorry, I don't mean to be insistent, I just wanted to make sure we have
the same thing in mind. Speaking for ourselves, we have a very similar
implementation locally and it is perfectly usable (and in fact
addresses a number of pain points related to shared memory charging)
without dynamically moving existing charges on reassignment (the
second echo in my example).

> Thanks.
>
> --
> tejun
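
The reassignment semantics described in the reply above (future charges
follow the new target, existing charges stay with the cgroup that was
charged and are uncharged from it) can be sketched as a toy userspace
model. This is only an illustration under stated assumptions, not kernel
code: the names Cgroup, Page, TmpfsMount, reassign, charge and uncharge
are hypothetical, memory_current merely stands in for memory.current
counted in pages, and freeing the oldest pages first is an arbitrary
choice made to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class Cgroup:
    name: str
    memory_current: int = 0  # stand-in for memory.current, counted in pages


@dataclass
class Page:
    owner: Cgroup  # the cgroup recorded when the page was charged


@dataclass
class TmpfsMount:
    # Hypothetical model of the target written to cgroup.charge_for.tmpfs.
    charge_target: Cgroup
    pages: list = field(default_factory=list)

    def reassign(self, cgroup: Cgroup) -> None:
        # Only the target for future charges changes; existing pages
        # keep their recorded owner and nothing is moved.
        self.charge_target = cgroup

    def charge(self, npages: int) -> None:
        # New pages are charged to whatever the current target is.
        for _ in range(npages):
            self.charge_target.memory_current += 1
            self.pages.append(Page(owner=self.charge_target))

    def uncharge(self, npages: int) -> None:
        # Assumption: free the oldest pages first. Each page is uncharged
        # from the cgroup it was originally charged to.
        for _ in range(min(npages, len(self.pages))):
            page = self.pages.pop(0)
            page.owner.memory_current -= 1


cg1 = Cgroup("cgroup")
cg2 = Cgroup("cgroup2")

mnt = TmpfsMount(charge_target=cg1)  # first echo in the example
mnt.charge(3)
mnt.reassign(cg2)                    # second echo in the example
mnt.charge(2)
print(cg1.memory_current, cg2.memory_current)  # 3 2

mnt.uncharge(3)                      # the three oldest pages are freed
print(cg1.memory_current, cg2.memory_current)  # 0 2
```

In this model reassignment is a single pointer update, with no walk
over existing pages.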