On Wed, Aug 14, 2024 at 10:20 AM Mike Yuan <me@xxxxxxxxxxx> wrote: > > Currently, the behavior of zswap.writeback wrt. > the cgroup hierarchy seems a bit odd. Unlike zswap.max, > it doesn't honor the value from parent cgroups. This > surfaced when people tried to globally disable zswap writeback, > i.e. reserve physical swap space only for hibernation [1] - > disabling zswap.writeback only for the root cgroup results > in subcgroups with zswap.writeback=1 still performing writeback. > > The consistency became more noticeable after I introduced > the MemoryZSwapWriteback= systemd unit setting [2] for > controlling the knob. The patch assumed that the kernel would > enforce the value of parent cgroups. It could probably be > workarounded from systemd's side, by going up the slice unit > tree and inherit the value. Yet I think it's more sensible > to make it behave consistently with zswap.max and friends. May I ask you to add/clarify this new expected behavior in Documentation/admin-guide/cgroup-v2.rst? > > [1] https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation This is an interesting use case. Never envisioned this when I developed this feature :) > [2] https://github.com/systemd/systemd/pull/31734 > > Signed-off-by: Mike Yuan <me@xxxxxxxxxxx> > --- Personally, I don't feel too strongly about this one way or another. I guess you can make a case that people want to disable zswap writeback by default, and only selectively enable it for certain descendant workloads - for convenience, they would set memory.zswap.writeback == 0 at root, then enable it on selected descendants? It's not super expensive IMHO - we already perform upward traversal on every zswap store. This wouldn't be the end of the world. Yosry, Johannes - how do you two feel about this? Code looks solid to me - I think the upward tree traversal should be safe, as long as memcg is valid (since memcg holds reference to its parent IIRC).