On Fri, Mar 19, 2021 at 05:37:05PM -0700, Hugh Dickins wrote: > On Fri, 19 Mar 2021, Johannes Weiner wrote: > > > The swapaccounting= commandline option already does very little > > today. To close a trivial containment failure case, the swap ownership > > tracking part of the swap controller has recently become mandatory > > (see commit 2d1c498072de ("mm: memcontrol: make swap tracking an > > integral part of memory control") for details), which makes up the > > majority of the work during swapout, swapin, and the swap slot map. > > > > The only thing left under this flag is the page_counter operations and > > the visibility of the swap control files in the first place, which are > > rather meager savings. There also aren't many scenarios, if any, where > > controlling the memory of a cgroup while allowing it unlimited access > > to a global swap space is a workable resource isolation stragegy. > > > > On the other hand, there have been several bugs and confusion around > > the many possible swap controller states (cgroup1 vs cgroup2 behavior, > > memory accounting without swap accounting, memcg runtime disabled). > > > > This puts the maintenance overhead of retaining the toggle above its > > practical benefits. Deprecate it. > > > > Suggested-by: Shakeel Butt <shakeelb@xxxxxxxxxx> > > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> > > This crashes, and needs a fix: see below (plus some nits). > > But it's a very welcome cleanup: just getting rid of all those > !cgroup_memory_noswap double negatives is a relief in itself. > > It does suggest eliminating CONFIG_MEMCG_SWAP altogether (just > using #ifdef CONFIG_SWAP instead, in those parts of CONFIG_MEMCG code); > but you're right that's a separate cleanup, and not nearly so worthwhile > as this one (I notice CONFIG_MEMCG_SWAP in some of the arch defconfigs, > and don't know whether whoever removes CONFIG_MEMCG_SWAP would be > obligated to remove those too). 2d1c498072de ("mm: memcontrol: make swap tracking an integral part of memory control") made it invisible to the user already, I only kept the symbol for convenience in the Makefile: obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o But I guess we could replace it with ifdef CONFIG_SWAP obj-$(CONFIG_MEMCG) += swap_cgroup.c endif and I agree, everywhere else it's currently used would be nicer without it. I'll send a separate patch. > > @@ -99,7 +92,11 @@ static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); > > /* Whether legacy memory+swap accounting is active */ > > static bool do_memsw_account(void) > > { > > - return !cgroup_subsys_on_dfl(memory_cgrp_subsys) && !cgroup_memory_noswap; > > + /* cgroup2 doesn't do mem+swap accounting */ > > + if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > + return false; > > + > > + return true; > > Nit: I'm not fond of the "if (boolean()) return true; else return false;" > codestyle, and would prefer the straightforward > > return !cgroup_subsys_on_dfl(memory_cgrp_subsys); > > but you've chosen otherwise, so, okay. I prefer a series of branches if a single expression becomes unwieldy, and individual conditions deserve their own comments. But it's indeed pretty silly in this case, I'll make it a single line. > > @@ -7124,7 +7121,7 @@ long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg) > > { > > long nr_swap_pages = get_nr_swap_pages(); > > > > - if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > That needs to check mem_cgroup_disabled() where it used to check > cgroup_memory_noswap. The convolutions are confusing (which is > precisely why this is such a welcome cleanup), but without a > mem_cgroup_disabled() (or NULL memcg) check there, the > cgroup_disable=memory case oopses on NULLish pointer dereference > when mem_cgroup_get_nr_swap_pages() is called from get_scan_count(). > > (This little function has been repeatedly troublesome that way.) Sigh, yes, this will hopefully be the last instance of this bug... Thanks for catching that, I'll fix it up. > > @@ -7141,7 +7138,7 @@ bool mem_cgroup_swap_full(struct page *page) > > > > if (vm_swap_full()) > > return true; > > - if (cgroup_memory_noswap || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) > > Would it now be better to say "if (do_memsw_account())" there? Yes, I'll change that. > Or would it be better to eliminate do_memsw_account() altogether, > and just use !cgroup_subsys_on_dfl(memory_cgrp_subsys) throughout? I have found do_memsw_account() useful in the past to find those related pieces. The details elude me know but I remember searching for this string often during the recent work in this area. > (Though I don't find "cgroup_subsys_on_dfl" very informative.) This routinely bothers me as well, but I have not been able to come up with a good replacement. memcg_v1()? memcg_legacy_mode()? memory_cgroup_in_bytes()? > > @@ -7291,27 +7287,13 @@ static struct cftype memsw_files[] = { > > { }, /* terminate */ > > }; > > > > -/* > > - * If mem_cgroup_swap_init() is implemented as a subsys_initcall() > > - * instead of a core_initcall(), this could mean cgroup_memory_noswap still > > - * remains set to false even when memcg is disabled via "cgroup_disable=memory" > > - * boot parameter. This may result in premature OOPS inside > > - * mem_cgroup_get_nr_swap_pages() function in corner cases. > > - */ > > static int __init mem_cgroup_swap_init(void) > > { > > - /* No memory control -> no swap control */ > > - if (mem_cgroup_disabled()) > > - cgroup_memory_noswap = true; > > - > > - if (cgroup_memory_noswap) > > - return 0; > > - > > Shakeel suggested "if (mem_cgroup_disabled()) return 0;" here, > and that was the first thing I tried when I got the crash. It really isn't obviously safe. I'll put it back in.