On Wed, Feb 16, 2022 at 04:51:14PM +0100, Sebastian Andrzej Siewior wrote: > On 2022-02-14 11:23:11 [-0500], Johannes Weiner wrote: > > Hi Sebastian, > Hi Johannes, > > > This is a bit dubious in terms of layering. It's an objcg operation, > > but what's "locked" isn't the objcg, it's the underlying stock. That > > function then looks it up again, even though we have it right there. > > > > You can open-code it and factor out the stock operation instead, and > > it makes things much simpler and clearer. > > > > I.e. something like this (untested!): > > This then: > > ------>8------ > > From: Johannes Weiner <hannes@xxxxxxxxxxx> > Date: Wed, 16 Feb 2022 13:25:49 +0100 > Subject: [PATCH] mm/memcg: Opencode the inner part of > obj_cgroup_uncharge_pages() in drain_obj_stock() > > Provide the inner part of refill_stock() as __refill_stock() without > disabling interrupts. This eases the integration of local_lock_t where > recursive locking must be avoided. > Open code obj_cgroup_uncharge_pages() in drain_obj_stock() and use > __refill_stock(). The caller of drain_obj_stock() already disables > interrupts. > > [bigeasy: Patch body around Johannes' diff ] > > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> I thought you'd fold it into yours, but separate patch works too, thanks! Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> One important note, though: > @@ -3151,8 +3155,17 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock) > unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT; > unsigned int nr_bytes = stock->nr_bytes & (PAGE_SIZE - 1); > > - if (nr_pages) > - obj_cgroup_uncharge_pages(old, nr_pages); > + if (nr_pages) { > + struct mem_cgroup *memcg; > + > + memcg = get_mem_cgroup_from_objcg(old); > + > + if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) > + page_counter_uncharge(&memcg->kmem, nr_pages); > + __refill_stock(memcg, nr_pages); This doesn't take "memcg: add per-memcg total kernel memory stat" queued in -mm into account and so will break kmem accounting. Make sure to rebase the patches to the -mm tree before sending it out. You can find it here: https://github.com/hnaz/linux-mm