Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Thu, 26 Jan 2023 15:14:25 -0300

On Thu, Jan 26, 2023 at 08:45:36AM +0100, Michal Hocko wrote:
> On Wed 25-01-23 15:22:00, Marcelo Tosatti wrote:
> [...]
> > Remote draining reduces interruptions whether or not the CPU
> > is marked as isolated:
> > 
> > - Allows isolated CPUs to benefit from pcp caching.
> > - Removes the interruption to non-isolated CPUs. See for example
> > 
> > https://lkml.org/lkml/2022/6/13/2769
> 
> This is talking about page allocator per-CPU caches, right? In this patch
> we are talking about memcg pcp caches. Are you sure the same applies
> here?

Both can stall the users of the drain operation.

"Minchan Kim tested this independently and reported;

	My workload is not NOHZ CPUs but run apps under heavy memory
	pressure so they goes to direct reclaim and be stuck on
	drain_all_pages until work on workqueue run."

Therefore, using a workqueue to drain memcg pcps also depends on the
remote CPU executing that work item in time; if it does not, the
callers listed below can stall. No?
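
To make the contrast concrete, here is a minimal userspace sketch of the
remote-draining idea. It is not the code from this series: struct pcp_stock,
stock_refill(), drain_stock_remote() and drain_all_stock_remote() are
hypothetical names, NR_CPUS is an arbitrary constant, and a pthread mutex
stands in for whatever lock the kernel side would use. The point it shows is
that the drainer takes each per-CPU lock itself and empties the cache from its
own context, so it never waits for the remote CPU to schedule a work item.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary value for this sketch */

/* Hypothetical stand-in for a per-CPU memcg stock; not the kernel struct. */
struct pcp_stock {
	pthread_mutex_t lock;	/* taken by the owner and by remote drainers */
	unsigned int nr_pages;	/* pages currently cached for this CPU */
};

static struct pcp_stock stocks[NR_CPUS];

static void stocks_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		pthread_mutex_init(&stocks[cpu].lock, NULL);
		stocks[cpu].nr_pages = 0;
	}
}

/* Owner-side path: add pages to the cache of the given CPU. */
static void stock_refill(int cpu, unsigned int nr_pages)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	pthread_mutex_lock(&stocks[cpu].lock);
	stocks[cpu].nr_pages += nr_pages;
	pthread_mutex_unlock(&stocks[cpu].lock);
}

/*
 * Drain one CPU's stock from the caller's context. No work item is
 * queued, so the caller does not depend on the remote CPU running it.
 * Returns the number of pages released.
 */
static unsigned int drain_stock_remote(int cpu)
{
	unsigned int drained;

	assert(cpu >= 0 && cpu < NR_CPUS);

	pthread_mutex_lock(&stocks[cpu].lock);
	drained = stocks[cpu].nr_pages;
	stocks[cpu].nr_pages = 0;
	pthread_mutex_unlock(&stocks[cpu].lock);

	return drained;
}

/* Drain every CPU's stock; returns the total number of pages released. */
static unsigned int drain_all_stock_remote(void)
{
	unsigned int total = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		total += drain_stock_remote(cpu);

	return total;
}
```

The trade-off in this sketch is that the owner path now takes a lock that
another CPU may contend on, in exchange for the drainer never blocking on
remote scheduling.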

===

   7   3141  mm/memory.c <<wp_page_copy>>
             if (mem_cgroup_charge(page_folio(new_page), mm, GFP_KERNEL))
   8   4118  mm/memory.c <<do_anonymous_page>>
             if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
   9   4577  mm/memory.c <<do_cow_fault>>
             if (mem_cgroup_charge(page_folio(vmf->cow_page), vma->vm_mm,
  10    621  mm/migrate_device.c <<migrate_vma_insert_page>>
             if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
  11    710  mm/shmem.c <<shmem_add_to_page_cache>>
             error = mem_cgroup_charge(folio, charge_mm, gfp);