Re: [PATCH 2/3] memcg/hugetlb: Introduce mem_cgroup_charge_hugetlb

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Nov 09, 2024 at 01:58:46PM -0500, Joshua Hahn wrote:
[...]
> 
> > > +             free_huge_folio(folio);
> >
> > free_huge_folio() will call lruvec_stat_mod_folio() unconditionally but
> > you are only calling it on success. This may underflow the metric.
> 
> I was actually thinking about this too. I was wondering what would
> make sense -- in the original draft of this patch, I had the charge
> increment be called unconditionally as well. The idea was that
> even though it would not make sense to have the stat incremented
> when there is an error, it would eventually be corrected by
> free_huge_folio's decrement. However, because there is nothing
> stopping the user from checking the stat in this period, they may
> temporarily see that the value is incremented in memory.stat,
> even though they were not able to obtain this page.
> 

On the alloc path, unconditionally calling lruvec_stat_mod_folio() after
mem_cgroup_charge_hugetlb() but before free_huge_folio() is fine. Please
note that if mem_cgroup_charge_hugetlb() has failed,
lruvec_stat_mod_folio() will not be incrementing the memcg stat. The
system level stats may get elevated for small time window but that is
fine.

> With that said, maybe it makes sense to increment unconditionally,
> if free also decrements unconditionally. This race condition is
> not something that will cause a huge problem for the user,
> although users relying on userspace monitors for memory.stat
> to handle memory management may see some problems.
> 
> Maybe what would make the most sense is to do both
> incrementing & decrementing conditionally as well.
> Thank you for your feedback, I will iterate on this for the next version!
> 
> > > +int mem_cgroup_charge_hugetlb(struct folio *folio, gfp_t gfp)
> > > +{
> > > +     struct mem_cgroup *memcg = get_mem_cgroup_from_current();
> > > +     int ret = 0;
> > > +
> > > +     if (mem_cgroup_disabled() || !memcg_accounts_hugetlb() ||
> > > +             !memcg || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > > +             ret = -EOPNOTSUPP;
> >
> > why EOPNOTSUPP? You need to return 0 here. We do want
> > lruvec_stat_mod_folio() to be called.
> 
> In this case, I was just preserving the original code's return statements.
> That is, in mem_cgroup_hugetlb_try_charge, the intended behavior
> was to return -EOPNOTSUPP if any of these conditions were met.
> If I understand the code correctly, calling lruvec_stat_mod_folio()
> on this failure will be a noop, since either memcg doesn't account
> hugetlb folios / there is no memcg / memcg is disabled.
> 

lruvec_stat_mod_folio() does manipulate system level stats, so even if
memcg accounting of hugetlb is not enabled we do want system level stats
get updated.





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux