On 3/5/21 1:11 AM, Michal Hocko wrote:
> On Thu 04-03-21 09:35:08, Tim Chen wrote:
>>
>>
>> On 2/18/21 11:13 AM, Michal Hocko wrote:
>>
>>>
>>> Fixes: 4e41695356fb ("memory controller: soft limit reclaim on contention")
>>> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>>>
>>> Thanks!
>>>> ---
>>>>  mm/memcontrol.c | 6 +++++-
>>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>>> index ed5cc78a8dbf..a51bf90732cb 100644
>>>> --- a/mm/memcontrol.c
>>>> +++ b/mm/memcontrol.c
>>>> @@ -3505,8 +3505,12 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>>>>  			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
>>>>  			break;
>>>>  	} while (!nr_reclaimed);
>>>> -	if (next_mz)
>>>> +	if (next_mz) {
>>>> +		spin_lock_irq(&mctz->lock);
>>>> +		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
>>>> +		spin_unlock_irq(&mctz->lock);
>>>>  		css_put(&next_mz->memcg->css);
>>>> +	}
>>>>  	return nr_reclaimed;
>>>>  }
>>>>
>>>> --
>>>> 2.20.1
>>>
>>
>> Mel,
>>
>> Reviewing this patch a bit more, I realize that there is a chance that the removed
>> next_mz could be inserted back to the tree from a memcg_check_events
>> that happen in between. So we need to make sure that the next_mz
>> is indeed off the tree and update the excess value before adding it
>> back. Update the patch to the patch below.
>
> This scenario is certainly possible but it shouldn't really matter much
> as __mem_cgroup_insert_exceeded bails out when the node is on the tree
> already.
>

Makes sense. We should still update the excess value with

+		excess = soft_limit_excess(next_mz->memcg);
+		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);

before doing insertion. The excess value was recorded from previous
mz in the loop and needs to be updated to that of next_mz.

Tim
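
P.S. To illustrate the stale-excess point, here is a minimal userspace sketch.
It is not kernel code: every model_* name and struct below is a hypothetical
stand-in, the tree is reduced to an on_tree flag, and locking is left out. It
only shows that the re-insert should use next_mz's own excess, and that an
insert is a no-op when the node is already on the tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg: only the two fields the sketch needs. */
struct model_memcg {
	unsigned long usage;      /* pages currently charged */
	unsigned long soft_limit; /* soft limit, in pages */
};

/* Hypothetical stand-in for the per-node memcg tree entry. */
struct model_mz {
	struct model_memcg *memcg;
	bool on_tree;
	unsigned long usage_in_excess;
};

/* Pages over the soft limit, or 0 when at or under it. */
static unsigned long model_soft_limit_excess(const struct model_memcg *memcg)
{
	if (memcg->usage > memcg->soft_limit)
		return memcg->usage - memcg->soft_limit;
	return 0;
}

/* Bails out when the node is already on the tree, as discussed above. */
static void model_insert_exceeded(struct model_mz *mz, unsigned long excess)
{
	if (mz->on_tree)
		return;
	mz->usage_in_excess = excess;
	mz->on_tree = true; /* the real code would link into an rbtree here */
}

/* Re-insert next_mz with a freshly computed excess, not a stale one. */
static void model_reinsert(struct model_mz *next_mz)
{
	unsigned long excess = model_soft_limit_excess(next_mz->memcg);

	model_insert_exceeded(next_mz, excess);
}

/* Small walk-through: the previous mz is 500 pages over, next_mz only 20. */
int model_demo(void)
{
	struct model_memcg prev_cg = { .usage = 1500, .soft_limit = 1000 };
	struct model_memcg next_cg = { .usage = 120, .soft_limit = 100 };
	struct model_mz next_mz = { .memcg = &next_cg, .on_tree = false };

	/* This is the value that would be left over from the loop. */
	unsigned long stale = model_soft_limit_excess(&prev_cg);

	model_reinsert(&next_mz);
	assert(next_mz.on_tree);
	assert(next_mz.usage_in_excess == 20);
	assert(next_mz.usage_in_excess != stale);

	/* A second insert with the stale value is ignored: already on tree. */
	model_insert_exceeded(&next_mz, stale);
	assert(next_mz.usage_in_excess == 20);
	return 0;
}
```

Calling model_demo() runs through both cases. In the real patch the
recomputation and the insert would sit together under mctz->lock.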