Re: [PATCH hotfix v2 2/2] mm/thp: fix deferred split unqueue naming and locking

On Mon, 28 Oct 2024, David Hildenbrand wrote:

> Hi Hugh,
> 
> mostly looks good to me, one comment:

Thanks...

> 
> > +++ b/mm/memcontrol-v1.c
> > @@ -848,6 +848,8 @@ static int mem_cgroup_move_account(struct folio *folio,
> >  	css_get(&to->css);
> >  	css_put(&from->css);
> >  
> > +	/* Warning should never happen, so don't worry about refcount non-0 */
> > +	WARN_ON_ONCE(folio_unqueue_deferred_split(folio));
> >  	folio->memcg_data = (unsigned long)to;
> >  
> >  	__folio_memcg_unlock(from);
> > @@ -1217,7 +1219,9 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
> >  	enum mc_target_type target_type;
> >  	union mc_target target;
> >  	struct folio *folio;
> > +	bool tried_split_before = false;
> >  
> > +retry_pmd:
> >  	ptl = pmd_trans_huge_lock(pmd, vma);
> >  	if (ptl) {
> >  		if (mc.precharge < HPAGE_PMD_NR) {
> > @@ -1227,6 +1231,27 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
> >  		target_type = get_mctgt_type_thp(vma, addr, *pmd, &target);
> >  		if (target_type == MC_TARGET_PAGE) {
> >  			folio = target.folio;
> > +			/*
> > +			 * Deferred split queue locking depends on memcg,
> > +			 * and unqueue is unsafe unless folio refcount is 0:
> > +			 * split or skip if on the queue? first try to split.
> > +			 */
> > +			if (!list_empty(&folio->_deferred_list)) {
> > +				spin_unlock(ptl);
> > +				if (!tried_split_before)
> > +					split_folio(folio);
> > +				folio_unlock(folio);
> > +				folio_put(folio);
> > +				if (tried_split_before)
> > +					return 0;
> > +				tried_split_before = true;
> > +				goto retry_pmd;
> > +			}
> > +			/*
> > +			 * So long as that pmd lock is held, the folio cannot
> > +			 * be racily added to the _deferred_list, because
> > +			 * __folio_remove_rmap() will find !partially_mapped.
> > +			 */
> 
> Fortunately that code is getting ripped out.

Yes, and even more fortunately, we're in time to fix its final incarnation!

> 
> https://lkml.kernel.org/r/20241025012304.2473312-3-shakeel.butt@xxxxxxxxx
> 
> So I wonder ... as a quick fix should we simply handle it like the code
> further down where we refuse PTE-mapped large folios completely?

(I went through the same anxiety attack as you did, wondering what
happens to the large-but-not-PMD-large folios: then noticed it's safe
as you did.  The v1 commit message had a paragraph pondering whether
the deprecated code will need a patch to extend it for the new feature:
but once Shakeel posted the ripout, I ripped out that paragraph -
no longer any need for an answer.)

> 
> "ignore such a partial THP and keep it in original memcg"
> 
> ...
> 
> and simply skip this folio similarly? I mean, it's a corner case either way.

I certainly considered that option: the move-charge code is known to give
up like that for many reasons.  But my thinking (in the commit message) was
"Not ideal, but moving charge has been requested, and khugepaged should
repair the THP later".  If someone is still using move_charge_at_immigrate,
I thought this change would generate fewer surprises, since that huge
charge is still likely to be moved as it used to be.

Hugh