On Mon, 28 Oct 2024, David Hildenbrand wrote: > Hi Hugh, > > mostly looks good to me, one comment: Thanks... > > > +++ b/mm/memcontrol-v1.c > > @@ -848,6 +848,8 @@ static int mem_cgroup_move_account(struct folio *folio, > > css_get(&to->css); > > css_put(&from->css); > > + /* Warning should never happen, so don't worry about refcount non-0 */ > > + WARN_ON_ONCE(folio_unqueue_deferred_split(folio)); > > folio->memcg_data = (unsigned long)to; > > > > __folio_memcg_unlock(from); > > @@ -1217,7 +1219,9 @@ static int mem_cgroup_move_charge_pte_range(pmd_t > > *pmd, > > enum mc_target_type target_type; > > union mc_target target; > > struct folio *folio; > > + bool tried_split_before = false; > > +retry_pmd: > > ptl = pmd_trans_huge_lock(pmd, vma); > > if (ptl) { > > if (mc.precharge < HPAGE_PMD_NR) { > > @@ -1227,6 +1231,27 @@ static int mem_cgroup_move_charge_pte_range(pmd_t > > *pmd, > > target_type = get_mctgt_type_thp(vma, addr, *pmd, &target); > > if (target_type == MC_TARGET_PAGE) { > > folio = target.folio; > > + /* > > + * Deferred split queue locking depends on memcg, > > + * and unqueue is unsafe unless folio refcount is 0: > > + * split or skip if on the queue? first try to split. > > + */ > > + if (!list_empty(&folio->_deferred_list)) { > > + spin_unlock(ptl); > > + if (!tried_split_before) > > + split_folio(folio); > > + folio_unlock(folio); > > + folio_put(folio); > > + if (tried_split_before) > > + return 0; > > + tried_split_before = true; > > + goto retry_pmd; > > + } > > + /* > > + * So long as that pmd lock is held, the folio cannot > > + * be racily added to the _deferred_list, because > > + * __folio_remove_rmap() will find !partially_mapped. > > + */ > > Fortunately that code is getting ripped out. Yes, and even more fortunately, we're in time to fix its final incarnation! > > https://lkml.kernel.org/r/20241025012304.2473312-3-shakeel.butt@xxxxxxxxx > > So I wonder ... as a quick fix should we simply handle it like the code > further down where we refuse PTE-mapped large folios completely? (I went through the same anxiety attack as you did, wondering what happens to the large-but-not-PMD-large folios: then noticed it's safe as you did. The v1 commit message had a paragraph pondering whether the deprecated code will need a patch to extend it for the new feature: but once Shakeel posted the ripout, I ripped out that paragraph - no longer any need for an answer.) > > "ignore such a partial THP and keep it in original memcg" > > ... > > and simply skip this folio similarly? I mean, it's a corner case either way. I certainly considered that option: it's known to give up like that for many reasons. But my thinking (in the commit message) was "Not ideal, but moving charge has been requested, and khugepaged should repair the THP later" - if someone is still using move_charge_at_immigrate, I thought this change would generate fewer surprises - that huge charge likely to be moved as it used to be. Hugh