Re: [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present

Thomas Hellström (VMware) <thomas_os@xxxxxxxxxxxx> · Thu, 10 Oct 2019 08:15:03 +0200

On 10/10/19 4:07 AM, Linus Torvalds wrote:
On Wed, Oct 9, 2019 at 6:10 PM Thomas Hellström (VMware)
<thomas_os@xxxxxxxxxxxx> wrote:
Your original patch does exactly the same!
Oh, no. You misread my original patch.

Look again.

The logic in my original patch was very different. It said that

  - *if* we have a pmd_entry function, then we obviously call that one.

     And if - after calling the pmd_entry function - we are still a
hugepage, then we skip the pte_entry case entirely.

    And part of skipping is obviously "don't split" - but it never had
that "don't split and then call the pte walker" case.

  - and if we *don't* have a pmd_entry function, but we do have a
pte_entry function, then we always split before calling it.

Notice the difference?

From what I can tell, my patch is doing the same. At least that always 
was the intention. To determine whether to skip pte and skip split, your 
patch uses

                        /* No pte level at all? */
                        if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd)
|| pmd_devmap(*pmd))
                                continue;

whereas my patch does

                        if (pmd_trans_unstable(pmd))
                                goto again;
			/* Fall through */

which is the same (pmd_trans_unstable() is the same test as you do, but 
not racy). Yes, it's missing the test for pmd_devmap() but I think 
that's an mm bug been discussed elsewhere, and we also rerun because a 
huge / none pmd at this (FALLBACK) point is probably a race and unintended.

But I think the "change pmd_entry to have a sane return code" is a
simpler and more flexible model, and then the pmd_entry code can just
let the pte walker split the pmd if needed.

OK, let's aim for that then.

Thanks,

Thomas

So I liked that part of your patch.

            Linus