Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility

On 4/28/19 12:13 PM, Yang Shi wrote:


On 4/23/19 10:52 AM, Michal Hocko wrote:
On Wed 24-04-19 00:43:01, Yang Shi wrote:
The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced the THPeligible bit in processes' smaps.  But when
checking the eligibility of a shmem vma, __transparent_hugepage_enabled()
is called and overrides the result from shmem_huge_enabled().  As a
result, the anonymous vma's THP flag may override shmem's.  For example,
when running a simple test which creates THP for shmem with anonymous THP
disabled, reading the process's smaps may show:

7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
Size:               4096 kB
...
[snip]
...
ShmemPmdMapped:     4096 kB
...
[snip]
...
THPeligible:    0

And, /proc/meminfo does show THP allocated and PMD mapped too:

ShmemHugePages:     4096 kB
ShmemPmdMapped:     4096 kB
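
The mismatch in the output above (PMD-mapped shmem pages reported while
THPeligible is 0) can be checked mechanically. The following is only a
minimal userspace sketch, not part of the patch: smaps_field() and
thp_false_negative() are hypothetical helper names, and the code assumes
it is handed the text of a single smaps entry with one "Key: value" pair
per line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the integer that follows "key" at the
 * start of a line in smaps-style text, or -1 if the key is absent or
 * has no numeric value.
 */
static long smaps_field(const char *text, const char *key)
{
    size_t klen;
    const char *line;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    for (line = text; line != NULL && *line != '\0'; ) {
        if (strncmp(line, key, klen) == 0) {
            long val;

            /* "%ld" skips the padding spaces after the colon */
            if (sscanf(line + klen, "%ld", &val) == 1)
                return val;
            return -1;
        }
        /* advance to the character after the next newline, if any */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/*
 * Hypothetical helper: 1 when the entry reports PMD-mapped shmem pages
 * while THPeligible is 0, i.e. the false negative described above.
 */
static int thp_false_negative(const char *entry)
{
    return smaps_field(entry, "ShmemPmdMapped:") > 0 &&
           smaps_field(entry, "THPeligible:") == 0;
}
```

A caller would read one entry from /proc/<pid>/smaps into a buffer and
pass it to thp_false_negative(); splitting the file into entries is left
out of the sketch.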

This result doesn't make much sense.  The anonymous THP flag should not
interfere with shmem THP.  Calling shmem_huge_enabled() together with a
check of MMF_DISABLE_THP sounds good enough.  And we can skip the stack
and dax vma checks since we have already checked that the vma is shmem.
Kirill, can we get a confirmation that this is really intended behavior
rather than an omission please? Is this documented? What is the global
knob to simply disable THP system wide?

Hi Kirill,

Ping. Any comment?

I talked with Kirill at LSFMM, and according to him this is more or less intended behavior. But we all agree it looks inconsistent.

So, we may have two options:
    - Just fix the false negative issue, as the patch does
    - Change the behavior to make it more consistent

I'm not sure whether anyone relies on the current behavior, explicitly or implicitly.

If we would like to change the behavior, I may consider taking a step further and refactoring the code a little bit to use huge_fault() to handle THP faults instead of falling back to handle_pte_fault() as the current implementation does. This may make adding THP for other filesystems easier.


Thanks,
Yang


I have to say that the THP tuning API is one giant mess :/

Btw. this patch also seems to fix khugepaged behavior because it previously
ignored both VM_NOHUGEPAGE and MMF_DISABLE_THP.

Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx>
Signed-off-by: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx>
---
v2: Check VM_NOHUGEPAGE per Michal Hocko

  mm/huge_memory.c | 4 ++--
  mm/shmem.c       | 3 +++
  2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 165ea46..5881e82 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -67,8 +67,8 @@ bool transparent_hugepage_enabled(struct vm_area_struct *vma)
  {
      if (vma_is_anonymous(vma))
          return __transparent_hugepage_enabled(vma);
-    if (vma_is_shmem(vma) && shmem_huge_enabled(vma))
-        return __transparent_hugepage_enabled(vma);
+    if (vma_is_shmem(vma))
+        return shmem_huge_enabled(vma);

      return false;
  }
diff --git a/mm/shmem.c b/mm/shmem.c
index 2275a0f..6f09a31 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3873,6 +3873,9 @@ bool shmem_huge_enabled(struct vm_area_struct *vma)
      loff_t i_size;
      pgoff_t off;

+    if ((vma->vm_flags & VM_NOHUGEPAGE) ||
+        test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
+        return false;
      if (shmem_huge == SHMEM_HUGE_FORCE)
          return true;
      if (shmem_huge == SHMEM_HUGE_DENY)
--
1.8.3.1





