Oscar and I have been exchanging a bit of email recently about the bug
reported here:
https://lore.kernel.org/all/ZXNhGsX32y19a2Xv@xxxxxxxxxxxxxxxxxxxx

I've come to the conclusion that folio_test_hugetlb() is just too
fragile, as it can give both false positives and false negatives, as
well as resulting in the above bug.  With this patch series, it becomes
a lot more robust.  In the memory-failure case, we always hold the
hugetlb_lock, so it's perfectly reliable.  In the compaction case, it's
unreliable, but the failures are acceptable and we recheck after taking
the hugetlb_lock.

The cost of this reliability is that we now consume the word I recently
freed in folio->page[1].  I think this is acceptable; we've still gained
a completely reliable folio_test_hugetlb() (which we didn't have before
I started messing around with the folio dtors).  Non-hugetlb users can
use large_id as a pointer to something else entirely, or even as a
non-pointer, as long as they can guarantee it can't conflict (ie don't
use it as a bitfield).

So far, this is working for me.  Some stress testing would be
appreciated.

Matthew Wilcox (Oracle) (5):
  hugetlb: Make folio_test_hugetlb safer to call
  hugetlb: Add hugetlb_pfn_folio
  memory-failure: Use hugetlb_pfn_folio
  memory-failure: Reorganise get_huge_page_for_hwpoison()
  compaction: Use hugetlb_pfn_folio in isolate_migratepages_block

 include/linux/hugetlb.h    | 13 ++-----
 include/linux/mm.h         |  8 -----
 include/linux/mm_types.h   |  4 ++-
 include/linux/page-flags.h | 25 +++----------
 kernel/vmcore_info.c       |  3 +-
 mm/compaction.c            | 16 ++++-----
 mm/huge_memory.c           | 10 ++----
 mm/hugetlb.c               | 72 +++++++++++++++++++++++++++++---------
 mm/memory-failure.c        | 14 +++++---
 9 files changed, 87 insertions(+), 78 deletions(-)

-- 
2.43.0
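
P.S. Purely as an illustration of the direction described above (this is
a toy userspace model, not code from the series; the exact field name
and sentinel are assumptions based on this cover letter), the reliable
test reduces to a single pointer comparison against a dedicated word in
the folio, rather than decoding page flags that other users may have
repurposed:

/* Toy model: compile with `cc -o model model.c` and run ./model */
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for a large folio; large_id models the word in page[1]. */
struct folio_model {
	bool large;		/* models folio_test_large() */
	const void *large_id;	/* NULL, or a type-identifying pointer */
};

/* Unique address meaning "this large folio belongs to hugetlb". */
static const char hugetlb_id;

static bool model_folio_test_hugetlb(const struct folio_model *folio)
{
	/* A plain compare: no flag bits another owner might reuse. */
	return folio->large && folio->large_id == &hugetlb_id;
}

int main(void)
{
	struct folio_model thp  = { .large = true, .large_id = NULL };
	struct folio_model huge = { .large = true, .large_id = &hugetlb_id };

	printf("thp:  %d\n", model_folio_test_hugetlb(&thp));	/* prints 0 */
	printf("huge: %d\n", model_folio_test_hugetlb(&huge));	/* prints 1 */
	return 0;
}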