Re: Splitting pinned folios

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 13.03.24 04:16, Matthew Wilcox wrote:
On Tue, Mar 12, 2024 at 06:23:43PM -0700, Jane Chu wrote:
I noticed this recently

OK, this is entirely different, so I'm going to start a new thread ;-)

  * GUP pin and PG_locked transferred to @page. Rest subpages can be freed if
  * they are not mapped.
  *
  * Returns 0 if the hugepage is split successfully.
  * Returns -EBUSY if the page is pinned or if anon_vma disappeared from under
  * us.
  */
int split_huge_page_to_list(struct page *page, struct list_head *list)
{

I have a test case with poisoned shmem THP page that was mlocked and

GUP pinned (FOLL_LONGTERM|FOLL_WRITE), but the split succeeded.

I'm going to blame John for this!

The description is wrong. Whoever calls split_huge_page_to_list() must hold a folio reference.

That folio reference will be transferred to @page (not the head page) once split. So @page can be used by the caller after the split succeeded.


There's no reference to pincount
anywhere in huge_memory.c, so I have no clue how this comment is even

Each pincount increment/decrement must be paired with a folio refcount increment. Therefore, no pincount checks are required.

close to true, nor do I understand how it could be done, since we don't
know which pages in a folio are pinned.

As the description correctly says: "Returns -EBUSY if the page is pinned".

If that is not true, we'd have a real issue.


I think we have to prohibit splits of folios that are GUP pinned.


In split_huge_page_to_list(), we make sure there are no additional folio references of any kind (GUP pin, whatsoever).

can_split_folio() is racy but catches most of that. Then, we do the folio_ref_freeze().

So I don't see how that could ever work with additional folio references (including GUP pins). Unless serious BUG somewhere else.

In essence: we expect on a folio after completely unmapping it:
* 1 reference from the caller of split_huge_page_to_list()
* pagecache: 1 reference per subpage from the pagecache
* anon: 1 reference per subage from the swapcache if in the swapcache

Any additional reference would lead to a split failure.

We're holding the folio lock, so for anon folios we cannot remove it from the swapcache concurrently.

For pagecache folios ... dunno :) I expect some folio-lock magic as well.

Reading "I have a test case with poisoned shmem THP page that was mlocked and GUP pinned (FOLL_LONGTERM|FOLL_WRITE), but the split succeeded."

If that is indeed true, I assume that page poisoning might have done something very wrong with the large folio: for example, partially unmap it from the pagecache (if that's even possible?) or accidentally drop a folio reference. Then, we'd be missing to detecting the GUP pin when freezing the refcount.

... any chance we can get the reproducer? [reading this mail from Willy only]

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux