Re: [PATCH 6/8] mm/memory-failure: Convert memory_failure() to use a folio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/15/2024 1:32 AM, Miaohe Lin wrote:

On 2024/3/13 9:23, Jane Chu wrote:
On 3/12/2024 7:14 AM, Matthew Wilcox wrote:

On Tue, Mar 12, 2024 at 03:07:39PM +0800, Miaohe Lin wrote:
On 2024/3/11 20:31, Matthew Wilcox wrote:
Assuming we have a refcount on this page so it can't be simultaneously
split/freed/whatever, these three sequences are equivalent:
If page is stable after page refcnt is held, I agree below three sequences are equivalent.

1    if (PageCompound(p))

2    struct page *head = compound_head(p);
2    if (PageHead(head))

3    struct folio *folio = page_folio(p);
3    if (folio_test_large(folio))

.

But please see below commit:

"""
commit f37d4298aa7f8b74395aa13c728677e2ed86fdaf
Author: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Date:   Wed Aug 6 16:06:49 2014 -0700

      hwpoison: fix race with changing page during offlining

      When a hwpoison page is locked it could change state due to parallel
      modifications.  The original compound page can be torn down and then
      this 4k page becomes part of a differently-size compound page is is a
      standalone regular page.

      Check after the lock if the page is still the same compound page.
I can't speak to what the rules were ten years ago, but this is not
true now.  Compound pages cannot be split if you hold a refcount.
Since we don't track a per-page refcount, we wouldn't know which of
the split pages to give the excess refcount to.
I noticed this recently

  * GUP pin and PG_locked transferred to @page. Rest subpages can be freed if
  * they are not mapped.
  *
  * Returns 0 if the hugepage is split successfully.
  * Returns -EBUSY if the page is pinned or if anon_vma disappeared from under
  * us.
  */
int split_huge_page_to_list(struct page *page, struct list_head *list)
{

I have a test case with poisoned shmem THP page that was mlocked and

GUP pinned (FOLL_LONGTERM|FOLL_WRITE), but the split succeeded.
Can you elaborate your test case a little bit more detail? There is a check in split_huge_page_to_list():

/* Racy check whether the huge page can be split */
bool can_split_folio(struct folio *folio, int *pextra_pins)
{
	int extra_pins;

	/* Additional pins from page cache */
	if (folio_test_anon(folio))
		extra_pins = folio_test_swapcache(folio) ?
				folio_nr_pages(folio) : 0;
	else
		extra_pins = folio_nr_pages(folio);
	if (pextra_pins)
		*pextra_pins = extra_pins;
	return folio_mapcount(folio) == folio_ref_count(folio) - extra_pins - 1;
}

So a large folio can only be split if only one extra page refcnt is held. It means large folio won't be split from
under us if we hold an page refcnt. Or am I miss something?
My experiment was with an older kernel, though the can_split check is the same. Also, I was emulating GUP pin with a hack:  in madvise_inject_error(), replaced
get_user_pages_fast(start, 1, 0, &page) with
pin_user_pages_fast(start, 1, FOLL_WRITE|FOLL_LONGTERM, &page)
I suspect something might be wrong with my hack, I'm trying to reproduce with real GUP pin and on a newer kernel.
Will keep you informed.
thanks!
-jane



Thanks.

thanks,

-jane

.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux