Re: [PATCH] mm/memory-failure: remove shake_page()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/26/24 11:27 AM, Matthew Wilcox wrote:
On Fri, Apr 26, 2024 at 10:57:31AM -0700, Sidhartha Kumar wrote:
On 4/26/24 10:34 AM, Matthew Wilcox wrote:
On Fri, Apr 26, 2024 at 10:15:11AM -0700, Sidhartha Kumar wrote:
Use a folio in get_any_page() to save 5 calls to compound head and
convert the last user of shake_page() to shake_folio(). This allows us
to remove the shake_page() definition.

So I didn't do this before because I wasn't convinced it was safe.
We don't have a refcount on the folio, so the page might no longer
be part of this folio by the time we get the refcount on the folio.

I'd really like to see some argumentation for why this is safe.

If I moved down the folio = page_folio() line to after we verify
__get_hwpoison_page() has returned 1, which indicates the reference count
was successfully incremented via foliO_try_get(), that means the folio
conversion would happen after we have a refcount. In the case we don't call
__get_hwpoison_page(), that means the MF_COUNT_INCREASED flag is set. This
means the page has existing users so that path would be safe as well. So I
think this is safe after moving page_folio() after __get_hwpoison_page().

See if you can find a hole in this chain of reasoning ...

memory_failure()
         p = pfn_to_online_page(pfn);
         res = try_memory_failure_hugetlb(pfn, flags, &hugetlb);
(not a hugetlb)
         if (TestSetPageHWPoison(p)) {
(not already poisoned)
         if (!(flags & MF_COUNT_INCREASED)) {
                 res = get_hwpoison_page(p, flags);

get_hwpoison_page()
                 ret = get_any_page(p, flags);

get_any_page()
	folio = page_folio(page)

That would be unsafe, the safe way would be if we moved page_folio() after the call to __get_hw_poison() in get_any_page() and there would still be one remaining user of shake_page() that we can't convert. A safe version of this patch would result in a removal of one use of PageHuge() and two uses of put_page(), would that be worth submitting?

get_any_page()
	if(__get_hwpoison_page())
		folio = page_folio() /* folio_try_get() returned 1, safe */


Because we don't have a reference on the folio at this point (how could
we?), the folio might be split, and now we have a pointer to a folio
which no longer contains the page (assuming we had a hwerror in what
was a tail page at this time).





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux