Re: [PATCH v13 2/7] mm: factor helpers for memory_failure_dev_pagemap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





在 2022/4/21 14:13, HORIGUCHI NAOYA(堀口 直也) 写道:
On Tue, Apr 19, 2022 at 12:50:40PM +0800, Shiyang Ruan wrote:
memory_failure_dev_pagemap code is a bit complex before introduce RMAP
feature for fsdax.  So it is needed to factor some helper functions to
simplify these code.

Signed-off-by: Shiyang Ruan <ruansy.fnst@xxxxxxxxxxx>
Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>

Thanks for the refactoring.  As I commented to 0/7, the conflict with
"mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()"
can be trivially resolved.

Another few comment below ...

---
  mm/memory-failure.c | 157 ++++++++++++++++++++++++--------------------
  1 file changed, 87 insertions(+), 70 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e3fbff5bd467..7c8c047bfdc8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1498,6 +1498,90 @@ static int try_to_split_thp_page(struct page *page, const char *msg)
  	return 0;
  }

+static void unmap_and_kill(struct list_head *to_kill, unsigned long pfn,
+		struct address_space *mapping, pgoff_t index, int flags)
+{
+	struct to_kill *tk;
+	unsigned long size = 0;
+
+	list_for_each_entry(tk, to_kill, nd)
+		if (tk->size_shift)
+			size = max(size, 1UL << tk->size_shift);
+
+	if (size) {
+		/*
+		 * Unmap the largest mapping to avoid breaking up device-dax
+		 * mappings which are constant size. The actual size of the
+		 * mapping being torn down is communicated in siginfo, see
+		 * kill_proc()
+		 */
+		loff_t start = (index << PAGE_SHIFT) & ~(size - 1);
+
+		unmap_mapping_range(mapping, start, size, 0);
+	}
+
+	kill_procs(to_kill, flags & MF_MUST_KILL, false, pfn, flags);
+}
+
+static int mf_generic_kill_procs(unsigned long long pfn, int flags,
+		struct dev_pagemap *pgmap)
+{
+	struct page *page = pfn_to_page(pfn);
+	LIST_HEAD(to_kill);
+	dax_entry_t cookie;
+	int rc = 0;
+
+	/*
+	 * Pages instantiated by device-dax (not filesystem-dax)
+	 * may be compound pages.
+	 */
+	page = compound_head(page);
+
+	/*
+	 * Prevent the inode from being freed while we are interrogating
+	 * the address_space, typically this would be handled by
+	 * lock_page(), but dax pages do not use the page lock. This
+	 * also prevents changes to the mapping of this pfn until
+	 * poison signaling is complete.
+	 */
+	cookie = dax_lock_page(page);
+	if (!cookie)
+		return -EBUSY;
+
+	if (hwpoison_filter(page)) {
+		rc = -EOPNOTSUPP;
+		goto unlock;
+	}
+
+	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
+		/*
+		 * TODO: Handle HMM pages which may need coordination
+		 * with device-side memory.
+		 */
+		return -EBUSY;

Don't we need to go to dax_unlock_page() as the origincal code do?

+	}
+
+	/*
+	 * Use this flag as an indication that the dax page has been
+	 * remapped UC to prevent speculative consumption of poison.
+	 */
+	SetPageHWPoison(page);
+
+	/*
+	 * Unlike System-RAM there is no possibility to swap in a
+	 * different physical page at a given virtual address, so all
+	 * userspace consumption of ZONE_DEVICE memory necessitates
+	 * SIGBUS (i.e. MF_MUST_KILL)
+	 */
+	flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
+	collect_procs(page, &to_kill, true);
+
+	unmap_and_kill(&to_kill, pfn, page->mapping, page->index, flags);
+unlock:
+	dax_unlock_page(page, cookie);
+	return rc;
+}
+
  /*
   * Called from hugetlb code with hugetlb_lock held.
   *
@@ -1644,12 +1728,8 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
  		struct dev_pagemap *pgmap)
  {
  	struct page *page = pfn_to_page(pfn);
-	unsigned long size = 0;
-	struct to_kill *tk;
  	LIST_HEAD(tokill);

Is this variable unused in this function?

Yes, this one and the one above are mistakes I didn't notice when I resolving conflicts with the newer next- branch. I'll fix them in next version.


--
Thanks,
Ruan.


Thanks,
Naoya Horiguchi





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux