Patch "mm: migrate: try again if THP split is failed due to page refcnt" has been added to the 6.1-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    mm: migrate: try again if THP split is failed due to page refcnt

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     mm-migrate-try-again-if-thp-split-is-failed-due-to-p.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 8834c3584cdce54f769ee76fdb530bc80d0689a4
Author: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Date:   Mon Oct 24 16:34:22 2022 +0800

    mm: migrate: try again if THP split is failed due to page refcnt
    
    [ Upstream commit fd4a7ac32918d3d7a2d17dc06c5520f45e36eb52 ]
    
    When creating a virtual machine, we will use memfd_create() to get a file
    descriptor which can be used to create share memory mappings using the
    mmap function, meanwhile the mmap() will set the MAP_POPULATE flag to
    allocate physical pages for the virtual machine.
    
    When allocating physical pages for the guest, the host can fallback to
    allocate some CMA pages for the guest when over half of the zone's free
    memory is in the CMA area.
    
    In guest os, when the application wants to do some data transaction with
    DMA, our QEMU will call VFIO_IOMMU_MAP_DMA ioctl to do longterm-pin and
    create IOMMU mappings for the DMA pages.  However, when calling
    VFIO_IOMMU_MAP_DMA ioctl to pin the physical pages, we found it will be
    failed to longterm-pin sometimes.
    
    After some invetigation, we found the pages used to do DMA mapping can
    contain some CMA pages, and these CMA pages will cause a possible failure
    of the longterm-pin, due to failed to migrate the CMA pages.  The reason
    of migration failure may be temporary reference count or memory allocation
    failure.  So that will cause the VFIO_IOMMU_MAP_DMA ioctl returns error,
    which makes the application failed to start.
    
    I observed one migration failure case (which is not easy to reproduce) is
    that, the 'thp_migration_fail' count is 1 and the 'thp_split_page_failed'
    count is also 1.
    
    That means when migrating a THP which is in CMA area, but can not allocate
    a new THP due to memory fragmentation, so it will split the THP.  However
    THP split is also failed, probably the reason is temporary reference count
    of this THP.  And the temporary reference count can be caused by dropping
    page caches (I observed the drop caches operation in the system), but we
    can not drop the shmem page caches due to they are already dirty at that
    time.
    
    Especially for THP split failure, which is caused by temporary reference
    count, we can try again to mitigate the failure of migration in this case
    according to previous discussion [1].
    
    [1] https://lore.kernel.org/all/470dc638-a300-f261-94b4-e27250e42f96@xxxxxxxxxx/
    Link: https://lkml.kernel.org/r/6784730480a1df82e8f4cba1ed088e4ac767994b.1666599848.git.baolin.wang@xxxxxxxxxxxxxxxxx
    Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
    Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
    Cc: Alistair Popple <apopple@xxxxxxxxxx>
    Cc: David Hildenbrand <david@xxxxxxxxxx>
    Cc: Yang Shi <shy828301@xxxxxxxxx>
    Cc: Zi Yan <ziy@xxxxxxxxxx>
    Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
    Stable-dep-of: 35e41024c4c2 ("vmscan,migrate: fix page count imbalance on node stats when demoting pages")
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 98a1a05f2db2d..f53bc54dacb37 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2728,7 +2728,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	 * split PMDs
 	 */
 	if (!can_split_folio(folio, &extra_pins)) {
-		ret = -EBUSY;
+		ret = -EAGAIN;
 		goto out_unlock;
 	}
 
@@ -2780,7 +2780,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			xas_unlock(&xas);
 		local_irq_enable();
 		remap_page(folio, folio_nr_pages(folio));
-		ret = -EBUSY;
+		ret = -EAGAIN;
 	}
 
 out_unlock:
diff --git a/mm/migrate.c b/mm/migrate.c
index 0252aa4ff572e..b0caa89e67d5f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1518,9 +1518,22 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				if (is_thp) {
 					nr_thp_failed++;
 					/* THP NUMA faulting doesn't split THP to retry. */
-					if (!nosplit && !try_split_thp(page, &thp_split_pages)) {
-						nr_thp_split++;
-						break;
+					if (!nosplit) {
+						int ret = try_split_thp(page, &thp_split_pages);
+
+						if (!ret) {
+							nr_thp_split++;
+							break;
+						} else if (reason == MR_LONGTERM_PIN &&
+							   ret == -EAGAIN) {
+							/*
+							 * Try again to split THP to mitigate
+							 * the failure of longterm pinning.
+							 */
+							thp_retry++;
+							nr_retry_pages += nr_subpages;
+							break;
+						}
 					}
 				} else if (!no_subpage_counting) {
 					nr_failed++;




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux