On 5 Nov 2021, at 6:17, Baolin Wang wrote: > On 2021/11/4 23:33, Zi Yan wrote: >> On 3 Nov 2021, at 6:51, Baolin Wang wrote: >> >>> As Zi Yan pointed out, the syscall move_pages() can return a non-migrated >>> number larger than the number of pages the users tried to migrate, when a >>> THP page is failed to migrate. This is confusing for users. >>> >>> Since other migration scenarios do not care about the actual non-migrated >>> number of pages except the memory compaction migration which will fix in >>> following patch. Thus we can change the return value to return the number >>> of {normal page, THP, hugetlb} instead to avoid this issue, meanwhile we >>> should still keep the migration counters using the number of normal pages. >>> >>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> >>> --- >>> mm/migrate.c | 18 ++++++++++-------- >>> 1 file changed, 10 insertions(+), 8 deletions(-) >>> >>> diff --git a/mm/migrate.c b/mm/migrate.c >>> index a11e948..00b8922 100644 >>> --- a/mm/migrate.c >>> +++ b/mm/migrate.c >>> @@ -1428,7 +1428,7 @@ static inline int try_split_thp(struct page *page, struct page **page2, >>> * @mode: The migration mode that specifies the constraints for >>> * page migration, if any. >>> * @reason: The reason for page migration. >>> - * @ret_succeeded: Set to the number of pages migrated successfully if >>> + * @ret_succeeded: Set to the number of normal pages migrated successfully if >>> * the caller passes a non-NULL pointer. >>> * >>> * The function returns after 10 attempts or if no pages are movable any more >>> @@ -1436,7 +1436,7 @@ static inline int try_split_thp(struct page *page, struct page **page2, >>> * It is caller's responsibility to call putback_movable_pages() to return pages >>> * to the LRU or free list only if ret != 0. >>> * >>> - * Returns the number of pages that were not migrated, or an error code. >>> + * Returns the number of {normal page, THP} that were not migrated, or an error code. >>> */ >>> int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> free_page_t put_new_page, unsigned long private, >>> @@ -1445,6 +1445,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> int retry = 1; >>> int thp_retry = 1; >>> int nr_failed = 0; >>> + int nr_failed_pages = 0; >>> int nr_succeeded = 0; >>> int nr_thp_succeeded = 0; >>> int nr_thp_failed = 0; >>> @@ -1517,7 +1518,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> } >>> >>> nr_thp_failed++; >>> - nr_failed += nr_subpages; >>> + nr_failed_pages += nr_subpages; >>> break; >>> } >>> >>> @@ -1537,7 +1538,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> } >>> >>> nr_thp_failed++; >>> - nr_failed += nr_subpages; >>> + nr_failed_pages += nr_subpages; >>> goto out; >>> } >>> nr_failed++; >>> @@ -1566,7 +1567,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> */ >>> if (is_thp) { >>> nr_thp_failed++; >>> - nr_failed += nr_subpages; >>> + nr_failed_pages += nr_subpages; >>> break; >>> } >>> nr_failed++; >>> @@ -1575,8 +1576,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> } >>> } >>> nr_failed += retry + thp_retry; >> >> This line can probably go away, since we do not want to count retried pages. > > OK. My bad, I misread the code. This should stay, since each -EAGIN does not increase nr_failed or nr_thp_failed and after the for loop, retry and thp_retry give the number of pages that fail to migrate after 10 retries. > >> >>> + nr_failed_pages += nr_failed; >>> nr_thp_failed += thp_retry; >>> - rc = nr_failed; >>> + rc = nr_failed + nr_thp_failed; >>> out: >>> /* >>> * Put the permanent failure page back to migration list, they >>> @@ -1585,11 +1587,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> list_splice(&ret_pages, from); >>> >>> count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); >>> - count_vm_events(PGMIGRATE_FAIL, nr_failed); >>> + count_vm_events(PGMIGRATE_FAIL, nr_failed_pages); >>> count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded); >>> count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed); >>> count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split); >>> - trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded, >>> + trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded, >>> nr_thp_failed, nr_thp_split, mode, reason); >>> >>> if (!swapwrite) >>> -- >>> 1.8.3.1 >> >> Thank you for the patch! >> >> In general, this looks good to me. But like you said in other email, when a THP fails to >> migrate and gets split, the number of nr_failed will still be inflated by the number of >> failed subpage migrations. What I can think of is to split THPs to a separate list and >> stop increasing nr_failed when the pages from the new list is under migration. Let me >> know how it sounds to you. > > Thanks for your patch, but I think it does not cover all the cases. > > Firstly, what confuses me is that if we return 1 to users when failed to migrate 1 THP page, but actually we may have migrated some normal pages successfaully if the THP page is split. Anyway we can add some comments for migrate_pages() to explain it if this is acceptable. Sure. > > Another concern about your patch is that, if the first round migration all pages are migrated successfaully (nr_failed = 0), but if failed to migrate the subpages of the THP in the second round, we will still return 0 to users, which is incorrect. Further more, if the Ah, I missed this. We should increase nr_thp_failed when THP is split. > subpages of the THP are migrated partially in the second round, what the number of non-migrated should be returned? Suppose multiple THP pages have been split? Assuming users do not know/care what kinds of pages are in their migration list, we can just return 1 if a THP is split, no matter how many subpages are migrated successfully. If they do want to know the details, they probably can check the tracepoints. > > Last concern is that, we will try to migrate subpages of the THP in the second round, but if some non-migrated pages are still remained in the 'from' list, we will do another redundant migration for these failed-migration pages and no failed counting for them. We can fix this issue by moving the remained non-migrated pages to 'ret_pages' list, which will put back to 'from' list when returning from this function. > > if (!list_empty(&thp_split_pages)) { > + list_splice(from, &ret_pages); > list_splice(&thp_split_pages, from); > no_failed_counting = true; > goto thp_subpage_migration; > } You are right. At this point, pages in the from list have been retried 10 times, no need to migrate them again. > > The concern 2 is more complicated, or we can just use 'nr_thp_split' to return if some subpages are failed to be migrated to simplify the case, no matter how many subpages are failed? > > rc = nr_failed + nr_thp_failed + nr_thp_split; Like I said above, I would just increase nr_thp_failed when a THP is split. Here is the patch I come up with based on your feedback above. This time I incorporated your patch above. Please let me know how it looks. Thanks. diff --git a/mm/migrate.c b/mm/migrate.c index 1852d787e6ab..a5fad22259a3 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1429,7 +1429,7 @@ static inline int try_split_thp(struct page *page, struct page **page2, * @mode: The migration mode that specifies the constraints for * page migration, if any. * @reason: The reason for page migration. - * @ret_succeeded: Set to the number of pages migrated successfully if + * @ret_succeeded: Set to the number of normal pages migrated successfully if * the caller passes a non-NULL pointer. * * The function returns after 10 attempts or if no pages are movable any more @@ -1437,7 +1437,7 @@ static inline int try_split_thp(struct page *page, struct page **page2, * It is caller's responsibility to call putback_movable_pages() to return pages * to the LRU or free list only if ret != 0. * - * Returns the number of pages that were not migrated, or an error code. + * Returns the number of {normal page, THP} that were not migrated, or an error code. */ int migrate_pages(struct list_head *from, new_page_t get_new_page, free_page_t put_new_page, unsigned long private, @@ -1446,6 +1446,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, int retry = 1; int thp_retry = 1; int nr_failed = 0; + int nr_failed_pages = 0; int nr_succeeded = 0; int nr_thp_succeeded = 0; int nr_thp_failed = 0; @@ -1457,13 +1458,16 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, int swapwrite = current->flags & PF_SWAPWRITE; int rc, nr_subpages; LIST_HEAD(ret_pages); + LIST_HEAD(thp_split_pages); bool nosplit = (reason == MR_NUMA_MISPLACED); + bool dont_count_failed_subpage = false; trace_mm_migrate_pages_start(mode, reason); if (!swapwrite) current->flags |= PF_SWAPWRITE; +thp_subpage_migration: for (pass = 0; pass < 10 && (retry || thp_retry); pass++) { retry = 0; thp_retry = 0; @@ -1512,18 +1516,21 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, case -ENOSYS: /* THP migration is unsupported */ if (is_thp) { - if (!try_split_thp(page, &page2, from)) { + nr_thp_failed++; + if (!try_split_thp(page, &page2, &thp_split_pages)) { nr_thp_split++; goto retry; } - nr_thp_failed++; - nr_failed += nr_subpages; + nr_failed_pages += nr_subpages; break; } /* Hugetlb migration is unsupported */ - nr_failed++; + if (!dont_count_failed_subpage) { + nr_failed++; + nr_failed_pages++; + } break; case -ENOMEM: /* @@ -1532,16 +1539,19 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, * THP NUMA faulting doesn't split THP to retry. */ if (is_thp && !nosplit) { - if (!try_split_thp(page, &page2, from)) { + nr_thp_failed++; + if (!try_split_thp(page, &page2, &thp_split_pages)) { nr_thp_split++; goto retry; } - nr_thp_failed++; - nr_failed += nr_subpages; + nr_failed_pages += nr_subpages; goto out; } - nr_failed++; + if (!dont_count_failed_subpage) { + nr_failed++; + nr_failed_pages++; + } goto out; case -EAGAIN: if (is_thp) { @@ -1567,17 +1577,34 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, */ if (is_thp) { nr_thp_failed++; - nr_failed += nr_subpages; + nr_failed_pages += nr_subpages; break; } - nr_failed++; + if (!dont_count_failed_subpage) { + nr_failed++; + nr_failed_pages++; + } break; } } } nr_failed += retry + thp_retry; nr_thp_failed += thp_retry; - rc = nr_failed; + /* + * try to migrate subpages of fail-to-migrate THPs, no nr_failed + * counting in this round, since all subpages of a THP is counted as + * 1 failure in the first round. + */ + if (!list_empty(&thp_split_pages)) { + /* + * move non-migrated pages (after 10 retries) to ret_pages to + * avoid migrating them again. + */ + list_splice(from, &ret_pages); + list_splice(&thp_split_pages, from); + dont_count_failed_subpage = true; + goto thp_subpage_migration; + } + + rc = nr_failed + nr_thp_failed; out: /* * Put the permanent failure page back to migration list, they @@ -1586,11 +1613,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, list_splice(&ret_pages, from); count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded); - count_vm_events(PGMIGRATE_FAIL, nr_failed); + count_vm_events(PGMIGRATE_FAIL, nr_failed_pages); count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded); count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed); count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split); - trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded, + trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded, nr_thp_failed, nr_thp_split, mode, reason); if (!swapwrite) -- Best Regards, Yan, Zi
Attachment:
signature.asc
Description: OpenPGP digital signature