Re: [RFC PATCH 1/3] mm: migrate: Fix the return value of migrate_pages()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5 Nov 2021, at 6:17, Baolin Wang wrote:

> On 2021/11/4 23:33, Zi Yan wrote:
>> On 3 Nov 2021, at 6:51, Baolin Wang wrote:
>>
>>> As Zi Yan pointed out, the syscall move_pages() can return a non-migrated
>>> number larger than the number of pages the users tried to migrate, when a
>>> THP page is failed to migrate. This is confusing for users.
>>>
>>> Since other migration scenarios do not care about the actual non-migrated
>>> number of pages except the memory compaction migration which will fix in
>>> following patch. Thus we can change the return value to return the number
>>> of {normal page, THP, hugetlb} instead to avoid this issue, meanwhile we
>>> should still keep the migration counters using the number of normal pages.
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
>>> ---
>>>   mm/migrate.c | 18 ++++++++++--------
>>>   1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index a11e948..00b8922 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1428,7 +1428,7 @@ static inline int try_split_thp(struct page *page, struct page **page2,
>>>    * @mode:		The migration mode that specifies the constraints for
>>>    *			page migration, if any.
>>>    * @reason:		The reason for page migration.
>>> - * @ret_succeeded:	Set to the number of pages migrated successfully if
>>> + * @ret_succeeded:	Set to the number of normal pages migrated successfully if
>>>    *			the caller passes a non-NULL pointer.
>>>    *
>>>    * The function returns after 10 attempts or if no pages are movable any more
>>> @@ -1436,7 +1436,7 @@ static inline int try_split_thp(struct page *page, struct page **page2,
>>>    * It is caller's responsibility to call putback_movable_pages() to return pages
>>>    * to the LRU or free list only if ret != 0.
>>>    *
>>> - * Returns the number of pages that were not migrated, or an error code.
>>> + * Returns the number of {normal page, THP} that were not migrated, or an error code.
>>>    */
>>>   int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   		free_page_t put_new_page, unsigned long private,
>>> @@ -1445,6 +1445,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   	int retry = 1;
>>>   	int thp_retry = 1;
>>>   	int nr_failed = 0;
>>> +	int nr_failed_pages = 0;
>>>   	int nr_succeeded = 0;
>>>   	int nr_thp_succeeded = 0;
>>>   	int nr_thp_failed = 0;
>>> @@ -1517,7 +1518,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   					}
>>>
>>>   					nr_thp_failed++;
>>> -					nr_failed += nr_subpages;
>>> +					nr_failed_pages += nr_subpages;
>>>   					break;
>>>   				}
>>>
>>> @@ -1537,7 +1538,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   					}
>>>
>>>   					nr_thp_failed++;
>>> -					nr_failed += nr_subpages;
>>> +					nr_failed_pages += nr_subpages;
>>>   					goto out;
>>>   				}
>>>   				nr_failed++;
>>> @@ -1566,7 +1567,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   				 */
>>>   				if (is_thp) {
>>>   					nr_thp_failed++;
>>> -					nr_failed += nr_subpages;
>>> +					nr_failed_pages += nr_subpages;
>>>   					break;
>>>   				}
>>>   				nr_failed++;
>>> @@ -1575,8 +1576,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   		}
>>>   	}
>>>   	nr_failed += retry + thp_retry;
>>
>> This line can probably go away, since we do not want to count retried pages.
>
> OK.

My bad, I misread the code. This should stay, since each -EAGIN does not
increase nr_failed or nr_thp_failed and after the for loop, retry and thp_retry
give the number of pages that fail to migrate after 10 retries.

>
>>
>>> +	nr_failed_pages += nr_failed;
>>>   	nr_thp_failed += thp_retry;
>>> -	rc = nr_failed;
>>> +	rc = nr_failed + nr_thp_failed;
>>>   out:
>>>   	/*
>>>   	 * Put the permanent failure page back to migration list, they
>>> @@ -1585,11 +1587,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   	list_splice(&ret_pages, from);
>>>
>>>   	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
>>> -	count_vm_events(PGMIGRATE_FAIL, nr_failed);
>>> +	count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
>>>   	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
>>>   	count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
>>>   	count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
>>> -	trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded,
>>> +	trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded,
>>>   			       nr_thp_failed, nr_thp_split, mode, reason);
>>>
>>>   	if (!swapwrite)
>>> -- 
>>> 1.8.3.1
>>
>> Thank you for the patch!
>>
>> In general, this looks good to me. But like you said in other email, when a THP fails to
>> migrate and gets split, the number of nr_failed will still be inflated by the number of
>> failed subpage migrations. What I can think of is to split THPs to a separate list and
>> stop increasing nr_failed when the pages from the new list is under migration. Let me
>> know how it sounds to you.
>
> Thanks for your patch, but I think it does not cover all the cases.
>
> Firstly, what confuses me is that if we return 1 to users when failed to migrate 1 THP page, but actually we may have migrated some normal pages successfaully if the THP page is split. Anyway we can add some comments for migrate_pages() to explain it if this is acceptable.

Sure.

>
> Another concern about your patch is that, if the first round migration all pages are migrated successfaully (nr_failed = 0), but if failed to migrate the subpages of the THP in the second round, we will still return 0 to users, which is incorrect. Further more, if the

Ah, I missed this. We should increase nr_thp_failed when THP is split.

> subpages of the THP are migrated partially in the second round, what the number of non-migrated should be returned?  Suppose multiple THP pages have been split?

Assuming users do not know/care what kinds of pages are in their migration list, we can
just return 1 if a THP is split, no matter how many subpages are migrated successfully.
If they do want to know the details, they probably can check the tracepoints.

>
> Last concern is that, we will try to migrate subpages of the THP in the second round, but if some non-migrated pages are still remained in the 'from' list, we will do another redundant migration for these failed-migration pages and no failed counting for them. We can fix this issue by moving the remained non-migrated pages to 'ret_pages' list, which will put back to 'from' list when returning from this function.
>
> if (!list_empty(&thp_split_pages)) {
> +	list_splice(from, &ret_pages);
> 	list_splice(&thp_split_pages, from);
> 	no_failed_counting = true;
> 	goto thp_subpage_migration;
> }

You are right. At this point, pages in the from list have been retried 10 times,
no need to migrate them again.

>
> The concern 2 is more complicated, or we can just use 'nr_thp_split' to return if some subpages are failed to be migrated to simplify the case, no matter how many subpages are failed?
>
> rc = nr_failed + nr_thp_failed + nr_thp_split;

Like I said above, I would just increase nr_thp_failed when a THP is split. Here is the patch
I come up with based on your feedback above. This time I incorporated your patch above. Please
let me know how it looks. Thanks.

diff --git a/mm/migrate.c b/mm/migrate.c
index 1852d787e6ab..a5fad22259a3 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1429,7 +1429,7 @@ static inline int try_split_thp(struct page *page, struct page **page2,
  * @mode:              The migration mode that specifies the constraints for
  *                     page migration, if any.
  * @reason:            The reason for page migration.
- * @ret_succeeded:     Set to the number of pages migrated successfully if
+ * @ret_succeeded:     Set to the number of normal pages migrated successfully if
  *                     the caller passes a non-NULL pointer.
  *
  * The function returns after 10 attempts or if no pages are movable any more
@@ -1437,7 +1437,7 @@ static inline int try_split_thp(struct page *page, struct page **page2,
  * It is caller's responsibility to call putback_movable_pages() to return pages
  * to the LRU or free list only if ret != 0.
  *
- * Returns the number of pages that were not migrated, or an error code.
+ * Returns the number of {normal page, THP} that were not migrated, or an error code.
  */
 int migrate_pages(struct list_head *from, new_page_t get_new_page,
                free_page_t put_new_page, unsigned long private,
@@ -1446,6 +1446,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
        int retry = 1;
        int thp_retry = 1;
        int nr_failed = 0;
+       int nr_failed_pages = 0;
        int nr_succeeded = 0;
        int nr_thp_succeeded = 0;
        int nr_thp_failed = 0;
@@ -1457,13 +1458,16 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
        int swapwrite = current->flags & PF_SWAPWRITE;
        int rc, nr_subpages;
        LIST_HEAD(ret_pages);
+       LIST_HEAD(thp_split_pages);
        bool nosplit = (reason == MR_NUMA_MISPLACED);
+       bool dont_count_failed_subpage = false;

        trace_mm_migrate_pages_start(mode, reason);

        if (!swapwrite)
                current->flags |= PF_SWAPWRITE;

+thp_subpage_migration:
        for (pass = 0; pass < 10 && (retry || thp_retry); pass++) {
                retry = 0;
                thp_retry = 0;
@@ -1512,18 +1516,21 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
                        case -ENOSYS:
                                /* THP migration is unsupported */
                                if (is_thp) {
-                                       if (!try_split_thp(page, &page2, from)) {
+                                       nr_thp_failed++;
+                                       if (!try_split_thp(page, &page2, &thp_split_pages)) {
                                                nr_thp_split++;
                                                goto retry;
                                        }

-                                       nr_thp_failed++;
-                                       nr_failed += nr_subpages;
+                                       nr_failed_pages += nr_subpages;
                                        break;
                                }

                                /* Hugetlb migration is unsupported */
-                               nr_failed++;
+                               if (!dont_count_failed_subpage) {
+                                       nr_failed++;
+                                       nr_failed_pages++;
+                               }
                                break;
                        case -ENOMEM:
                                /*
@@ -1532,16 +1539,19 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
                                 * THP NUMA faulting doesn't split THP to retry.
                                 */
                                if (is_thp && !nosplit) {
-                                       if (!try_split_thp(page, &page2, from)) {
+                                       nr_thp_failed++;
+                                       if (!try_split_thp(page, &page2, &thp_split_pages)) {
                                                nr_thp_split++;
                                                goto retry;
                                        }

-                                       nr_thp_failed++;
-                                       nr_failed += nr_subpages;
+                                       nr_failed_pages += nr_subpages;
                                        goto out;
                                }
-                               nr_failed++;
+                               if (!dont_count_failed_subpage) {
+                                       nr_failed++;
+                                       nr_failed_pages++;
+                               }
                                goto out;
                        case -EAGAIN:
                                if (is_thp) {
@@ -1567,17 +1577,34 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
                                 */
                                if (is_thp) {
                                        nr_thp_failed++;
-                                       nr_failed += nr_subpages;
+                                       nr_failed_pages += nr_subpages;
                                        break;
                                }
-                               nr_failed++;
+                               if (!dont_count_failed_subpage) {
+                                       nr_failed++;
+                                       nr_failed_pages++;
+                               }
                                break;
                        }
                }
        }
        nr_failed += retry + thp_retry;
        nr_thp_failed += thp_retry;
-       rc = nr_failed;
+       /*
+        * try to migrate subpages of fail-to-migrate THPs, no nr_failed
+        * counting in this round, since all subpages of a THP is counted as
+        * 1 failure in the first round.
+        */
+       if (!list_empty(&thp_split_pages)) {
+               /*
+                * move non-migrated pages (after 10 retries) to ret_pages to
+                * avoid migrating them again.
+                */
+               list_splice(from, &ret_pages);
+               list_splice(&thp_split_pages, from);
+               dont_count_failed_subpage = true;
+               goto thp_subpage_migration;
+       }
+
+       rc = nr_failed + nr_thp_failed;
 out:
        /*
         * Put the permanent failure page back to migration list, they
@@ -1586,11 +1613,11 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
        list_splice(&ret_pages, from);

        count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
-       count_vm_events(PGMIGRATE_FAIL, nr_failed);
+       count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
        count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
        count_vm_events(THP_MIGRATION_FAIL, nr_thp_failed);
        count_vm_events(THP_MIGRATION_SPLIT, nr_thp_split);
-       trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded,
+       trace_mm_migrate_pages(nr_succeeded, nr_failed_pages, nr_thp_succeeded,
                               nr_thp_failed, nr_thp_split, mode, reason);

        if (!swapwrite)

--
Best Regards,
Yan, Zi

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux