Re: [PATCH 1/2] mm/migrate.c: Fix return code when migration fails

Alistair Popple <apopple@xxxxxxxxxx> writes:

> Michal Hocko <mhocko@xxxxxxxx> writes:
>
>> On Mon 07-08-23 22:31:52, Alistair Popple wrote:
>>> 
>>> Michal Hocko <mhocko@xxxxxxxx> writes:
>>> 
>>> > On Mon 07-08-23 16:39:44, Alistair Popple wrote:
>>> >> When a page fails to migrate move_pages() returns the error code in a
>>> >> per-page array of status values. The function call itself is also
>>> >> supposed to return a summary error code indicating that a failure
>>> >> occurred.
>>> >> 
>>> >> This doesn't always happen. Instead success can be returned even
>>> >> though some pages failed to migrate. This is due to incorrectly
>>> >> returning the error code from store_status() rather than the code from
>>> >> add_page_for_migration. Fix this by only returning an error from
>>> >> store_status() if the store actually failed.
>>> >
>>> > Error reporting by this syscall has been really far from
>>> > straightforward. Please read through a49bd4d71637 and the section "On a
>>> > side note". 
>>> > Is there any specific reason you are trying to address this now or is
>>> > this motivated by the code inspection?
>>> 
>>> Thanks Michal. There was no specific reason to address this now other
>>> than I came across this behaviour when updating the migration selftest
>>> to inspect the status array and thought it was odd. I was seeing pages
>>> had failed to migrate according to the status argument even though
>>> move_pages() had returned 0 (i.e. success) rather than a number of
>>> non-migrated pages.
>>
>> It is good to mention such a motivation in the changelog to make it
>> clear. Also, do we have a specific test case which triggers this case?
>
> Not explicitly/reliably although I could write one.
>
>>> If I'm interpreting the side note correctly the behaviour you were
>>> concerned about was the opposite - returning a fail return code from
>>> move_pages() but not indicating failure in the status array.

IIUC, we cannot avoid this even if we leave the code untouched.  In
move_pages_and_store_status(), if some pages fail to be migrated, we
will not store their status.  In fact, even in the kernel we don't know
which pages failed to be migrated.

>>> That said I'm happy to leave the behaviour as is, although in that case
>>> an update to the man page is in order to clarify a return value of 0
>>> from move_pages() doesn't actually mean all pages were successfully
>>> migrated.

IMHO, the return value is more important than "status", for the reason I
mentioned above.  At least we can make one thing correct.
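
To illustrate what this means for a userspace caller, here is a minimal
sketch, assuming the semantics documented in move_pages(2): a
non-negative status entry is a node number and a negative entry is a
negated errno value.  The helper names count_failed_pages() and
all_pages_migrated() are made up for this example; they are not part of
libnuma or the selftest.  As noted above, the status array may not be
filled in when the call itself reports a failure, so the array alone is
not enough.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the entries in a move_pages() status array
 * that report a failure (negative value, i.e. a negated errno).
 */
static size_t count_failed_pages(const int *status, size_t count)
{
	size_t failed = 0;

	for (size_t i = 0; i < count; i++)
		if (status[i] < 0)
			failed++;
	return failed;
}

/*
 * Hypothetical helper: treat the call as fully successful only if the
 * return value is 0 AND no status entry reports an error.  "ret" is the
 * value returned by move_pages().
 */
static int all_pages_migrated(long ret, const int *status, size_t count)
{
	return ret == 0 && count_failed_pages(status, count) == 0;
}
```

A caller would pass the return value of move_pages() together with the
status array it supplied, and treat a false result as "inspect further"
rather than as a precise failure count.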

>> While I would say that it is better to let old dogs sleep I do not mind
>> changing the behavior and see whether anything breaks. I suspect nobody
>> except for a couple of test cases hardcoded to the original behavior will
>> notice.
>>
>>> >> Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx>
>>> >> Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move")
>>
>> The patch itself looks good. I am not sure the fixes tag is accurate.
>> Has the reporting been correct before this change? I didn't have time to
>> re-read the original code which was quite different.
>
> I dug deeper into the history and the fixes tag is wrong. The behaviour
> was actually introduced way back in commit e78bbfa82624 ("mm: stop
> returning -ENOENT from sys_move_pages() if nothing got migrated"). As
> you may guess from the title it was intentional, so I suspect it is better
> to update documentation.

Can we change the code but keep the -ENOENT behavior as a special
case?
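
One possible reading of that, as a standalone model rather than kernel
code: a rough sketch, assuming the per-page error and the result of
writing the status slot are combined in one place.  summary_err() is a
hypothetical name, and page_err and store_err stand in for what
add_page_for_migration() and store_status() return in the patch.

```c
#include <errno.h>

/*
 * Hypothetical model, not the kernel implementation: pick the summary
 * error for one page from the per-page result and the status-store result.
 */
static int summary_err(int page_err, int store_err)
{
	if (store_err)
		return store_err;	/* writing the status slot itself failed */
	if (page_err == -ENOENT)
		return 0;		/* special case: keep the old -ENOENT behavior */
	return page_err;		/* otherwise propagate the per-page failure */
}
```

With this shape, -ENOENT is still reported only through the status
array, while other per-page failures reach the return value.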

--
Best Regards,
Huang, Ying

>> Anyway
>> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>
> Thanks for looking, but I will drop this and see if I can get the man
> page updated.
>
>> Anyway rewriting this function to clarify the error handling would be a
>> nice exercise if somebody is interested.
>
> Yeah, every time I look at this function I want to do that but haven't
> yet found the time.
>
>>> >> ---
>>> >>  mm/migrate.c | 4 +++-
>>> >>  1 file changed, 3 insertions(+), 1 deletion(-)
>>> >> 
>>> >> diff --git a/mm/migrate.c b/mm/migrate.c
>>> >> index 24baad2571e3..bb3a37245e13 100644
>>> >> --- a/mm/migrate.c
>>> >> +++ b/mm/migrate.c
>>> >> @@ -2222,7 +2222,9 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
>>> >>  		 * If the page is already on the target node (!err), store the
>>> >>  		 * node, otherwise, store the err.
>>> >>  		 */
>>> >> -		err = store_status(status, i, err ? : current_node, 1);
>>> >> +		err1 = store_status(status, i, err ? : current_node, 1);
>>> >> +		if (err1)
>>> >> +			err = err1;
>>> >>  		if (err)
>>> >>  			goto out_flush;
>>> >>  
>>> >> -- 
>>> >> 2.39.2



