Re: [PATCH 1/2] mm/migrate.c: Fix return code when migration fails

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 07-08-23 22:31:52, Alistair Popple wrote:
> 
> Michal Hocko <mhocko@xxxxxxxx> writes:
> 
> > On Mon 07-08-23 16:39:44, Alistair Popple wrote:
> >> When a page fails to migrate move_pages() returns the error code in a
> >> per-page array of status values. The function call itself is also
> >> supposed to return a summary error code indicating that a failure
> >> occurred.
> >> 
> >> This doesn't always happen. Instead success can be returned even
> >> though some pages failed to migrate. This is due to incorrectly
> >> returning the error code from store_status() rather than the code from
> >> add_page_for_migration. Fix this by only returning an error from
> >> store_status() if the store actually failed.
> >
> > Error reporting by this syscall has been really far from
> > straightforward. Please read through a49bd4d71637 and the section "On a
> > side note". 
> > Is there any specific reason you are trying to address this now or is
> > this motivated by the code inspection?
> 
> Thanks Michal. There was no specific reason to address this now other
> than I came across this behaviour when updating the migration selftest
> to inspect the status array and thought it was odd. I was seeing pages
> had failed to migrate according to the status argument even though
> move_pages() had returned 0 (ie. success) rather than a number of
> non-migrated pages.

It is good to mention such a motivation in the changelog to make it
clear. Also do we have a specific test case which trigger this case?

> If I'm interpreting the side note correctly the behaviour you were
> concerned about was the opposite - returning a fail return code from
> move_pages() but not indicating failure in the status array.
> 
> That said I'm happy to leave the behaviour as is, although in that case
> an update to the man page is in order to clarify a return value of 0
> from move_pages() doesn't actually mean all pages were successfully
> migrated.

While I would say that it is better to let old dogs sleep I do not mind
changing the behavior and see whether anything breaks. I suspect nobody
except for couple of test cases hardcoded to the original behavior will
notice.

> >> Signed-off-by: Alistair Popple <apopple@xxxxxxxxxx>
> >> Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move")

The patch itself looks good. I am not sure the fixes tag is accurate.
Has the reporting been correct before this change? I didn't have time to
re-read the original code which was quite different.

Anyway
Acked-by: Michal Hocko <mhocko@xxxxxxxx>

Anyway rewriting this function to clarify the error handling would be a
nice exercise if somebody is interested.

> >> ---
> >>  mm/migrate.c | 4 +++-
> >>  1 file changed, 3 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/mm/migrate.c b/mm/migrate.c
> >> index 24baad2571e3..bb3a37245e13 100644
> >> --- a/mm/migrate.c
> >> +++ b/mm/migrate.c
> >> @@ -2222,7 +2222,9 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
> >>  		 * If the page is already on the target node (!err), store the
> >>  		 * node, otherwise, store the err.
> >>  		 */
> >> -		err = store_status(status, i, err ? : current_node, 1);
> >> +		err1 = store_status(status, i, err ? : current_node, 1);
> >> +		if (err1)
> >> +			err = err1;
> >>  		if (err)
> >>  			goto out_flush;
> >>  
> >> -- 
> >> 2.39.2

-- 
Michal Hocko
SUSE Labs




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux