Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 17-04-18 22:28:33, Li Wang wrote:
> On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@xxxxxxxx> wrote:
> 
> > On Tue 17-04-18 15:03:00, Michal Hocko wrote:
> > > On Tue 17-04-18 19:06:15, Li Wang wrote:
> > > [...]
> > > > diff --git a/mm/migrate.c b/mm/migrate.c
> > > > index f65dd69..2b315fc 100644
> > > > --- a/mm/migrate.c
> > > > +++ b/mm/migrate.c
> > > > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm,
> > nodemask_t task_nodes,
> > > >                     continue;
> > > >
> > > >             err = store_status(status, i, err, 1);
> > > > -           if (err)
> > > > +           if (!err)
> > > >                     goto out_flush;
> > >
> > > This change just doesn't make any sense to me. Why should we bail out if
> > > the store_status is successul? I am trying to wrap my head around the
> > > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to
> > > explain that move_pages has some semantic issues and the new
> > > implementation might be not 100% replacement. Anyway I am studying the
> > > test case to come up with a proper fix.
> >
> > OK, I get what the test cases does. I've failed to see the subtle
> > difference between alloc_pages_on_node and numa_alloc_onnode. The later
> > doesn't faul in anything.
> >
> > Why are we getting EPERM is quite not yet clear to me.
> > add_page_for_migration uses FOLL_DUMP which should return EFAULT on
> > zero pages (no_page_table()).
> >
> >         err = PTR_ERR(page);
> >         if (IS_ERR(page))
> >                 goto out;
> >
> > therefore bails out from add_page_for_migration and store_status should
> > store that value. There shouldn't be any EPERM on the way.
> >
> 
> Yes, I print the the return value and confirmed the
> add_page_for_migration()​
> do right things for zero page. and after store_status(...) the status saves
> -EFAULT.
> So I did the change above.

OK, I guess I knnow what is going on. I must be overwriting the status
on the way out by

out_flush:
	/* Make sure we do not overwrite the existing error */
	err1 = do_move_pages_to_node(mm, &pagelist, current_node);
	if (!err1)
		err1 = store_status(status, start, current_node, i - start);

This error handling is rather fragile and I was quite unhappy about it
at the time I was developing it. I have to remember all the details why
I've done it that way but I would bet my hat this is it. More on this
tomorrow.
-- 
Michal Hocko
SUSE Labs




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux