Re: [PATCH v2] mm/hugetlb: fix race when migrate pages

Michal Hocko <mhocko@xxxxxxx> · Wed, 20 Jul 2016 14:45:02 +0200



On Wed 20-07-16 14:16:45, Michal Hocko wrote:
> On Wed 20-07-16 18:03:40, zhong jiang wrote:
> > On 2016/7/20 15:38, Michal Hocko wrote:
> > > [CC Mike and Naoya]
> > > On Tue 19-07-16 21:45:58, zhongjiang wrote:
> > >> From: zhong jiang <zhongjiang@xxxxxxxxxx>
> > >>
> > >> I hit the following code in huge_pte_alloc when run the database and
> > >> online-offline memory in the system.
> > >>
> > >> BUG_ON(pte && !pte_none(*pte) && !pte_huge(*pte));
> > >>
> > >> when pmd share function enable, we may be obtain a shared pmd entry.
> > >> due to ongoing offline memory , the pmd entry points to the page will
> > >> turn into migrate condition. therefore, the bug will come up.
> > >>
> > >> The patch fix it by checking the pmd entry when we obtain the lock.
> > >> if the shared pmd entry points to page is under migration. we should
> > >> allocate a new pmd entry.
> > >
> > > I am still not 100% sure this is correct. Does huge_pte_lockptr work
> > > properly for the migration swapentry?
> 
> What about this part?
> 
> > > If yes and we populate the pud
> > > with a migration entry then is it really bad/harmful (other than hitting
> > > the BUG_ON which might be update to handle that case)? This might be a
> > > stupid question, sorry about that, but I have really problem to grasp
> > > the whole issue properly and the changelog didn't help me much. I would
> > > really appreciate some clarification here. The pmd sharing code is clear
> > > as mud and adding new tweaks there doesn't sound like it would make it
> > > more clear.
> >
> >     ok, Maybe the following explain will better.
> >     cpu0                                                                      cpu1
> >     try_to_unmap_one                                 huge_pmd_share                             
> >         page_check_address                                 huge_pte_lockptr
> >                       spin_lock                                                        
> >        (page entry can be set to migrate or
> >        Posion )      
> >  
> >          pte_unmap_unlock
> >                                                                             spin_lock
> >                                                                             (page entry have changed)
> 
> Well, but this is just one possible race. We can also race after
> pud_populate:
> 
> cpu0				cpu1
> try_to_unmap_on			huge_pmd_share
>   page_check_address		  huge_pte_lockptr
>   				    spin_lock
> 				    pud_populate
> 				    spin_unlock
>   spin_lock(ptl)
>   set_migration_entry
>   spun_unlock()
>   				  pmd_alloc # we will get migration entry
> 
> So unless I am missing something here we either have to be able to cope
> with migration entries somehow already or we need some additional care
> for shared pmd entries (aka shared pud page) for this to behave
> properly.

I was talking to Mel (CCed) and he has raised a good question. So if you
encounter a page under migration and fail to share the pmd with it how
can you have a proper content of the target page in the end? So not only
the patch doesn't catch the other race I believe it can lead to a
corruption (which is my fault because your previous version did retry
rather than fail). But anyway the primary question is whether seeing
migration entry here is really a problem and just the BUG_ON needs to be
fixed.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>