Re: + mm-hugetlb-fix-race-when-migrate-pages.patch added to -mm tree

On Thu, Jul 21, 2016 at 10:33:47PM +0800, zhong jiang wrote:
> On 2016/7/21 22:27, Michal Hocko wrote:
> > On Thu 21-07-16 22:13:55, zhong jiang wrote:
> >> On 2016/7/21 22:01, Michal Hocko wrote:
> >>> On Thu 21-07-16 21:58:23, zhong jiang wrote:
> >>>> On 2016/7/21 21:40, Michal Hocko wrote:
> >>>>> On Thu 21-07-16 21:25:38, zhong jiang wrote:
> >>>>>> On 2016/7/21 20:55, Michal Hocko wrote:
> >>>>> [...]
> >>>>>>> OK, now I understand what you mean. So you mean that a different process
> >>>>>>> initiates the migration while this path copies to pte. That is certainly
> >>>>>>> possible but I still fail to see what is the problem about that.
> >>>>>>> huge_pte_alloc will return the identical pte whether it is regular or
> >>>>>>> migration one. So what exactly is the problem?
> >>>>>>>
> >>>>>> copy_hugetlb_page_range obtains the shared dst_pte, which may not be
> >>>>>> equal to the src_pte.  The dst_pte can come from another process
> >>>>>> sharing the mapping.
> >>>>> So you mean that the parent doesn't have the shared pte while the child
> >>>>> would get one?
> >>>>>  
> >>>>  No, the parent must have the shared pte because the child copies the
> >>>> parent. But the parent is not the only source of a pte we can get. When
> >>>> we scan mapping->i_mmap, it can first obtain a shared pte from
> >>>> another process. But I am not sure.
> >>> But then all the shared ptes should be identical, no? Or am I missing
> >>> something?
> >>  All the shared ptes should be identical, but there is a possibility that a new process
> >>  wants to share the pte from another process, other than the parent, when the process
> >>  is about to share a pte for the first time. Is that possible?
> > I do not see how. They are operating on the same mapping so I really do
> > not see how a different process makes any difference.
> >
>    OK, in a word: the new process gets the shared pte, and the shared pte does not come from the parent process.
>   So src_pte is not equal to dst_pte, because src_pte comes from the parent while dst_pte comes from
>   another process. Obviously, they are not the same.

I think that (src_pte != dst_pte) can happen, and that's OK if there's no
migration entry.  But even if we have both a normal entry and a migration entry
for one hugepage, that still looks fine to me, because the running migration
operation fails (mapcounts remain on the source hugepage) and all migration
entries are turned back into normal entries pointing to the source hugepage.
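
To make that argument concrete, here is a minimal userspace sketch in C. It
is a toy model under stated assumptions, not kernel code: toy_entry,
toy_hugepage and toy_migrate are hypothetical names invented for
illustration, and "mapcount" is modeled simply as the number of normal
entries. It only shows the idea that a leftover normal entry keeps the
mapcount non-zero, so the migration attempt fails and every migration entry
is turned back into a normal one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the kernel. */
enum toy_entry { TOY_NORMAL, TOY_MIGRATION };

struct toy_hugepage {
	int mapcount;	/* number of TOY_NORMAL entries mapping the page */
};

/*
 * Attempt to "migrate" the page. Only the first nr_reached entries are
 * converted to migration entries; the rest stand for mappings that the
 * migration path did not reach (an assumption of this sketch).
 * Returns true if the page could have been moved.
 */
bool toy_migrate(struct toy_hugepage *page, enum toy_entry *entries,
		 size_t nr_entries, size_t nr_reached)
{
	size_t i;
	bool moved;

	/* Unmap phase: normal entry -> migration entry, dropping a mapcount. */
	for (i = 0; i < nr_reached && i < nr_entries; i++) {
		if (entries[i] == TOY_NORMAL) {
			entries[i] = TOY_MIGRATION;
			page->mapcount--;
		}
	}

	/* Any remaining mapcount means the source page is still mapped. */
	moved = (page->mapcount == 0);

	/*
	 * Restore phase: every migration entry becomes a normal entry again.
	 * On failure these point back at the source page; on success they
	 * would point at the destination page, which is not modeled here.
	 */
	for (i = 0; i < nr_entries; i++) {
		if (entries[i] == TOY_MIGRATION) {
			entries[i] = TOY_NORMAL;
			page->mapcount++;
		}
	}

	return moved;
}
```

With three normal entries and nr_reached == 2, toy_migrate() returns false
and leaves all three entries normal with the mapcount unchanged, which is
the "still looks fine" outcome described above.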

Could you try to see and share what happens on your workload with Michal's patch?
If something weird/critical still happens, let's merge your patch.
# I'm trying to write some test cases for it, but it might take some time ...

Thanks,
Naoya Horiguchi