On 2020-01-15 at 13:23 Yang Shi wrote: > > >On 1/14/20 8:28 PM, Mike Kravetz wrote: >> On 1/14/20 5:24 PM, Yang Shi wrote: >>> >>> On 1/14/20 5:07 PM, Mike Kravetz wrote: >>>> On 1/14/20 6:09 AM, Li Xinhai wrote: >>>>> Add cc to >>>>> Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx> >>>>> Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> >>>>> , who has been worked on this part >>>>> >>>>> On 2020-01-14 at 17:16 Li Xinhai wrote: >>>>>> Checking MPOL_MF_STRICT is ignored for HUGETLB vma according to mbind man >>>>>> page: >>>>>> >>>>>> Notes >>>>>> MPOL_MF_STRICT is ignored on huge page mappings. >>>>>> >>>>>> If MPOL_MF_STRICT is specified alone without any MOVE flag, we should >>>>>> indicate, from test_walk, that walking this vma should be skipped even if >>>>>> there are misplaced pages. >>>>>> >>>>>> Signed-off-by: Li Xinhai <lixinhai.lxh@xxxxxxxxx> >>>>>> Cc: Michal Hocko <mhocko@xxxxxxxx> >>>>>> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx> >>>> I do not necessarily disagree with the change. However, this has made me >>>> question a couple things: >>>> 1) Why does the man page say MPOL_MF_STRICT is ignored on huge page mappings? >>>> - Is that leftover from the the days when huge page migration was not >>>> supported? >>>> - Is it just because huge page migration is more likely to fail than >>>> base page migration. >>>> 2) Does the mbind code function properly when unable to migrate a huge page >>>> MPOL_MF_STRICT is set? A quick look at the code looks like it returns >>>> EIO. for question (2), look into queue_pages_hugetlb(), the misplaced page would not cause -EIO reported, for both STRICT set alone and STRICT set with MOVE*; that means STRICT been effectively ignored during isolation phase. In unmap and move phase, -EIO is reported if failed to move page and STRICT is set. >>> I don't know the answer about question #1 I didn't dig into the history. The queue_pages_hugetlb() returns 0 unconditionally, I think this is what "MPOL_MF_STRICT is ignored on huge page mappings" means in code. >>> >>> It would return -EIO for base pages or THP as what the manpage describes. >>> >> I was thinking about a migration failure after isolation. This block of >> code in do_mbind() after queue_pages_range() and mbind_range(). >> >> if (!err) { >> int nr_failed = 0; >> >> if (!list_empty(&pagelist)) { >> WARN_ON_ONCE(flags & MPOL_MF_LAZY); >> nr_failed = migrate_pages(&pagelist, new_page, NULL, >> start, MIGRATE_SYNC, MR_MEMPOLICY_MBIND); >> if (nr_failed) >> putback_movable_pages(&pagelist); >> } >> >> if ((ret > 0) || (nr_failed && (flags & MPOL_MF_STRICT))) >> err = -EIO; > >Hmm.. I agree this part in man page does look ambiguous. We may assume >"MPOL_MF_STRICT is ignored on huge page mappings." implies if >MPOL_MF_STRICT is specified alone? If MOVE flag is specified it should >return -EIO if some pages could not be moved as what the man page describes. > It looks to me that current code has no feasible way to ignore STRICT flag for hugetlb page when failure happen in unmap&move phase, because mbind is about to handle multiple vma(i.e., hugetlb vma mixed with other vma) in one call. >I don't know what the intention was at the first place. We may have to >dig into the history. > >> > >