On 2019-11-14 at 17:30 Naoya Horiguchi wrote: >On Fri, Nov 08, 2019 at 09:01:44PM +0800, Li Xinhai wrote: >> mbind() is required to report EFAULT if range, specified by addr and len, >> contains unmapped holes. In current implementation, below rules are applied >> for this checking: >> 1 Unmapped holes at any part of the specified range should be reported as >> EFAULT if mbind() for none MPOL_DEFAULT cases; >> 2 Unmapped holes at any part of the specified range should be ignored (do >> not reprot EFAULT) if mbind() for MPOL_DEFAULT case; >> 3 The whole range in an unmapped hole should be reported as EFAULT; >> Note that rule 2 does not fullfill the mbind() API definition, but since >> that behavior has existed for long days (the internal flag >> MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to >> change it. >> >> In current code, application observed inconsistent behavior on rule 1 and >> rule 2 respectively. That inconsistency is fixed as below details. >> >> Cases of rule 1: >> 1) Hole at head side of range. Current code reprot EFAULT, no change by >> this patch. >> [ vma ][ hole ][ vma ] >> [ range ] >> 2) Hole at middle of range. Current code report EFAULT, no change by >> this patch. >> [ vma ][ hole ][ vma ] >> [ range ] >> 3) Hole at tail side of range. Current code do not report EFAULT, this >> patch fix it. >> [ vma ][ hole ][ vma ] >> [ range ] >> >> Cases of rule 2: >> 1) Hole at head side of range. Current code reprot EFAULT, this patch >> fix it. >> [ vma ][ hole ][ vma ] >> [ range ] >> 2) Hole at middle of range. Current code do not report EFAULT, no change >> by this patch. >> this patch. >> [ vma ][ hole ][ vma] >> [ range ] >> 3) Hole at tail side of range. Current code do not report EFAULT, no >> change by this patch. >> [ vma ][ hole ][ vma] >> [ range ] >> >> This patch has no changes to rule 3. >> >> The unmapped hole checking can also be handled by using .pte_hole(), >> instead of .test_walk(). But .pte_hole() is called for holes inside and >> outside vma, which causes more cost, so this patch keeps the original >> design with .test_walk(). >> >> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") >> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> >> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> >> Cc: Michal Hocko <mhocko@xxxxxxxx> >> Cc: Vlastimil Babka <vbabka@xxxxxxx> >> Cc: Hugh Dickins <hughd@xxxxxxxxxx> >> Cc: linux-man <linux-man@xxxxxxxxxxxxxxx> >> Signed-off-by: Li Xinhai <lixinhai.lxh@xxxxxxxxx> >> --- >> mm/mempolicy.c | 40 +++++++++++++++++++++++++++------------- >> 1 file changed, 27 insertions(+), 13 deletions(-) >> >> diff --git a/mm/mempolicy.c b/mm/mempolicy.c >> index 807f06f..c697b29 100644 >> --- a/mm/mempolicy.c >> +++ b/mm/mempolicy.c >> @@ -410,7 +410,9 @@ struct queue_pages { >> struct list_head *pagelist; >> unsigned long flags; >> nodemask_t *nmask; >> - struct vm_area_struct *prev; >> + unsigned long start; >> + unsigned long end; >> + struct vm_area_struct *first; >> }; >> >> /* >> @@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, >> unsigned long flags = qp->flags; >> >> /* range check first */ >> - if (!(flags & MPOL_MF_DISCONTIG_OK)) { >> - if (!vma->vm_next && vma->vm_end < end) >> - return -EFAULT; >> - if (qp->prev && qp->prev->vm_end < vma->vm_start) >> + VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end)); >> + >> + if (!qp->first) { >> + qp->first = vma; >> + if (!(flags & MPOL_MF_DISCONTIG_OK) && >> + (qp->start < vma->vm_start)) >> + /* hole at head side of range */ >> return -EFAULT; >> } >> - >> - qp->prev = vma; >> + if (!(flags & MPOL_MF_DISCONTIG_OK) && >> + ((vma->vm_end < qp->end) && > >You here have a trailing whitespace. > >Otherwise, looks good to me. > >Reviewed-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx> Thanks for review, I'v sent out new patch with same title - Xinhai