On 2020-03-29 at 11:53 John Hubbard wrote:
>On 3/28/20 8:08 PM, Li Xinhai wrote:
>> In the current code, the vma-related calls on hugetlb mappings, except
>> mmap, all treat a not correctly aligned length as an invalid parameter,
>> including mprotect, munmap, mlock, etc., by checking through
>> hugetlb_vm_op_split. So the user will see a failure after a successful
>> mmap, although the same length parameter is passed to the other mapping
>> syscalls.
>>
>> It is desirable for all hugetlb mapping calls to have consistent
>> behavior, without mmap as an exception (which rounds the length up to
>> the underlying hugepage size). The current description in
>> Documentation/admin-guide/mm/hugetlbpage.rst is:
>> "
>> Syscalls that operate on memory backed by hugetlb pages only have their
>> lengths aligned to the native page size of the processor; they will
>> normally fail with errno set to EINVAL or exclude hugetlb pages that
>> extend beyond the length if not hugepage aligned. For example, munmap(2)
>> will fail if memory is backed by a hugetlb page and the length is smaller
>> than the hugepage size.
>> "
>> which expresses the consistent behavior.
>
>
>Missing here is a description of what the patch actually does...
>

Right, a statement like this can be added:
"
After this patch, all hugetlb mapping related syscalls will only align the
length parameter to the native page size of the processor. For mmap(),
hugetlb_get_unmapped_area() will set errno to EINVAL if the length is not
aligned to the underlying hugepage size.
"

>>
>> Signed-off-by: Li Xinhai <lixinhai.lxh@xxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
>> ---
>> changes:
>> 0. patch which introduced a new flag for mmap()
>>    The new flag should be avoided.
>>    https://lore.kernel.org/linux-mm/1585313944-8627-1-git-send-email-lixinhai.lxh@xxxxxxxxx/
>>
>>  mm/mmap.c | 8 --------
>>  1 file changed, 8 deletions(-)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index d681a20..b2aa102 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1560,20 +1560,12 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
>>  		file = fget(fd);
>>  		if (!file)
>>  			return -EBADF;
>> -		if (is_file_hugepages(file))
>> -			len = ALIGN(len, huge_page_size(hstate_file(file)));
>
>
>...and it looks like this is simply removing the forced alignment. And not adding
>any error case for non-aligned cases. So now I'm not immediately sure what happens if a
>non-aligned address is passed in.
>
>I would have expected to see either error checking or an ALIGN call here, but now both
>are gone, so I'm lost and confused. :)
>

After this patch, alignment is only to the "native page size of the
processor", as done in do_mmap(). Then, following the code path, the length
is checked further by hugetlb_get_unmapped_area() against the underlying
hugepage size. (A small userspace sketch of the resulting behavior is
appended at the end of this mail.)

>
>thanks,
>--
>John Hubbard
>NVIDIA
>

>>  		retval = -EINVAL;
>>  		if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
>>  			goto out_fput;
>>  	} else if (flags & MAP_HUGETLB) {
>>  		struct user_struct *user = NULL;
>> -		struct hstate *hs;
>>
>> -		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
>> -		if (!hs)
>> -			return -EINVAL;
>> -
>> -		len = ALIGN(len, huge_page_size(hs));
>>  		/*
>>  		 * VM_NORESERVE is used because the reservations will be
>>  		 * taken when vm_ops->mmap() is called
>>
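
To make the behavior difference concrete, below is a hypothetical userspace
test (not part of the patch). It assumes a 2MB default hugepage size and
that hugetlb pages are available in the pool (vm.nr_hugepages > 0); adjust
the length for other configurations. On a kernel without the patch, mmap()
silently rounds the length up and succeeds while munmap() with the same
length fails with EINVAL; with the patch, mmap() itself is expected to fail
with EINVAL, matching the existing hugepage-alignment check in
hugetlb_get_unmapped_area().

#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	/* page-aligned but not hugepage-aligned (assumes 2MB hugepages) */
	size_t len = 2 * 1024 * 1024 + 4096;
	void *p;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED) {
		/* expected result with the patch applied */
		printf("mmap failed: %s\n", strerror(errno));
		return 0;
	}
	printf("mmap succeeded with unaligned len=%zu\n", len);

	/* on unpatched kernels this fails via hugetlb_vm_op_split */
	if (munmap(p, len))
		printf("munmap with same len failed: %s\n", strerror(errno));
	else
		printf("munmap succeeded\n");

	return 0;
}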