On 8/18/21 4:35 PM, Mina Almasry wrote:
> On Fri, Aug 13, 2021 at 4:40 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>> Earlier in mremap code, the following lines exist:
>>
>>         old_len = PAGE_ALIGN(old_len);
>>         new_len = PAGE_ALIGN(new_len);
>>
>> So, the passed length values are page aligned. This allows 'sloppy'
>> values to be passed by users.
>>
>> Should we do the same for hugetlb mappings? In mmap we have different
>> requirements for hugetlb mappings:
>>
>> " Huge page (Huge TLB) mappings
>>   For mappings that employ huge pages, the requirements for the arguments
>>   of mmap() and munmap() differ somewhat from the requirements for
>>   mappings that use the native system page size.
>>
>>   For mmap(), offset must be a multiple of the underlying huge page size.
>>   The system automatically aligns length to be a multiple of the
>>   underlying huge page size.
>>
>>   For munmap(), addr and length must both be a multiple of the underlying
>>   huge page size.
>> "
>>
>> I actually wish arguments for hugetlb mappings would be treated the same
>> as for base page size mappings. We can not change mmap as legacy code
>> may depend on the different requirements. Since mremap for hugetlb is
>> new, should we treat arguments for hugetlb mappings the same as for base
>> pages (align to huge page boundary)? My vote is yes, but it would be
>> good to get other opinions.
>>
>> If we do not align for hugetlb mappings as we do for base page mappings,
>> then this will also need to be documented.
>>
>> Another question,
>> Should we possibly check addr and new_addr alignment here as well?
>> addr has been previously checked for PAGE alignment and new_addr is
>> checked for PAGE alignment at the beginning of mremap_to().
>>
>
> I'll yield to whatever you decide here because I reckon you have much
> more experience and better judgement here. But my thoughts:
>
> 'Sane' usage of mremap() is something like:
> 1. mmap() a hugetlbfs vma.
> 2. Pass the vma received from step (1) to mremap() to remap it to a
> different location.
>
> I don't know if there is another usage pattern I need to worry about
> but given the above, old_addr and old_len will be hugepage aligned
> already since they are values returned by the previous mmap() call
> which aligns them, no? So, I think aligning old_addr and old_len to
> the hugepage boundary is fine.
>
> With this support we don't allow mremap() expansion. In my use case
> old_len==new_len actually. I think it's fine to also align new_len to
> the hugepage boundary.
>
> I already have this code that errors out if the lengths are not aligned:
>
>         if (old_len & ~huge_page_mask(h) || new_len & ~huge_page_mask(h))
>                 goto out;
>
> I think aligning new_addr breaks my use case though. In my use case
> new_addr is the start of the text segment in the ELF executable, and I
> don't think that's guaranteed to be anything but page aligned.
> Aligning new_addr seems like it would break my use case.

That is interesting.

I assumed there was hugetlb code written under the assumption that
vmas/mappings were always huge page aligned. I thought the code would
fall over quite quickly if a vma was not huge page aligned. Your use
case/statement above surprised me.

So, I took your provided test case (V3 patch) and tried to make the
destination address non-huge page aligned: just page aligned. In every
case, mremap would fail. The routine hugetlb_get_unmapped_area()
required huge page alignment. Not sure how this works for you?

> Aligning new_addr seems like it would break my use case. If you insist
> though I'm happy aligning new_addr in the upstream kernel and not
> doing that in our kernel, but if I'm not particularly happy with the
> hugepage alignment I'd say it is likely future users of hugetlb
> mremap() also won't like the hugepage alignment, but I yield to you
> here.

I am now a bit confused and do not see how this works for your use case?

--
Mike Kravetz