Re: [PATCH 1/2] mm/hugetlb: Fix F_SEAL_FUTURE_WRITE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Mike,

On Mon, May 03, 2021 at 11:55:41AM -0700, Mike Kravetz wrote:
> On 5/1/21 7:41 AM, Peter Xu wrote:
> > F_SEAL_FUTURE_WRITE is missing for hugetlb starting from the first day.
> > There is a test program for that and it fails constantly.
> > 
> > $ ./memfd_test hugetlbfs
> > memfd-hugetlb: CREATE
> > memfd-hugetlb: BASIC
> > memfd-hugetlb: SEAL-WRITE
> > memfd-hugetlb: SEAL-FUTURE-WRITE
> > mmap() didn't fail as expected
> > Aborted (core dumped)
> > 
> > I think it's probably because no one is really running the hugetlbfs test.
> > 
> > Fix it by checking FUTURE_WRITE also in hugetlbfs_file_mmap() as what we do in
> > shmem_mmap().  Generalize a helper for that.
> > 
> > Reported-by: Hugh Dickins <hughd@xxxxxxxxxx>
> > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > ---
> >  fs/hugetlbfs/inode.c |  5 +++++
> >  include/linux/mm.h   | 32 ++++++++++++++++++++++++++++++++
> >  mm/shmem.c           | 22 ++++------------------
> >  3 files changed, 41 insertions(+), 18 deletions(-)
> 
> Thanks Peter and Hugh!
> 
> One question below,
> 
> > 
> > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> > index a2a42335e8fd2..39922c0f2fc8c 100644
> > --- a/fs/hugetlbfs/inode.c
> > +++ b/fs/hugetlbfs/inode.c
> > @@ -131,10 +131,15 @@ static void huge_pagevec_release(struct pagevec *pvec)
> >  static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
> >  {
> >  	struct inode *inode = file_inode(file);
> > +	struct hugetlbfs_inode_info *info = HUGETLBFS_I(inode);
> >  	loff_t len, vma_len;
> >  	int ret;
> >  	struct hstate *h = hstate_file(file);
> >  
> > +	ret = seal_check_future_write(info->seals, vma);
> > +	if (ret)
> > +		return ret;
> > +
> >  	/*
> >  	 * vma address alignment (but not the pgoff alignment) has
> >  	 * already been checked by prepare_hugepage_range.  If you add
> 
> The full comment below the code you added is:
> 
> 	/*
> 	 * vma address alignment (but not the pgoff alignment) has
> 	 * already been checked by prepare_hugepage_range.  If you add
> 	 * any error returns here, do so after setting VM_HUGETLB, so
> 	 * is_vm_hugetlb_page tests below unmap_region go the right
> 	 * way when do_mmap unwinds (may be important on powerpc
> 	 * and ia64).
> 	 */
> 
> This comment was added in commit 68589bc35303 by Hugh, although it
> appears David Gibson added the reason for the comment in the commit
> message:
> 
> "If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
> because the given file offset is not hugepage aligned - then do_mmap_pgoff
> will go to the unmap_and_free_vma backout path.
> 
> But at this stage the vma hasn't been marked as hugepage, and the backout path
> will call unmap_region() on it.  That will eventually call down to the
> non-hugepage version of unmap_page_range().  On ppc64, at least, that will
> cause serious problems if there are any existing hugepage pagetable entries in
> the vicinity - for example if there are any other hugepage mappings under the
> same PUD.  unmap_page_range() will trigger a bad_pud() on the hugepage pud
> entries.  I suspect this will also cause bad problems on ia64, though I don't
> have a machine to test it on."
> 
> There are still comments in the unmap code about special handling of
> ppc64 PUDs.  So, this may still be an issue.
> 
> I am trying to dig into the code to determine if this is still and
> issue.  Just curious if you looked into this?  Might be simpler and
> safer to just put the seal check after setting the VM_HUGETLB flag?

Good catch!  I overlooked on that, and I definitely didn't look into it yet.
For now I'd better move that check to be after the flag settings in all cases.

I'll also add:

Fixes: ab3948f58ff84 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd")

Thanks,

-- 
Peter Xu





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux