> > So considering two API choices:
> >
> > 1. What we have now: UFFD_FEATURE_MINOR_HUGETLBFS_HGM for
> > UFFDIO_CONTINUE, and later UFFD_FEATURE_WP_HUGETLBFS_HGM for
> > UFFDIO_WRITEPROTECT. For MADV_DONTNEED, we could just suddenly start
> > allowing high-granularity choices (not sure if this is bad; we started
> > allowing it for HugeTLB recently with no other API change, AFAIA).
>
> I don't think we can just start allowing HGM for MADV_DONTNEED without
> some type of user interaction/request. Otherwise, a user that passes
> in non-hugetlb page size requests may get unexpected results. And, one
> of the threads about MADV_DONTNEED points out a valid use case where
> the caller may not know whether the mapping is hugetlb and is likely to
> pass in non-hugetlb page size requests.
>
> > 2. MADV_ENABLE_HGM or something similar. The changes to
> > UFFDIO_CONTINUE/UFFDIO_WRITEPROTECT/MADV_DONTNEED come automatically,
> > provided they are implemented.
> >
> > I don't mind one way or the other. Peter, I assume you prefer #2.
> > Mike, what about you? If we decide on something other than #1, I'll
> > make the change before sending v1 out.
>
> Since I do not believe 1) is an option, MADV_ENABLE_HGM might be the way
> to go. Any thoughts about MADV_ENABLE_HGM? I'm thinking:
> - Make it have the same restrictions as other madvise hugetlb calls:
>   . addr must be huge page aligned
>   . length is rounded down to a multiple of the huge page size
> - We split the vma as required

I agree with these.

> - Flags carrying HGM state reside in the hugetlb_shared_vma_data struct

I actually changed this in v1 to storing HGM state as a VMA flag, to
avoid problems with splitting VMAs (e.g., when we split a VMA, it's
possible the VMA data/lock struct doesn't get allocated). It seems
better to me; I can change it back if you disagree.

I'm not sure what the best name for this flag is either. MADV_ENABLE_HGM
sounds ok. MADV_HUGETLB_HGM or MADV_HUGETLB_SMALL_PAGES could work too.
No need to figure it out now.

Thanks Mike and Peter :) I'll make this change for v1 and send it out
sometime soon.

- James
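
(For illustration, here is a minimal userspace sketch of how the proposed
madvise call might be used, assuming the alignment/rounding semantics
described in the quoted list above. MADV_ENABLE_HGM is only the name
floated in this thread, not a real madvise advice value, so it is defined
locally with a placeholder number.)

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    #ifndef MADV_ENABLE_HGM
    #define MADV_ENABLE_HGM 100     /* placeholder; not a real UAPI value */
    #endif

    int main(void)
    {
            size_t huge_sz = 2UL << 20;     /* assume 2M huge pages */
            size_t len = 4 * huge_sz;

            /* A shared hugetlb mapping; addr is huge-page aligned by construction. */
            void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
            if (addr == MAP_FAILED) {
                    perror("mmap");
                    return 1;
            }

            /*
             * Proposed semantics: addr must be huge-page aligned, length is
             * rounded down to a multiple of the huge page size, and the
             * kernel splits the VMA as required. After this, UFFDIO_CONTINUE /
             * UFFDIO_WRITEPROTECT / MADV_DONTNEED could act at
             * smaller-than-huge-page granularity on this range.
             */
            if (madvise(addr, len, MADV_ENABLE_HGM))
                    perror("madvise(MADV_ENABLE_HGM)");

            munmap(addr, len);
            return 0;
    }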
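
(Similarly, a rough kernel-side sketch of the VMA-flag idea mentioned above.
VM_HUGETLB_HGM and hugetlb_hgm_enabled() are hypothetical names chosen only
to illustrate why a vm_flags bit sidesteps the VMA-split problem: the bit is
carried along with the VMA, whereas the hugetlb_shared_vma_data/lock struct
may not have been allocated for the new VMA.)

    #include <linux/mm.h>

    /* Hypothetical flag; an arbitrary spare bit, purely for illustration. */
    #define VM_HUGETLB_HGM  (1UL << 38)

    static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma)
    {
            /*
             * A vm_flags bit is copied automatically when a VMA is split,
             * so no separate allocation (as with the per-VMA
             * hugetlb_shared_vma_data / lock struct) is needed to keep the
             * HGM state correct on both halves.
             */
            return !!(vma->vm_flags & VM_HUGETLB_HGM);
    }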