> > So considering two API choices:
> >
> > 1. What we have now: UFFD_FEATURE_MINOR_HUGETLBFS_HGM for
> > UFFDIO_CONTINUE, and later UFFD_FEATURE_WP_HUGETLBFS_HGM for
> > UFFDIO_WRITEPROTECT. For MADV_DONTNEED, we could just suddenly start
> > allowing high-granularity choices (not sure if this is bad; we started
> > allowing it for HugeTLB recently with no other API change, AFAIA).
>
> I don't think we can just start allowing HGM for MADV_DONTNEED without
> some type of user interaction/request. Otherwise, a user that passes
> in non-hugetlb page size requests may get unexpected results. And, one
> of the threads about MADV_DONTNEED points out a valid use case where
> the caller may not know whether the mapping is hugetlb and is likely to
> pass in non-hugetlb page size requests.
>
> > 2. MADV_ENABLE_HGM or something similar. The changes to
> > UFFDIO_CONTINUE/UFFDIO_WRITEPROTECT/MADV_DONTNEED come automatically,
> > provided they are implemented.
> >
> > I don't mind one way or the other. Peter, I assume you prefer #2.
> > Mike, what about you? If we decide on something other than #1, I'll
> > make the change before sending v1 out.
>
> Since I do not believe 1) is an option, MADV_ENABLE_HGM might be the way
> to go. Any thoughts about MADV_ENABLE_HGM? I'm thinking:
> - Make it have the same restrictions as other madvise hugetlb calls:
>   . addr must be huge page aligned
>   . length is rounded down to a multiple of the huge page size
> - We split the vma as required

I agree with these.

> - Flags carrying HGM state reside in the hugetlb_shared_vma_data struct

I actually changed this in v1 to storing HGM state as a VMA flag, to
avoid problems with splitting VMAs (e.g., when we split a VMA, it's
possible the VMA data/lock struct doesn't get allocated). It seems
better to me; I can change it back if you disagree.

I'm not sure what the best name for this flag is either. MADV_ENABLE_HGM
sounds ok. MADV_HUGETLB_HGM or MADV_HUGETLB_SMALL_PAGES could work too.
No need to figure it out now.

Thanks Mike and Peter :) I'll make this change for v1 and send it out
sometime soon.

- James
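
(For illustration, here is a minimal userspace sketch of how the proposed
madvise call might be used, assuming the alignment/rounding semantics
described in the quoted list above. MADV_ENABLE_HGM is only the name
floated in this thread, not a real madvise advice value, so it is defined
locally with a placeholder number.)

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    #ifndef MADV_ENABLE_HGM
    #define MADV_ENABLE_HGM 100     /* placeholder; not a real UAPI value */
    #endif

    int main(void)
    {
            size_t huge_sz = 2UL << 20;     /* assume 2M huge pages */
            size_t len = 4 * huge_sz;

            /* A shared hugetlb mapping; addr is huge-page aligned by construction. */
            void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
            if (addr == MAP_FAILED) {
                    perror("mmap");
                    return 1;
            }

            /*
             * Proposed semantics: addr must be huge-page aligned, length is
             * rounded down to a multiple of the huge page size, and the
             * kernel splits the VMA as required. After this, UFFDIO_CONTINUE /
             * UFFDIO_WRITEPROTECT / MADV_DONTNEED could act at
             * smaller-than-huge-page granularity on this range.
             */
            if (madvise(addr, len, MADV_ENABLE_HGM))
                    perror("madvise(MADV_ENABLE_HGM)");

            munmap(addr, len);
            return 0;
    }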
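
(Similarly, a rough kernel-side sketch of the VMA-flag idea mentioned above.
VM_HUGETLB_HGM and hugetlb_hgm_enabled() are hypothetical names chosen only
to illustrate why a vm_flags bit sidesteps the VMA-split problem: the bit is
carried along with the VMA, whereas the hugetlb_shared_vma_data/lock struct
may not have been allocated for the new VMA.)

    #include <linux/mm.h>

    /* Hypothetical flag; an arbitrary spare bit, purely for illustration. */
    #define VM_HUGETLB_HGM  (1UL << 38)

    static inline bool hugetlb_hgm_enabled(struct vm_area_struct *vma)
    {
            /*
             * A vm_flags bit is copied automatically when a VMA is split,
             * so no separate allocation (as with the per-VMA
             * hugetlb_shared_vma_data / lock struct) is needed to keep the
             * HGM state correct on both halves.
             */
            return !!(vma->vm_flags & VM_HUGETLB_HGM);
    }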