On 27.01.22 18:52, Axel Rasmussen wrote:
> On Thu, Jan 27, 2022 at 3:57 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>>
>> On 13.01.22 19:03, Mike Kravetz wrote:
>>> Userfaultfd selftests for hugetlb does not perform UFFD_EVENT_REMAP
>>> testing. However, mremap support was recently added in commit
>>> 550a7d60bd5e ("mm, hugepages: add mremap() support for hugepage backed
>>> vma"). While attempting to enable mremap support in the test, it was
>>> discovered that the mremap test indirectly depends on MADV_DONTNEED.
>>>
>>> hugetlb does not support MADV_DONTNEED. However, the only thing
>>> preventing support is a check in can_madv_lru_vma(). Simply removing
>>> the check will enable support.
>>>
>>> This is sent as a RFC because there is no existing use case calling
>>> for hugetlb MADV_DONTNEED support except possibly the userfaultfd test.
>>> However, adding support makes sense as it is fairly trivial and brings
>>> hugetlb functionality more in line with 'normal' memory.
>>>
>>
>> Just a note:
>>
>> QEMU doesn't use huge anonymous memory directly (MAP_ANON | MAP_HUGE...)
>> but instead always goes either via hugetlbfs or via memfd.
>>
>> For MAP_PRIVATE hugetlb mappings, fallocate(FALLOC_FL_PUNCH_HOLE) seems
>> to get the job done (IOW: also discards private anon pages). See the
>> comments in the QEMU code below. I remember that that is somewhat
>> inconsistent. For ordinary MAP_PRIVATE mapped files I remember that we
>> always need fallocate(FALLOC_FL_PUNCH_HOLE) + madvise(QEMU_MADV_DONTNEED)
>> to make sure
>>
>> a) All file pages are removed
>> b) All private anon pages are removed
>>
>> IIRC hugetlbfs really is different in that regard, but maybe other fs
>> behave similarly.
>>
>> That's why QEMU was able to live for now without MADV_DONTNEED support
>> for hugetlbfs and most probably won't ever need it.
>
> Agreed, all of the production use cases I'm aware of use hugetlbfs,
> not MAP_HUGE...
>
> But, I would say this is convenient for testing purposes. It's
> slightly more convenient to not have to mount hugetlbfs / perform the
> associated setup for tests.

Creating a memfd is not too hard, but yes, not a single-liner. Maybe the
uffd test should go via memfds for hugetlb instead. But maybe that
limits the mremap functionality? No expert.

>
> Perhaps that's only a small motivation for enabling this, but then
> again Mike's patch to do so is likewise very small. :)

... and apparently buggy :P

-- 
Thanks,

David / dhildenb
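
For reference, a minimal sketch of the "ordinary MAP_PRIVATE mapped file" case described above: punch a hole to drop the file pages, then madvise(MADV_DONTNEED) to drop the private anon pages. This is not the QEMU code. The helper names discard_range() and demo_discard() are made up for illustration, and it assumes a plain (non-hugetlb) memfd so it runs without reserved huge pages. Passing MFD_HUGETLB to memfd_create() would give a hugetlb-backed memfd instead, where the madvise() step is exactly what is unsupported today.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: discard [offset, offset + len) of a MAP_PRIVATE
 * file mapping that starts at addr. Returns 0 on success, -1 on error.
 */
static int discard_range(int fd, void *addr, off_t offset, size_t len)
{
	assert(addr != NULL && len > 0);

	/* a) remove the file pages backing the range */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      offset, (off_t)len) != 0)
		return -1;

	/* b) remove the private (COW) anon pages in the mapping */
	if (madvise(addr, len, MADV_DONTNEED) != 0)
		return -1;

	return 0;
}

/*
 * Hypothetical demo: returns 0 if a dirtied private page reads back as
 * zero after the discard, a nonzero step number otherwise.
 */
static int demo_discard(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int rc = 0;
	int fd;

	/* Assumption: plain memfd (shmem), not MFD_HUGETLB. */
	fd = memfd_create("discard-demo", 0);
	if (fd < 0)
		return 1;
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return 2;
	}

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return 3;
	}

	p[0] = 0xAA;	/* write fault creates a private anon page */

	if (discard_range(fd, p, 0, len) != 0)
		rc = 4;
	else if (p[0] != 0)	/* should now fault in the zero-filled hole */
		rc = 5;

	munmap(p, len);
	close(fd);
	return rc;
}
```

Calling demo_discard() from a test's main() and checking for a 0 return would be enough to exercise both steps. No hugetlbfs mount is needed for that, which is the convenience argument for memfds made above.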