Hi Jason,

>
> > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
> > Although, I do not have THP enabled (or built-in), shmem does not evict
> > the pages after hole punch as noted in the comment in shmem_fallocate():
>
> This is the source of all your problems.
>
> Things that are mm-centric are supposed to track the VMAs and changes to
> the PTEs. If you do something in userspace and it doesn't cause the
> CPU page tables to change then it certainly shouldn't cause any mmu
> notifiers or hmm_range_fault changes.

I am not doing anything out of the blue in the userspace. I think the
behavior I am seeing with shmem (where an invalidation event
(MMU_NOTIFY_CLEAR) does occur because of a hole punch but the PTEs don't
really get updated) can arguably be considered an optimization.

>
> There should still be an invalidation notifier at some point when the
> CPU tables do eventually change, whenever that is. Missing that
> notification would be a bug.

I clearly do not see any notification getting triggered (from both
shmem_fault() and hugetlb_fault()) when the PTEs do get updated as the
hole is refilled due to writes. Are you saying that there needs to be an
invalidation event (MMU_NOTIFY_CLEAR?) dispatched at this point?

>
> > If I force it to read-fault or write-fault (by hacking hmm_pte_need_fault()),
> > it gets indefinitely stuck in the do while loop in hmm_range_fault().
> > AFAIU, unless there is a way to fault-in zero pages (or any scratch pages)
> > after hole punch that get invalidated because of writes, I do not see how
> > using hmm_range_fault() can help with my use-case.
>
> hmm_range_fault() is the correct API to use if you are working with
> notifiers. Do not hack something together using pin_user_pages.

I noticed that hmm_range_fault() does not seem to be working as expected
given that it gets stuck (hangs) while walking hugetlb pages.
Regardless, as I mentioned above, the lack of notification when PTEs do
get updated due to writes is the crux of the issue here. Therefore, AFAIU,
triggering an invalidation event or some other kind of notification would
help in fixing this issue.

Thanks,
Vivek

>
> Jason