On Thu, Dec 15, 2022 at 12:52 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 12/13/22 10:49, James Houghton wrote:
> > On Mon, Dec 12, 2022 at 7:14 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
> > >
> > > On 10/21/22 16:36, James Houghton wrote:
> > > > Currently it is possible for all shared VMAs to use HGM, but it must be
> > > > enabled first. This is because with HGM, we lose PMD sharing, and page
> > > > table walks require additional synchronization (we need to take the VMA
> > > > lock).
> > >
> > > Not sure yet, but I expect Peter's series will help with locking for
> > > hugetlb specific page table walks.
> >
> > It should make things a little bit cleaner in this series; I'll rebase
> > HGM on top of those patches this week (and hopefully get a v1 out
> > soon).
> >
> > I don't think it's possible to implement MADV_COLLAPSE with RCU alone
> > (as implemented in Peter's series anyway); we still need the VMA lock.
>
> As I continue going through the series, I realize that I am not exactly
> sure what synchronization by the vma lock is required by HGM. As you are
> aware, it was originally designed to protect against someone doing a
> pmd_unshare and effectively removing part of the page table. However,
> since pmd sharing is disabled for vmas with HGM enabled (I think?), then
> it might be a good idea to explicitly say somewhere the reason for using
> the lock.

It synchronizes MADV_COLLAPSE for hugetlb (hugetlb_collapse).
MADV_COLLAPSE will take it for writing and free some page table pages,
and high-granularity walks will generally take it for reading. I'll
make this clear in a comment somewhere and in commit messages.

It might be easier if hugetlb_collapse() had the exact same
synchronization as huge_pmd_unshare, where we not only take the VMA
lock for writing, we also take the i_mmap_rw_sem for writing, so
anywhere where hugetlb_walk() is safe, high-granularity walks are also
safe. I think I should just do that for the sake of simplicity.

- James

> --
> Mike Kravetz