On Thu, Jul 25, 2024 at 05:23:48PM -0700, James Houghton wrote: > On Thu, Jul 25, 2024 at 3:41 PM Peter Xu <peterx@xxxxxxxxxx> wrote: > > > > On Thu, Jul 25, 2024 at 11:29:49AM -0700, James Houghton wrote: > > > > - pages += change_pmd_range(tlb, vma, pud, addr, next, newprot, > > > > + > > > > + if (pud_leaf(pud)) { > > > > + if ((next - addr != PUD_SIZE) || > > > > + pgtable_split_needed(vma, cp_flags)) { > > > > + __split_huge_pud(vma, pudp, addr); > > > > + goto again; > > > > > > IIUC, most of the time, we're just going to end up clearing the PUD in > > > this case. __split_huge_pud() will just clear it, then we goto again > > > and `continue` to the next pudp. Is that ok? > > > > > > (I think it's ok as long as: you never map an anonymous page with a > > > PUD, > > > > I think this is true. > > > > > and that uffd-wp is not usable with non-hugetlb PUD mappings of > > > user memory (which I think is only DAX?). > > > > Uffd-wp has the async mode that can even work with dax puds.. even though I > > don't think anyone should be using it. Just like I'm more sure that nobody > > is using mprotect() too with dax pud, and it further justifies why nobody > > cared this much.. > > > > What uffd-wp would do in this case is it'll make pgtable_split_needed() > > returns true on this PUD, the PUD got wiped out, goto again, then > > change_prepare() will populate this pud with a pgtable page. Then it goes > > downwards, install PMD pgtable, then probably start installing pte markers > > ultimately if it's a wr-protect operation. > > Ah, I didn't understand this patch correctly. You're right, you'll > install PMDs underneath -- I thought we'd just find pud_none() and > bail out. So this all makes sense, thanks! > > > > > > So it seems ok today...?) > > > > Yes I think it's ok so far, unless you think it's not. :) > > > > > > > > Also, does the comment in pgtable_split_needed() need updating? > > > > /* > > * Return true if we want to split THPs into PTE mappings in change > > * protection procedure, false otherwise. > > */ > > > > It looks to me it's ok for now to me? THP can represents PUD in dax, and we > > indeed want to break THPs (no matter PUD/PMD) finally into PTE mappings. > > Oh, heh I was thinking of the other comment: > > /* > * pte markers only resides in pte level, if we need pte markers, > * we need to split. We cannot wr-protect shmem thp because file > * thp is handled differently when split by erasing the pmd so far. > */ > > I guess this is fine too, it just kind of reads as if this function is > only called for PMDs. *shrug* Ah ok, yeah it looks mostly fine here too. Let's make this easy by keeping it as-is, but if there'll be a new version I can touch it up if it helps the readings. > > > > > > > > > Somewhat related question: change_huge_pmd() is very careful not to > > > clear the PMD before writing the new value. Yet change_pmd_range(), > > > when it calls into __split_huge_pmd(), will totally clear the PMD and > > > then populate the PTEs underneath (in some cases at least), seemingly > > > reintroducing the MADV_DONTNEED concern. But your PUD version, because > > > it never re-populates the PUD (or PMDs/PTEs underneath) does not have > > > this issue. WDYT? > > I guess I'm wrong about this -- your PUD version is the same as the > PMD version in this respect: both clear the PUD/PMD and then install > lower page table entries. > > > > > Could you elaborate more on the DONTNEED issue you're mentioning here? > > In change_huge_pmd(), there is a comment about not clearing the pmd > before setting the new value. I guess the issue is: if a user is > MADV_DONTNEEDing some memory and happens to see a cleared PMD, it will > just move on. But in this case, if change_huge_pmd() temporarily > cleared a PMD, then MADV_DONTNEED saw it and moved on, and then > change_huge_pmd() installed a new PMD, the MADV_DONTNEED has basically > skipped over the PMD, probably not what the user wanted. It seems like > we have the same issue when a PMD is cleared when we're splitting it. > > But yeah, given that your PUD version is apparently no different from > the PMD version in this respect, maybe this question isn't very > interesting. :) Ah that one, so yeah that's why I introduced pudp_invalidate() in this series to make sure that issue isn't there. Then I assume we're good. Thanks, -- Peter Xu