On Mon, May 29, 2023 at 8:11 AM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> Here is the third series of patches to mm (and a few architectures), based
> on v6.4-rc3 with the preceding two series applied: in which khugepaged
> takes advantage of pte_offset_map[_lock]() allowing for pmd transitions.

To clarify: Part of the design here is that when you look up a user page
table with pte_offset_map_nolock() or pte_offset_map() without holding
mmap_lock in write mode, and you later lock the page table yourself, you
don't know whether you actually have the real page table or a detached
table that is currently in its RCU grace period, right? And detached
tables are supposed to consist of only zeroed entries, and we assume
that no relevant codepath will do anything bad if one of these functions
spuriously returns a pointer to a page table full of zeroed entries?

So in particular, in handle_pte_fault() we can reach the
"if (unlikely(!pte_same(*vmf->pte, entry)))" with vmf->pte pointing to a
detached zeroed page table, but we're okay with that because in that
case we know that !pte_none(vmf->orig_pte) && pte_none(*vmf->pte),
which implies !pte_same(*vmf->pte, entry), which means we'll bail out?

If that's the intent, it might be good to add some comments, because at
least to me that's not very obvious.
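
To make the reasoning above concrete, here is a tiny userspace model of it.
This is only a sketch and not kernel code: pte_t, pte_none() and pte_same()
are simplified stand-ins for the real per-architecture definitions,
PTRS_PER_PTE is assumed to be 512, and fault_recheck_bails_out() is a
made-up helper that mimics the recheck in handle_pte_fault(), assuming
"entry" there still holds the earlier snapshot vmf->orig_pte.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTRS_PER_PTE 512	/* assumed value; varies by architecture */

/* Simplified stand-in for the kernel's pte_t: a single 64-bit word. */
typedef struct { uint64_t val; } pte_t;

/* Stand-in: an entry is "none" when it is entirely zero. */
static bool pte_none(pte_t pte)
{
	return pte.val == 0;
}

/* Stand-in: two entries are the same when their raw values match. */
static bool pte_same(pte_t a, pte_t b)
{
	return a.val == b.val;
}

/*
 * Hypothetical helper modelling the recheck done after taking the
 * page table lock: compare the entry currently in the table against
 * the snapshot taken earlier (orig_pte). Returns true if the fault
 * handler would bail out.
 *
 * If "table" is a detached table (all entries zero) and orig_pte was
 * not none, then table[idx] is none, so the two cannot be the same
 * and this always returns true.
 */
static bool fault_recheck_bails_out(const pte_t *table, unsigned int idx,
				    pte_t orig_pte)
{
	return !pte_same(table[idx], orig_pte);
}
```

With an all-zero table and a non-none orig_pte the helper reports a
bailout; with a live table whose entry still matches orig_pte it does
not. Note that the argument depends on orig_pte being non-none: if the
snapshot was itself a none entry, the comparison against a detached
table would succeed, so that case needs separate reasoning.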