On 2024/8/28 18:48, David Hildenbrand wrote:
On 27.08.24 06:33, Qi Zheng wrote:
[...]
sufficient AFAIUK.
Drop the "AFAIUK" :)
"For R/O access this is sufficient."
pte_offset_map_rw_nolock(mm, pmd, addr, pmdvalp, ptlp), above, is like
pte_offset_map_ro_nolock(); but when successful, it also outputs the
pmdval. For R/W access, the callers cannot accept that the page table
it sees has been unmapped and is about to get freed. The pmdval can help
callers to recheck pmd_same() to identify this case once the spinlock is
taken. In some cases where exclusivity is already guaranteed, such as
holding the write lock of mmap_lock, or where a later check is
sufficient, such as when a !pte_none() pte will be rechecked after the
spinlock is taken, there is no need to recheck pmdval.
Right, using pte_same() one can achieve a similar result, assuming that
the freed page table gets all ptes set to pte_none().
page_table_check_pte_clear_range() before pte_free_defer() in
retract_page_tables()/collapse_pte_mapped_thp() sanity-checks that, I think.
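To make the pmdval recheck from the comment above concrete, here is a minimal
userspace sketch of the pattern: map without the lock, take the lock, then
compare the pmd entry against the snapshot before writing. It is only a
single-threaded model and not kernel code. All names (struct pte_table,
struct pmd_slot, struct rw_walk, walk_map_nolock(), walk_store_locked()) are
hypothetical stand-ins, the "lock" is a plain flag, and the pointer compare
merely models pmd_same().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct pte_table {
	bool locked;	/* models the page table spinlock (single-threaded) */
	int ptes[4];	/* 0 models pte_none() */
};

struct pmd_slot {
	struct pte_table *table;	/* models the pmd entry; NULL once retracted */
};

struct rw_walk {
	struct pte_table *table;	/* the PTE table that was mapped */
	struct pte_table *pmdval;	/* snapshot of the pmd entry at map time */
};

/* Models pte_offset_map_rw_nolock(): map without locking, output pmdval. */
static bool walk_map_nolock(const struct pmd_slot *pmd, struct rw_walk *w)
{
	struct pte_table *snap = pmd->table;

	if (!snap)
		return false;	/* no page table to map */
	w->table = snap;
	w->pmdval = snap;
	return true;
}

/*
 * Take the lock, then recheck the pmd entry against the snapshot before
 * writing. Returns false if the table was retracted in between, in which
 * case the caller would have to retry or bail out.
 */
static bool walk_store_locked(const struct pmd_slot *pmd, struct rw_walk *w,
			      size_t idx, int val)
{
	bool ok = false;

	assert(!w->table->locked);
	w->table->locked = true;		/* models spin_lock(ptl) */
	if (pmd->table == w->pmdval) {		/* models pmd_same() */
		w->table->ptes[idx] = val;
		ok = true;
	}
	w->table->locked = false;		/* models spin_unlock(ptl) */
	return ok;
}
```

In this model the table stays valid after it is "retracted" because the
caller owns its storage; in the kernel that role is played by the deferred
free in pte_free_defer().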
Since commit 1d65b771bc08, retract_page_tables() only holds the
i_mmap_lock_read(mapping) but not mmap_lock, so it seems that
holding the write lock of mmap_lock cannot guarantee the stability
of the PTE page.
So, IIUC, in v3 I will also perform a pmd_same() check in the case where
the write lock of mmap_lock is held. Or am I missing something?
In collapse_huge_page() that is not the case. But here, we also
currently grab all the heavy locks to prevent any concurrent page table
walker.
Note: "RO" / "RW" expresses the intended semantics, not that the *kmap*
will be RO/RW protected.
Good. Please also incorporate the feedback from Muchun.