On 8 Jun 2023, at 21:10, Hugh Dickins wrote:

> Make pte_offset_map() a wrapper for __pte_offset_map() (optionally
> outputs pmdval), pte_offset_map_lock() a sparse __cond_lock wrapper for
> __pte_offset_map_lock(): those __funcs added in mm/pgtable-generic.c.
>
> __pte_offset_map() do pmdval validation (including pmd_clear_bad()
> when pmd_bad()), returning NULL if pmdval is not for a page table.
> __pte_offset_map_lock() verify pmdval unchanged after getting the
> lock, trying again if it changed.
>
> No #ifdef CONFIG_TRANSPARENT_HUGEPAGE around them: that could be done
> to cover the imminent case, but we expect to generalize it later, and
> it makes a mess of where to do the pmd_bad() clearing.
>
> Add pte_offset_map_nolock(): outputs ptl like pte_offset_map_lock(),
> without actually taking the lock. This will be preferred to open uses of
> pte_lockptr(), because (when split ptlock is in page table's struct page)
> it points to the right lock for the returned pte pointer, even if *pmd
> gets changed racily afterwards.
>
> Update corresponding Documentation.
>
> Do not add the anticipated rcu_read_lock() and rcu_read_unlock()s yet:
> they have to wait until all architectures are balancing pte_offset_map()s
> with pte_unmap()s (as in the arch series posted earlier). But comment
> where they will go, so that it's easy to add them for experiments. And
> only when those are in place can transient racy failure cases be enabled.
> Add more safety for the PAE mismatched pmd_low pmd_high case at that time.
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---
>  Documentation/mm/split_page_table_lock.rst | 17 ++++---
>  include/linux/mm.h                         | 27 +++++++----
>  include/linux/pgtable.h                    | 22 ++++++---
>  mm/pgtable-generic.c                       | 56 ++++++++++++++++++++++
>  4 files changed, 101 insertions(+), 21 deletions(-)

LGTM. Reviewed-by: Zi Yan <ziy@xxxxxxxxxx>

--
Best Regards,
Yan, Zi
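
For readers following the thread, the "verify pmdval unchanged after getting
the lock, trying again if it changed" step quoted above can be illustrated
with a small userspace analogy. This is only a sketch and is not the code
from mm/pgtable-generic.c: struct table, table_map_lock() and the slot
pointer are hypothetical stand-ins for a page table page, for
__pte_offset_map_lock() and for *pmd, and a pthread mutex stands in for the
split ptlock. It also assumes a table is never freed while a caller may still
be looking at it, which is roughly the guarantee the anticipated
rcu_read_lock() would be expected to provide.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a page table page and its split ptlock. */
struct table {
	pthread_mutex_t lock;
	int entries[4];
};

/*
 * Userspace analogy of "lock, then re-validate, else retry".
 * *slot plays the role of *pmd and may be changed by other threads.
 * Returns the table with its lock held, or NULL if *slot does not
 * currently point to a table.
 */
static struct table *table_map_lock(_Atomic(struct table *) *slot)
{
	assert(slot != NULL);

	for (;;) {
		/* Snapshot the slot, like reading pmdval. */
		struct table *seen = atomic_load(slot);

		if (seen == NULL)
			return NULL;	/* not a table: caller must cope */

		pthread_mutex_lock(&seen->lock);

		/* Still the table we locked?  Then the lock is the right one. */
		if (atomic_load(slot) == seen)
			return seen;

		/* The slot changed while we were getting the lock: try again. */
		pthread_mutex_unlock(&seen->lock);
	}
}
```

A caller of this sketch would check for NULL, use the entries while holding
the lock, and then call pthread_mutex_unlock() on the returned table's lock.
The re-check after locking is what makes the lock trustworthy: a lock chosen
from a stale snapshot may belong to a table that is no longer installed.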