On Wed, 31 May 2023, Jason Gunthorpe wrote:
> On Sun, May 28, 2023 at 11:16:16PM -0700, Hugh Dickins wrote:
> > There is a faint risk that __pte_offset_map(), on a 32-bit architecture
> > with a 64-bit pmd_t e.g. x86-32 with CONFIG_X86_PAE=y, would succeed on
> > a pmdval assembled from a pmd_low and a pmd_high which never belonged
> > together: their combination not pointing to a page table at all, perhaps
> > not even a valid pfn. pmdp_get_lockless() is not enough to prevent that.
> >
> > Guard against that (on such configs) by local_irq_save() blocking TLB
> > flush between present updates, as linux/pgtable.h suggests. It's only
> > needed around the pmdp_get_lockless() in __pte_offset_map(): a race when
> > __pte_offset_map_lock() repeats the pmdp_get_lockless() after getting the
> > lock, would just send it back to __pte_offset_map() again.
>
> What about the other places calling pmdp_get_lockless ? It seems like
> this is quietly making it part of the API that the caller must hold
> the IPIs off.

No, I'm making no judgment of other places where pmdp_get_lockless() is
used: examination might show that some need more care, but I'll just
assume that each is taking as much care as it needs.

But here where I'm making changes, I do see that we need this extra care.

>
> And Jann had a note that this approach used by the lockless functions
> doesn't work anyhow:
>
> https://lore.kernel.org/linux-mm/CAG48ez3h-mnp9ZFC10v+-BW_8NQvxbwBsMYJFP8JX31o0B17Pg@xxxxxxxxxxxxxx/

Thanks a lot for the link: I don't know why, but I never saw that mail
thread at all before. I have not fully digested it yet, to be honest:
MADV_DONTNEED, doesn't flush TLB yet, etc - I'll have to get into the
right frame of mind for that.

>
> Though we never fixed it, AFAIK..

I'm certainly depending very much on pmdp_get_lockless(): and hoping to
find its case is easier to defend than at the ptep_get_lockless() level.

Thanks,
Hugh