On 8/28/21 06:23, Qi Zheng wrote: > The smp_wmb() which is in the __pte_alloc() is used to > ensure all ptes setup is visible before the pte is made > visible to other CPUs by being put into page tables. We > only need this when the pte is actually populated, so > move it to pte_install(). __pte_alloc_kernel(), It's named pmd_install()? > __p4d_alloc(), __pud_alloc() and __pmd_alloc() are similar > to this case. > > We can also defer smp_wmb() to the place where the pmd entry > is really populated by preallocated pte. There are two kinds > of user of preallocated pte, one is filemap & finish_fault(), > another is THP. The former does not need another smp_wmb() > because the smp_wmb() has been done by pte_install(). Same here. > Fortunately, the latter also does not need another smp_wmb() > because there is already a smp_wmb() before populating the > new pte when the THP uses a preallocated pte to split a huge > pmd. > > Signed-off-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> > Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx> > --- > mm/memory.c | 47 ++++++++++++++++++++--------------------------- > mm/sparse-vmemmap.c | 2 +- > 2 files changed, 21 insertions(+), 28 deletions(-) > > diff --git a/mm/memory.c b/mm/memory.c > index ef7b1762e996..9c7534187454 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte) > > if (likely(pmd_none(*pmd))) { /* Has another populated it ? */ > mm_inc_nr_ptes(mm); > + /* > + * Ensure all pte setup (eg. pte page lock and page clearing) are > + * visible before the pte is made visible to other CPUs by being > + * put into page tables. > + * > + * The other side of the story is the pointer chasing in the page > + * table walking code (when walking the page table without locking; > + * ie. most of the time). Fortunately, these data accesses consist > + * of a chain of data-dependent loads, meaning most CPUs (alpha > + * being the notable exception) will already guarantee loads are > + * seen in-order. See the alpha page table accessors for the > + * smp_rmb() barriers in page table walking code. > + */ > + smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */ So, could it? :)