On 05/19/2020 04:57 AM, Andrew Morton wrote:
> On Mon, 18 May 2020 13:08:49 +0800 Bibo Mao <maobibo@xxxxxxxxxxx> wrote:
>
>> On mips platform, hw PTE entry valid bit is set in pte_mkyoung
>> function, it is used to set physical page with readable privilege.
>
> pte_mkyoung() seems to be a strange place to set the pte's valid bit.
> Why is it done there? Can it be done within mips's mk_pte()?

On MIPS, hardware cannot set the accessed bit when a page is touched;
software sets the PAGE_ACCESS software bit and the PAGE_VALID hardware
bit together during page fault handling. If mk_pte() is called in the
page fault path, it is fine to set both bits there. But mk_pte() is
also called outside of page faults, and there setting PAGE_ACCESS would
mark the page young without any actual memory access.

>> Here add pte_mkyoung function to make page readable on MIPS platform
>> during page fault handling. This patch improves page fault latency
>> about 10% on my MIPS machine with lmbench lat_pagefault case.
>>
>> ...
>>
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2704,6 +2704,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>>  	}
>>  	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>>  	entry = mk_pte(new_page, vma->vm_page_prot);
>> +	entry = pte_mkyoung(entry);
>>  	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>
> What is the effect on non-mips machines?  If it's only "additional
> overhead" then it would be better to add a new pte_mkvalid() (or
> whatever) and arrange for it to be a no-op on all but mips?

How about adding a pte_sw_mkyoung()-like function that is a no-op on
all architectures but MIPS? On MIPS it would set both the PAGE_ACCESS
bit and the PAGE_VALID bit.
Regards,
Bibo Mao

>>  	/*
>>  	 * Clear the pte entry and flush it first, before updating the
>> @@ -3378,6 +3379,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>  	__SetPageUptodate(page);
>>
>>  	entry = mk_pte(page, vma->vm_page_prot);
>> +	entry = pte_mkyoung(entry);
>>  	if (vma->vm_flags & VM_WRITE)
>>  		entry = pte_mkwrite(pte_mkdirty(entry));
>>
>> @@ -3660,6 +3662,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>>
>>  	flush_icache_page(vma, page);
>>  	entry = mk_pte(page, vma->vm_page_prot);
>> +	entry = pte_mkyoung(entry);
>>  	if (write)
>>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>>  	/* copy-on-write page */
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 494192ca..673f1cd 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -131,6 +131,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>>  			ptent = pte_clear_uffd_wp(ptent);
>>  		}
>>
>> +		if (vma->vm_flags & VM_READ)
>> +			ptent = pte_mkyoung(ptent);
>>  		/* Avoid taking write faults for known dirty pages */
>>  		if (dirty_accountable && pte_dirty(ptent) &&
>>  		    (pte_soft_dirty(ptent) ||