Re: [PATCH 06/43] mm: numa: Make pte_numa() and pmd_numa() a generic implementation

On Fri, Nov 16, 2012 at 06:12:43PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@xxxxxxx> wrote:
> 
> > > Why not use something what we have in numa/core already:
> > > 
> > >   f05ea0948708 mm/mpol: Create special PROT_NONE infrastructure
> > > 
> > 
> > Because it's hard-coded to PROT_NONE underneath which I've 
> > complained about before. [...]
> 
> To which I replied that this is the current generic 
> implementation, the moment some different architecture comes 
> around we can accommodate it - on a strictly as-needed basis.
> 

To which I responded that a new architecture would have to retrofit and
then change callers like change_prot_none(), which is more churn than should
be necessary to add architecture support.

> It is *better* and cleaner to not expose random arch hooks but 
> let the core kernel modification be documented in the very patch 
> that the architecture support patch makes use of it.
> 

And yours requires that arches define pmd_pgprot so there are additional
hooks anyway.

That said, your approach just ends up being heavier. Take this simple
case of what we need for pte_numa.

+static inline pgprot_t vma_prot_none(struct vm_area_struct *vma)
+{
+       /*
+        * obtain PROT_NONE by removing READ|WRITE|EXEC privs
+        */
+       vm_flags_t vmflags = vma->vm_flags & ~(VM_READ|VM_WRITE|VM_EXEC);
+       return pgprot_modify(vma->vm_page_prot, vm_get_page_prot(vmflags));
+}

...

+static bool pte_numa(struct vm_area_struct *vma, pte_t pte)
+{
+       /*
+        * For NUMA page faults, we use PROT_NONE ptes in VMAs with
+        * "normal" vma->vm_page_prot protections.  Genuine PROT_NONE
+        * VMAs should never get here, because the fault handling code
+        * will notice that the VMA has no read or write permissions.
+        *
+        * This means we cannot get 'special' PROT_NONE faults from genuine
+        * PROT_NONE maps, nor from PROT_WRITE file maps that do dirty
+        * tracking.
+        *
+        * Neither case is really interesting for our current use though so we
+        * don't care.
+        */
+       if (pte_same(pte, pte_modify(pte, vma->vm_page_prot)))
+               return false;
+
+       return pte_same(pte, pte_modify(pte, vma_prot_none(vma)));
+}

pte_numa() requires a call to vma_prot_none(), which in turn requires a
function call to vm_get_page_prot().
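
As a rough userspace illustration of the shape of that check (not kernel
code), the sketch below models a pte as a plain integer whose low bits hold
the protections. Every name prefixed with sk_ is hypothetical, the 12-bit
protection mask is an assumption, and prot_none is stored precomputed in the
mock VMA, whereas the patch above derives it through vm_get_page_prot() on
each call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: the low 12 bits of a pte hold protections. */
#define SK_PROT_MASK ((uint64_t)0xfff)

typedef struct { uint64_t val; } sk_pte_t;
typedef struct { uint64_t prot; } sk_pgprot_t;

/* Mock VMA: only the two protection values the check needs. */
struct sk_vma {
	sk_pgprot_t page_prot;  /* the VMA's normal protections */
	sk_pgprot_t prot_none;  /* precomputed here; derived per call in the patch */
};

/* Replace the protection bits of a pte, keeping the remaining bits. */
static inline sk_pte_t sk_pte_modify(sk_pte_t pte, sk_pgprot_t prot)
{
	sk_pte_t out = { (pte.val & ~SK_PROT_MASK) | (prot.prot & SK_PROT_MASK) };
	return out;
}

static inline bool sk_pte_same(sk_pte_t a, sk_pte_t b)
{
	return a.val == b.val;
}

/* Same two-step comparison as the PROT_NONE based pte_numa() above. */
static bool sk_pte_numa_by_prot(const struct sk_vma *vma, sk_pte_t pte)
{
	/* pte already carries the normal protections: not a NUMA hinting pte */
	if (sk_pte_same(pte, sk_pte_modify(pte, vma->page_prot)))
		return false;

	/* otherwise it counts only if it carries exactly the PROT_NONE bits */
	return sk_pte_same(pte, sk_pte_modify(pte, vma->prot_none));
}
```

Note that the check needs the VMA as well as the pte, and that a VMA whose
normal protections already equal PROT_NONE can never report true here, which
matches the comment in the patch.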

This is the _PAGE_NUMA equivalent.

+__weak int pte_numa(pte_t pte)
+{
+       return (pte_flags(pte) &
+               (_PAGE_NUMA|_PAGE_PRESENT)) == _PAGE_NUMA;
+}

If that were made inline as Linus suggests, it would become one, maybe two
instructions.
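
For comparison, a minimal userspace sketch of the flag-based test written as
an inline function. The bit positions and the flag_-prefixed names are
hypothetical and chosen only for illustration; the real _PAGE_NUMA and
_PAGE_PRESENT values are architecture-specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions; real values are defined per architecture. */
#define FLAG_PAGE_PRESENT ((uint64_t)1 << 0)
#define FLAG_PAGE_NUMA    ((uint64_t)1 << 8)

typedef struct { uint64_t flags; } flag_pte_t;

/*
 * True only when the NUMA bit is set and the present bit is clear.
 * The test is a mask followed by a compare and needs no VMA.
 */
static inline bool flag_pte_numa(flag_pte_t pte)
{
	return (pte.flags & (FLAG_PAGE_NUMA | FLAG_PAGE_PRESENT)) == FLAG_PAGE_NUMA;
}
```

Unlike the PROT_NONE based check, this one takes only the pte, so callers do
not need to have the VMA at hand.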

-- 
Mel Gorman
SUSE Labs
