On Tue, Nov 20, 2012 at 05:09:18PM +0100, Ingo Molnar wrote:
>
> Ok, the patch withstood a bit more testing as well. Below is a
> v2 version of it, with a couple of cleanups (no functional
> changes).
>
> Thanks,
>
> 	Ingo
>
> ----------------->
> Subject: mm, numa: Turn 4K pte NUMA faults into effective hugepage ones
> From: Ingo Molnar <mingo@xxxxxxxxxx>
> Date: Tue Nov 20 15:48:26 CET 2012
>
> Reduce the 4K page fault count by looking around and processing
> nearby pages if possible.
>
> To keep the logic and cache overhead simple and straightforward
> we do a couple of simplifications:
>
>  - we only scan in the HPAGE_SIZE range of the faulting address
>  - we only go as far as the vma allows us
>
> Also simplify the do_numa_page() flow while at it and fix the
> previous double faulting we incurred due to not properly fixing
> up freshly migrated ptes.
>
> Suggested-by: Mel Gorman <mgorman@xxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> ---
>  mm/memory.c | 99 ++++++++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 64 insertions(+), 35 deletions(-)
>

This is functionally similar to what balancenuma does but there is one key
difference worth noting. I only mark the PMD pmd_numa if all the pages
pointed to by the updated[*] PTEs underneath are on the same node. The
intention is that if the workload is converged on a PMD boundary then a
migration of all the pages underneath will be remote->local copies. If the
workload is not converged on a PMD boundary and you handle all the faults
then you are potentially incurring remote->remote copies.

It also means that if the workload is not converged on the PMD boundary
then a PTE fault is just one page.
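
For illustration, the marking rule described above could be modelled
roughly as follows. This is a userspace sketch and not the balancenuma
code: the struct, the function name and the requirement that at least one
PTE was updated are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one PTE under a PMD; not a kernel type. */
struct sketch_pte {
	int  node;     /* NUMA node holding the page (assumed >= 0) */
	bool updated;  /* true if the current scan pass updated this PTE */
};

/*
 * Hypothetical helper: decide whether the PMD should be marked.
 * Returns true only if at least one PTE was updated (an assumption of
 * this sketch) and every updated PTE points to a page on the same node.
 * PTEs that were not updated are deliberately not checked.
 */
static bool sketch_should_mark_pmd(const struct sketch_pte *ptes, size_t n)
{
	int first_node = -1;

	for (size_t i = 0; i < n; i++) {
		if (!ptes[i].updated)
			continue;               /* only updated PTEs count */
		if (first_node < 0)
			first_node = ptes[i].node;
		else if (ptes[i].node != first_node)
			return false;           /* mixed nodes: leave PMD alone */
	}
	return first_node >= 0;
}
```

In this model a range whose updated PTEs all sit on one node is handled as
a single PMD-level fault, while a range with mixed nodes falls back to
per-PTE faults of one page each.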
With yours, it will be the full PMD every time, right?

[*] Note I said only the updated ptes are checked. I do not check every
    PTE underneath. I could but felt the benefit would be marginal.

-- 
Mel Gorman
SUSE Labs