Christopher Lameter <cl@xxxxxxxxx> writes:

> On Mon, 7 Aug 2017, Huang, Ying wrote:
>
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4374,9 +4374,31 @@ void clear_huge_page(struct page *page,
>>  	}
>>  
>>  	might_sleep();
>> -	for (i = 0; i < pages_per_huge_page; i++) {
>> +	VM_BUG_ON(clamp(addr_hint, addr, addr +
>> +			(pages_per_huge_page << PAGE_SHIFT)) != addr_hint);
>> +	n = (addr_hint - addr) / PAGE_SIZE;
>> +	if (2 * n <= pages_per_huge_page) {
>> +		base = 0;
>> +		l = n;
>> +		for (i = pages_per_huge_page - 1; i >= 2 * n; i--) {
>> +			cond_resched();
>> +			clear_user_highpage(page + i, addr + i * PAGE_SIZE);
>> +		}
>
> I really like the idea behind the patch but this is not clearing from last
> to first byte of the huge page.
>
> What seems to be happening here is clearing from the last page to the
> first page and I would think that within each page the clearing is from
> first byte to last byte. Maybe more gains can be had by really clearing
> from last to first byte of the huge page instead of this jumping over 4k
> addresses?

I changed the code to use clear_page_orig() and made it clear each page
from the last byte to the first. The patch is below. With it, there are
no visible changes in the benchmark results, but the cache miss rate
dropped a little, from 27.64% to 26.70%. (These cache miss rates are not
directly comparable with the earlier numbers, because a different
clear_page() implementation is used.) I think this is because the page
size is relatively small compared with the cache size, so the effect is
almost invisible.

Best Regards,
Huang, Ying

--------------->8----------------
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index b4a0d43248cf..01d201afde92 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -42,8 +42,8 @@ void clear_page_erms(void *page);
 static inline void clear_page(void *page)
 {
 	alternative_call_2(clear_page_orig,
-			   clear_page_rep, X86_FEATURE_REP_GOOD,
-			   clear_page_erms, X86_FEATURE_ERMS,
+			   clear_page_orig, X86_FEATURE_REP_GOOD,
+			   clear_page_orig, X86_FEATURE_ERMS,
 			   "=D" (page),
 			   "0" (page)
 			   : "memory", "rax", "rcx");
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index 81b1635d67de..23e6238e625d 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -25,19 +25,20 @@ EXPORT_SYMBOL_GPL(clear_page_rep)
 ENTRY(clear_page_orig)
 	xorl	%eax,%eax
 	movl	$4096/64,%ecx
+	addq	$4096-64,%rdi
 	.p2align 4
 .Lloop:
 	decl	%ecx
 #define PUT(x) movq %rax,x*8(%rdi)
-	movq %rax,(%rdi)
-	PUT(1)
-	PUT(2)
-	PUT(3)
-	PUT(4)
-	PUT(5)
-	PUT(6)
 	PUT(7)
-	leaq	64(%rdi),%rdi
+	PUT(6)
+	PUT(5)
+	PUT(4)
+	PUT(3)
+	PUT(2)
+	PUT(1)
+	movq %rax,(%rdi)
+	leaq	-64(%rdi),%rdi
 	jnz	.Lloop
 	nop
 	ret
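
P.S. For anyone who wants to play with the ordering in userspace, here
is a minimal C model of the reversed clearing order that the patched
clear_page_orig implements in assembly. It is only an illustrative
sketch: clear_page_backward(), CHUNK, and the test harness are made-up
names for this example and are not kernel code.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define CHUNK     64	/* clear_page_orig works in 64-byte blocks */

/*
 * Clear one page from its last byte toward its first, mirroring the
 * patched clear_page_orig: the block pointer starts at the end of the
 * page and walks down, and within each 64-byte block the eight 8-byte
 * stores run from the highest offset (PUT(7)) to the lowest.
 */
static void clear_page_backward(void *page)
{
	char *p = (char *)page + PAGE_SIZE - CHUNK;
	int block, i;

	for (block = 0; block < PAGE_SIZE / CHUNK; block++) {
		uint64_t *q = (uint64_t *)p;

		for (i = CHUNK / 8 - 1; i >= 0; i--)
			q[i] = 0;	/* PUT(7) ... PUT(0) */
		p -= CHUNK;		/* leaq -64(%rdi),%rdi */
	}
}

int main(void)
{
	static uint64_t buf[PAGE_SIZE / sizeof(uint64_t)];
	const unsigned char *b = (const unsigned char *)buf;
	size_t i;

	memset(buf, 0xa5, sizeof(buf));
	clear_page_backward(buf);
	for (i = 0; i < sizeof(buf); i++)
		if (b[i] != 0)
			return 1;
	puts("page cleared last-to-first");
	return 0;
}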