Re: Transparent Hugepage impact on memcpy

On 2013/6/4 20:30, Wanpeng Li wrote:

> On Tue, Jun 04, 2013 at 04:57:57PM +0800, Jianguo Wu wrote:
>> Hi all,
>>
>> I tested memcpy with perf bench and found that, in the prefault case, memcpy performs worse
>> when Transparent Hugepage (THP) is on.
>>
>> With THP on, throughput is 3.672879 GB/Sec (with prefault); with THP off, it is 6.190187 GB/Sec (with prefault).
>>
> 
> I get similar results to yours against 3.10-rc4 (see the attachment). This
> is due to the fact that THP takes a single page fault for each
> 2MB virtual region touched by userland.
>

Hi Wanpeng,
Thanks for your reply:).

 

This test runs with prefault, so page fault time should not be counted in the result. And I would
expect fewer page faults to improve memcpy performance, right?
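
For reference, the number of first-touch faults one would expect can be estimated with a rough sketch like the one below. It assumes 4 KiB base pages, 2 MiB huge pages, and two 1 GiB buffers (source and destination) that are each faulted in once; the function name is only illustrative.

```python
# Rough estimate of the page faults needed to touch the memcpy buffers.
# Assumptions (not taken from perf itself): 4 KiB base pages, 2 MiB huge
# pages, and two 1 GiB buffers (source and destination).
GIB = 1 << 30
BASE_PAGE = 4 << 10
HUGE_PAGE = 2 << 20


def expected_faults(copy_len, page_size, buffers=2):
    """First-touch faults if every page of every buffer is faulted exactly once."""
    pages_per_buffer = -(-copy_len // page_size)  # ceiling division
    return buffers * pages_per_buffer


faults_4k = expected_faults(GIB, BASE_PAGE)
faults_2m = expected_faults(GIB, HUGE_PAGE)
print("4 KiB pages:", faults_4k)
print("2 MiB pages:", faults_2m)
```

Under those assumptions the estimate is 524288 faults with 4 KiB pages, close to the 524652 page-faults reported with THP off, and 1024 faults with 2 MiB pages. The 2409 page-faults reported with THP on are more than double that, and the difference is not explained by this estimate.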

The perf stat results show significantly fewer cache-references and cache-misses
when THP is off. Do you have any idea why?
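
To put numbers on that difference, here is a small sketch that computes the THP-on to THP-off ratios. The counter values are copied from the perf stat output quoted below; the dictionary and function names are only illustrative. Note that the bracketed percentages in that output mean the counters were multiplexed, so the values are scaled estimates.

```python
# Values copied from the quoted perf stat and perf bench output.
thp_on = {
    "cache-misses": 35455940,
    "cache-references": 66267785,
    "GB/Sec": 3.672879,
}
thp_off = {
    "cache-misses": 16920763,
    "cache-references": 17200000,
    "GB/Sec": 6.190187,
}


def on_off_ratio(key):
    """Ratio of the THP-on value to the THP-off value for one metric."""
    return thp_on[key] / thp_off[key]


for key in thp_on:
    print(f"{key}: THP on / THP off = {on_off_ratio(key):.2f}")
```

So with THP on there are roughly 3.9x the cache-references and 2.1x the cache-misses, while throughput drops to about 0.59x.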

Thanks,
Jianguo Wu.

>> I thought THP would improve performance, but the test results obviously show otherwise.
>> Andrea mentioned that THP makes "clear_page/copy_page less cache friendly" in
>> http://events.linuxfoundation.org/slides/2011/lfcs/lfcs2011_hpc_arcangeli.pdf.
>>
>> I do not quite understand this. Could you please give me some comments? Thanks!
>>
>> I tested on Linux-3.4-stable, and my machine info is:
>> Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
>>
>> available: 2 nodes (0-1)
>> node 0 cpus: 0 1 2 3 8 9 10 11
>> node 0 size: 24567 MB
>> node 0 free: 23550 MB
>> node 1 cpus: 4 5 6 7 12 13 14 15
>> node 1 size: 24576 MB
>> node 1 free: 23767 MB
>> node distances:
>> node   0   1 
>>  0:  10  20 
>>  1:  20  10
>>
>> Below is test result:
>> ---with THP---
>> #cat /sys/kernel/mm/transparent_hugepage/enabled
>> [always] madvise never
>> #./perf bench mem memcpy -l 1gb -o
>> # Running mem/memcpy benchmark...
>> # Copying 1gb Bytes ...
>>
>>       3.672879 GB/Sec (with prefault)
>>
>> #./perf stat ...
>> Performance counter stats for './perf bench mem memcpy -l 1gb -o':
>>
>>          35455940 cache-misses              #   53.504 % of all cache refs     [49.45%]
>>          66267785 cache-references                                             [49.78%]
>>              2409 page-faults                                                 
>>         450768651 dTLB-loads                                                   [50.78%]
>>             24580 dTLB-misses               #    0.01% of all dTLB cache hits  [51.01%]
>>        1338974202 dTLB-stores                                                  [50.63%]
>>             77943 dTLB-misses                                                  [50.24%]
>>         697404997 iTLB-loads                                                   [49.77%]
>>               274 iTLB-misses               #    0.00% of all iTLB cache hits  [49.30%]
>>
>>       0.855041819 seconds time elapsed
>>
>> ---no THP---
>> #cat /sys/kernel/mm/transparent_hugepage/enabled
>> always madvise [never]
>>
>> #./perf bench mem memcpy -l 1gb -o
>> # Running mem/memcpy benchmark...
>> # Copying 1gb Bytes ...
>>
>>       6.190187 GB/Sec (with prefault)
>>
>> #./perf stat ...
>> Performance counter stats for './perf bench mem memcpy -l 1gb -o':
>>
>>          16920763 cache-misses              #   98.377 % of all cache refs     [50.01%]
>>          17200000 cache-references                                             [50.04%]
>>            524652 page-faults                                                 
>>         734365659 dTLB-loads                                                   [50.04%]
>>           4986387 dTLB-misses               #    0.68% of all dTLB cache hits  [50.04%]
>>        1013408298 dTLB-stores                                                  [50.04%]
>>           8180817 dTLB-misses                                                  [49.97%]
>>        1526642351 iTLB-loads                                                   [50.41%]
>>                56 iTLB-misses               #    0.00% of all iTLB cache hits  [50.21%]
>>
>>       1.025425847 seconds time elapsed
>>
>> Thanks,
>> Jianguo Wu.
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>