Hi all,

I met a problem similar to the ones below while performing live-migration or save-restore tests on a KVM platform (qemu 1.4.0, host: SUSE 11 SP2, guest: SUSE 11 SP2), running a telecommunication software suite in the guest:
https://lists.gnu.org/archive/html/qemu-devel/2013-05/msg00098.html
http://comments.gmane.org/gmane.comp.emulators.kvm.devel/102506
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/100592
https://bugzilla.kernel.org/show_bug.cgi?id=58771

After live migration or "virsh restore [savefile]", one process's CPU utilization went up by about 30%, resulting in throughput degradation of this process.

oprofile report on this process in the guest, pre live migration:

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name         symbol name
248      12.3016  no-vmlinux       (no symbols)
78        3.8690  libc.so.6        memset
68        3.3730  libc.so.6        memcpy
30        1.4881  cscf.scu         SipMmBufMemAlloc
29        1.4385  libpthread.so.0  pthread_mutex_lock
26        1.2897  cscf.scu         SipApiGetNextIe
25        1.2401  cscf.scu         DBFI_DATA_Search
20        0.9921  libpthread.so.0  __pthread_mutex_unlock_usercnt
16        0.7937  cscf.scu         DLM_FreeSlice
16        0.7937  cscf.scu         receivemessage
15        0.7440  cscf.scu         SipSmCopyString
14        0.6944  cscf.scu         DLM_AllocSlice

post live migration:

CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        app name         symbol name
1586     42.2370  libc.so.6        memcpy
271       7.2170  no-vmlinux       (no symbols)
83        2.2104  libc.so.6        memset
41        1.0919  libpthread.so.0  __pthread_mutex_unlock_usercnt
35        0.9321  cscf.scu         SipMmBufMemAlloc
29        0.7723  cscf.scu         DLM_AllocSlice
28        0.7457  libpthread.so.0  pthread_mutex_lock
23        0.6125  cscf.scu         SipApiGetNextIe
17        0.4527  cscf.scu         SipSmCopyString
16        0.4261  cscf.scu         receivemessage
15        0.3995  cscf.scu         SipcMsgStatHandle
14        0.3728  cscf.scu         Urilex
12        0.3196  cscf.scu         DBFI_DATA_Search
12        0.3196  cscf.scu         SipDsmGetHdrBitValInner
12        0.3196  cscf.scu         SipSmGetDataFromRefString

So memcpy costs many more CPU cycles after live migration. When I restarted the process, the problem disappeared. Save-restore shows a similar problem.
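For reference, the profiles above are timer-interrupt mode oprofile output collected inside the guest; the exact invocation isn't shown in this mail, but a typical session would look roughly like the sketch below (the binary name cscf.scu is taken from the report, everything else is an assumption):

    # in the guest, as root; timer mode because the guest has no usable PMU
    opcontrol --no-vmlinux
    opcontrol --reset
    opcontrol --start
    # ... let the cscf.scu workload run for a while ...
    opcontrol --dump
    opreport -l        # per-symbol breakdown like the tables above
    opcontrol --shutdown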
perf stat output for the vcpu thread on the host, pre live migration:

 Performance counter stats for thread id '21082':

                  0  page-faults
                  0  minor-faults
                  0  major-faults
              31616  cs
                506  migrations
                  0  alignment-faults
                  0  emulation-faults
         5075957539  L1-dcache-loads                                          [21.32%]
          324685106  L1-dcache-load-misses     #  6.40% of all L1-dcache hits [21.85%]
         3681777120  L1-dcache-stores                                         [21.65%]
           65251823  L1-dcache-store-misses    #  1.77%                       [22.78%]
                  0  L1-dcache-prefetches                                     [22.84%]
                  0  L1-dcache-prefetch-misses                                [22.32%]
         9321652613  L1-icache-loads                                          [22.60%]
         1353418869  L1-icache-load-misses     # 14.52% of all L1-icache hits [21.92%]
          169126969  LLC-loads                                                [21.87%]
           12583605  LLC-load-misses           #  7.44% of all LL-cache hits  [ 5.84%]
          132853447  LLC-stores                                               [ 6.61%]
           10601171  LLC-store-misses          #  7.9%                        [ 5.01%]
           25309497  LLC-prefetches            # 30%                          [ 4.96%]
            7723198  LLC-prefetch-misses                                      [ 6.04%]
         4954075817  dTLB-loads                                               [11.56%]
           26753106  dTLB-load-misses          #  0.54% of all dTLB cache hits [16.80%]
         3553702874  dTLB-stores                                              [22.37%]
            4720313  dTLB-store-misses         #  0.13%                       [21.46%]
      <not counted>  dTLB-prefetches
      <not counted>  dTLB-prefetch-misses

       60.000920666 seconds time elapsed

post live migration:

 Performance counter stats for thread id '1579':

                  0  page-faults                                              [100.00%]
                  0  minor-faults                                             [100.00%]
                  0  major-faults                                             [100.00%]
              34979  cs                                                       [100.00%]
                441  migrations                                               [100.00%]
                  0  alignment-faults                                         [100.00%]
                  0  emulation-faults
         6903585501  L1-dcache-loads                                          [22.06%]
          525939560  L1-dcache-load-misses     #  7.62% of all L1-dcache hits [21.97%]
         5042552685  L1-dcache-stores                                         [22.20%]
           94493742  L1-dcache-store-misses    #  1.8%                        [22.06%]
                  0  L1-dcache-prefetches                                     [22.39%]
                  0  L1-dcache-prefetch-misses                                [22.47%]
        13022953030  L1-icache-loads                                          [22.25%]
         1957161101  L1-icache-load-misses     # 15.03% of all L1-icache hits [22.47%]
          348479792  LLC-loads                                                [22.27%]
           80662778  LLC-load-misses           # 23.15% of all LL-cache hits  [ 5.64%]
          198745620  LLC-stores                                               [ 5.63%]
           14236497  LLC-store-misses          #  7.1%                        [ 5.41%]
           20757435  LLC-prefetches                                           [ 5.42%]
            5361819  LLC-prefetch-misses       # 25%                          [ 5.69%]
         7235715124  dTLB-loads                                               [11.26%]
           49895163  dTLB-load-misses          #  0.69% of all dTLB cache hits [16.96%]
         5168276218  dTLB-stores                                              [22.44%]
            6765983  dTLB-store-misses         #  0.13%                       [22.24%]
      <not counted>  dTLB-prefetches
      <not counted>  dTLB-prefetch-misses

The LLC-load-miss rate went up by about 16 percentage points (from 7.44% to 23.15% of all LLC loads). Then I restarted the process in the guest, and the perf data went back to normal:

 Performance counter stats for thread id '1579':

                  0  page-faults                                              [100.00%]
                  0  minor-faults                                             [100.00%]
                  0  major-faults                                             [100.00%]
              30594  cs                                                       [100.00%]
                327  migrations                                               [100.00%]
                  0  alignment-faults                                         [100.00%]
                  0  emulation-faults
         7707091948  L1-dcache-loads                                          [22.10%]
          559829176  L1-dcache-load-misses     #  7.26% of all L1-dcache hits [22.28%]
         5976654983  L1-dcache-stores                                         [23.22%]
          160436114  L1-dcache-store-misses                                   [22.80%]
                  0  L1-dcache-prefetches                                     [22.51%]
                  0  L1-dcache-prefetch-misses                                [22.53%]
        13798415672  L1-icache-loads                                          [22.28%]
         2017724676  L1-icache-load-misses     # 14.62% of all L1-icache hits [22.49%]
          254598008  LLC-loads                                                [22.86%]
           16035378  LLC-load-misses           #  6.30% of all LL-cache hits  [ 5.36%]
          307019606  LLC-stores                                               [ 5.60%]
           13665033  LLC-store-misses                                         [ 5.43%]
           17715554  LLC-prefetches                                           [ 5.57%]
            4187006  LLC-prefetch-misses                                      [ 5.44%]
         7811502895  dTLB-loads                                               [10.72%]
           40547330  dTLB-load-misses          #  0.52% of all dTLB cache hits [16.31%]
         6144202516  dTLB-stores                                              [21.58%]
            6313363  dTLB-store-misses                                        [21.91%]
      <not counted>  dTLB-prefetches
      <not counted>  dTLB-prefetch-misses

       60.000812523 seconds time elapsed

If EPT is disabled, the problem goes away, so I suspect the KVM hypervisor is involved in this problem.
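For completeness, counter data like the above can be collected by attaching perf stat to the vcpu thread on the host for a fixed interval; a rough sketch (thread id taken from the first run above, the exact event list is an assumption):

    # on the host; 21082 is the vcpu thread id of the guest's qemu process
    perf stat -t 21082 \
        -e page-faults,minor-faults,major-faults,cs,migrations \
        -e L1-dcache-loads,L1-dcache-load-misses,L1-icache-loads,L1-icache-load-misses \
        -e LLC-loads,LLC-load-misses,dTLB-loads,dTLB-load-misses \
        sleep 60

And the EPT-disabled case above was obtained (as I understand it, assuming an Intel host) by reloading kvm_intel with EPT turned off while no guests were running:

    # on the host, with all guests shut down
    rmmod kvm_intel
    modprobe kvm_intel ept=0
    cat /sys/module/kvm_intel/parameters/ept    # should print N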
Based on the above suspicion, I want to find two adjacent versions of kvm-kmod, one of which triggers this problem and one which does not (e.g. 2.6.39 and 3.0-rc1), then analyze the differences between these two versions, or bisect the patches between them to find the key patches. Any better ideas?

Thanks,
Zhang Haoyu
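If it helps, the bisection could be driven directly with git on the kvm tree rather than diffing whole kvm-kmod releases; a rough sketch, assuming a reliable reproducer and that v2.6.39 is good and v3.0-rc1 is bad (both of which would need to be confirmed first):

    git bisect start
    git bisect bad v3.0-rc1      # first version assumed to show the regression
    git bisect good v2.6.39      # last version assumed to behave normally
    # at each step: rebuild kvm-kmod against the host kernel, reload
    # kvm/kvm-intel, rerun the live-migration test, then mark the result:
    git bisect good              # or "git bisect bad", depending on the result
    # repeat until git reports the first bad commit, then:
    git bisect reset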