On Thu, Jun 28, 2018 at 05:33:58PM +0800, Xiao Guangrong wrote: > > > On 06/19/2018 03:36 PM, Peter Xu wrote: > > On Mon, Jun 04, 2018 at 05:55:15PM +0800, guangrong.xiao@xxxxxxxxx wrote: > > > From: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx> > > > > > > Try to hold src_page_req_mutex only if the queue is not > > > empty > > > > Pure question: how much this patch would help? Basically if you are > > running compression tests then I think it means you are with precopy > > (since postcopy cannot work with compression yet), then here the lock > > has no contention at all. > > Yes, you are right, however we can observe it is in the top functions > (after revert this patch): > > Samples: 29K of event 'cycles', Event count (approx.): 22263412260 > + 7.99% kqemu qemu-system-x86_64 [.] ram_bytes_total > + 6.95% kqemu [kernel.kallsyms] [k] copy_user_enhanced_fast_string > + 6.23% kqemu qemu-system-x86_64 [.] qemu_put_qemu_file > + 6.20% kqemu qemu-system-x86_64 [.] qemu_event_set > + 5.80% kqemu qemu-system-x86_64 [.] __ring_put > + 4.82% kqemu qemu-system-x86_64 [.] compress_thread_data_done > + 4.11% kqemu qemu-system-x86_64 [.] ring_is_full > + 3.07% kqemu qemu-system-x86_64 [.] threads_submit_request_prepare > + 2.83% kqemu qemu-system-x86_64 [.] ring_mp_get > + 2.71% kqemu qemu-system-x86_64 [.] __ring_is_full > + 2.46% kqemu qemu-system-x86_64 [.] buffer_zero_sse2 > + 2.40% kqemu qemu-system-x86_64 [.] add_to_iovec > + 2.21% kqemu qemu-system-x86_64 [.] ring_get > + 1.96% kqemu [kernel.kallsyms] [k] __lock_acquire > + 1.90% kqemu libc-2.12.so [.] memcpy > + 1.55% kqemu qemu-system-x86_64 [.] ring_len > + 1.12% kqemu libpthread-2.12.so [.] pthread_mutex_unlock > + 1.11% kqemu qemu-system-x86_64 [.] ram_find_and_save_block > + 1.07% kqemu qemu-system-x86_64 [.] ram_save_host_page > + 1.04% kqemu qemu-system-x86_64 [.] qemu_put_buffer > + 0.97% kqemu qemu-system-x86_64 [.] compress_page_with_multi_thread > + 0.96% kqemu qemu-system-x86_64 [.] ram_save_target_page > + 0.93% kqemu libpthread-2.12.so [.] pthread_mutex_lock (sorry to respond late; I was busy with other stuff for the release...) I am trying to find out anything related to unqueue_page() but I failed. Did I miss anything obvious there? > > I guess its atomic operations cost CPU resource and check-before-lock is > a common tech, i think it shouldn't have side effect, right? :) Yeah it makes sense to me. :) Thanks, -- Peter Xu