On 07/11/2018 04:21 PM, Peter Xu wrote:
On Thu, Jun 28, 2018 at 05:33:58PM +0800, Xiao Guangrong wrote:
On 06/19/2018 03:36 PM, Peter Xu wrote:
On Mon, Jun 04, 2018 at 05:55:15PM +0800, guangrong.xiao@xxxxxxxxx wrote:
From: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxx>
Try to hold src_page_req_mutex only if the queue is not empty.
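The idea of the patch can be sketched roughly as follows. This is an illustrative example only, not the actual QEMU code: the type and function names (RequestQueue, unqueue_request, queue_request) are hypothetical, and a plain atomic length counter stands in for whatever lockless emptiness check the real patch uses. The point is the fast path: the consumer peeks at the queue length without taking the mutex, and only locks when there is plausibly something to dequeue. A stale read is harmless, since a request enqueued just after the check is picked up on the next call.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical page-request queue; names are illustrative, not QEMU's. */
typedef struct PageRequest {
    struct PageRequest *next;
    unsigned long offset;
} PageRequest;

typedef struct {
    pthread_mutex_t mutex;
    PageRequest *head;
    atomic_int length;      /* written under the mutex, read without it */
} RequestQueue;

/*
 * Fast path: peek at the length without taking the mutex.  In the
 * common precopy case the queue is almost always empty, so this
 * avoids paying the lock/unlock cost on every iteration of the hot
 * migration loop.
 */
PageRequest *unqueue_request(RequestQueue *q)
{
    PageRequest *req = NULL;

    if (atomic_load(&q->length) == 0) {
        return NULL;            /* common case: no lock taken */
    }

    pthread_mutex_lock(&q->mutex);
    if (q->head) {
        req = q->head;
        q->head = req->next;
        atomic_fetch_sub(&q->length, 1);
    }
    pthread_mutex_unlock(&q->mutex);
    return req;
}

void queue_request(RequestQueue *q, PageRequest *req)
{
    pthread_mutex_lock(&q->mutex);
    req->next = q->head;
    q->head = req;
    atomic_fetch_add(&q->length, 1);
    pthread_mutex_unlock(&q->mutex);
}
```

The design trade-off is that the unlocked check can momentarily miss a concurrent enqueue, which is acceptable here because the consumer polls the queue repeatedly anyway.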
Pure question: how much does this patch help? If you are running compression tests, then I think you must be using precopy (since postcopy cannot work with compression yet), and in that case the lock has no contention at all.
Yes, you are right; however, the lock still shows up among the top functions in the profile (taken after reverting this patch):
Samples: 29K of event 'cycles', Event count (approx.): 22263412260
+ 7.99% kqemu qemu-system-x86_64 [.] ram_bytes_total
+ 6.95% kqemu [kernel.kallsyms] [k] copy_user_enhanced_fast_string
+ 6.23% kqemu qemu-system-x86_64 [.] qemu_put_qemu_file
+ 6.20% kqemu qemu-system-x86_64 [.] qemu_event_set
+ 5.80% kqemu qemu-system-x86_64 [.] __ring_put
+ 4.82% kqemu qemu-system-x86_64 [.] compress_thread_data_done
+ 4.11% kqemu qemu-system-x86_64 [.] ring_is_full
+ 3.07% kqemu qemu-system-x86_64 [.] threads_submit_request_prepare
+ 2.83% kqemu qemu-system-x86_64 [.] ring_mp_get
+ 2.71% kqemu qemu-system-x86_64 [.] __ring_is_full
+ 2.46% kqemu qemu-system-x86_64 [.] buffer_zero_sse2
+ 2.40% kqemu qemu-system-x86_64 [.] add_to_iovec
+ 2.21% kqemu qemu-system-x86_64 [.] ring_get
+ 1.96% kqemu [kernel.kallsyms] [k] __lock_acquire
+ 1.90% kqemu libc-2.12.so [.] memcpy
+ 1.55% kqemu qemu-system-x86_64 [.] ring_len
+ 1.12% kqemu libpthread-2.12.so [.] pthread_mutex_unlock
+ 1.11% kqemu qemu-system-x86_64 [.] ram_find_and_save_block
+ 1.07% kqemu qemu-system-x86_64 [.] ram_save_host_page
+ 1.04% kqemu qemu-system-x86_64 [.] qemu_put_buffer
+ 0.97% kqemu qemu-system-x86_64 [.] compress_page_with_multi_thread
+ 0.96% kqemu qemu-system-x86_64 [.] ram_save_target_page
+ 0.93% kqemu libpthread-2.12.so [.] pthread_mutex_lock
(sorry to respond late; I was busy with other stuff for the
release...)
You're welcome. :)
I tried to find anything related to unqueue_page() in the profile but failed. Did I miss something obvious there?
Indeed, unqueue_page() is not listed here. I think the function itself is light enough (a check, then an immediate return) that it did not leave a trace in the profile.
This perf data was collected after reverting this patch, i.e., it is based on the lockless multithread model, so unqueue_page() is the only place in the main thread that uses a mutex.
And as you can see in the mail I sent in reply to Dave, the mutex overhead was gone after applying this patch.