1. Problem

During live migration, if the guest tries to take mmu_lock at the same
time as GET_DIRTY_LOG, which is called periodically by QEMU, it may be
forced to wait a long time; this is not restricted to page faults caused
by GET_DIRTY_LOG's write protection.

2. Measurement

- Server:
  Xeon: 8 cores (2 CPUs), 24GB memory

- One VM was being migrated locally to the opposite NUMA node:
  Source (active) VM:   bound to node 0
  Target (incoming) VM: bound to node 1

  This binding was for reducing extra noise.

- The guest inside it:
  3 VCPUs, 11GB memory

- Workload:
  On VCPU 2 and 3, there were 3 threads and each of them was endlessly
  writing to 3GB, in total 9GB, of anonymous memory at its maximum speed.

  I had checked that GET_DIRTY_LOG was forced to write protect more than
  2 million pages. So the 9GB memory was almost always kept dirty to be
  sent.

  In parallel, on VCPU 1, I checked memory write latency: how long it
  takes to write to one byte of each page in 1GB anonymous memory.

- Result:
  With the current KVM, I could see 1.5ms worst case latency: this
  corresponds well with the expected mmu_lock hold time.

  Here, you may think that this is too small compared to the numbers I
  reported before, using dirty-log-perf, but that was done on a 32-bit
  host on a Core i3 box which was much slower than server machines.

  Although having 10GB of dirty memory pages is a bit extreme for guests
  with less than 16GB memory, much larger guests, e.g. 128GB guests, may
  see latency longer than 1.5ms.

3. Solution

The time spent in GET_DIRTY_LOG is very limited compared to other work
in QEMU, so we should focus on alleviating the worst case latency first.

The solution is very simple and was originally suggested by Marcelo:
  "Conditionally reschedule when there is a contention."

With this rescheduling (see the following patch), the worst case latency
changed from 1.5ms to 800us for the same test.

4. TODO

The patch handles kvm_vm_ioctl_get_dirty_log() only, so the write
protection by kvm_mmu_slot_remove_write_access(), which is called when
we enable dirty page logging, can cause the same problem.

My plan is to replace it with rmap-based protection after this.

Thanks,
	Takuya

---
Takuya Yoshikawa (1):
  KVM: Reduce mmu_lock contention during dirty logging by cond_resched()

 arch/x86/include/asm/kvm_host.h |    6 +++---
 arch/x86/kvm/mmu.c              |   12 +++++++++---
 arch/x86/kvm/x86.c              |   22 +++++++++++++++++-----
 3 files changed, 29 insertions(+), 11 deletions(-)

-- 
1.7.5.4
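
For readers who want to see the idea from section 3 ("conditionally
reschedule when there is a contention") in isolation, here is a minimal
user-space model in C. It is only a sketch under stated assumptions and
is not the patch: struct model_lock, protect_pages() and the batch
parameter are hypothetical names made up for this illustration, and the
"lock" is a plain counter rather than a real spinlock. In the kernel,
the corresponding primitives would be ones such as spin_needbreak() and
cond_resched_lock().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock; it only tracks what the model needs. */
struct model_lock {
    int held;            /* 1 while the worker holds the lock */
    int waiters;         /* number of other parties waiting for the lock */
    size_t breaks;       /* how many times the worker dropped the lock */
    size_t longest_hold; /* longest run of pages processed under one hold */
};

void model_lock_acquire(struct model_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/*
 * Process npages pages under the lock. After every `batch` pages, check
 * for contention; if someone is waiting, drop the lock so the waiter
 * can run, then take it again and continue where we left off.
 * A batch of 0 disables the check (the "hold it for the whole range"
 * behaviour).
 */
void protect_pages(struct model_lock *l, size_t npages, size_t batch)
{
    size_t run = 0; /* pages processed since the lock was last taken */

    model_lock_acquire(l);
    for (size_t i = 0; i < npages; i++) {
        /* ... write protect page i here ... */
        run++;
        if (run > l->longest_hold)
            l->longest_hold = run;

        /* Check point: only break if work remains and someone waits. */
        if (batch && (i + 1) % batch == 0 && i + 1 < npages &&
            l->waiters > 0) {
            model_lock_release(l); /* the waiter gets its chance here */
            l->breaks++;
            run = 0;
            model_lock_acquire(l);
        }
    }
    model_lock_release(l);
}
```

In this model, with 1000 pages and a waiter always present, a batch of
64 caps the longest uninterrupted hold at 64 pages, while disabling the
check leaves the lock held for all 1000 pages. The model says nothing
about real timings; it only shows why bounding the work done per lock
hold bounds the worst case wait of a contending VCPU.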