Avi Kivity <avi@xxxxxxxxxx> wrote:

> That's true.  But some applications do require low latency, and the
> current code can impose a lot of time with the mmu spinlock held.
>
> The total amount of work actually increases slightly, from O(N) to O(N
> log N), but since the tree is so wide, the overhead is small.
>

The latency can be controlled by having user space limit the number of
dirty pages to scan, without hacking the core mmu code.

	The fact that we cannot transfer that many pages over the network
	at once suggests this is reasonable.

With the rmap write protection method in KVM, the only thing we need is
a new GET_DIRTY_LOG api which takes the [gfn_start, gfn_end] range to
scan, or optionally max_write_protections.

	I remember that someone suggested splitting the slot at KVM Forum.
	This has the same effect with less effort.

QEMU can also avoid unwanted page faults by using this api wisely.

	E.g. you could use this for the "Interactivity improvements" TODO
	on the KVM wiki, I think.

Furthermore, QEMU may be able to use multiple threads for the memory
copy task.

	Each thread has its own range of memory to copy, and does
	GET_DIRTY_LOG independently.  This would make it easier to add
	further optimizations in QEMU.

In summary, my impression is that the main cause of the current latency
problem is not KVM's write protection but the strategy of trying to
handle the whole large slot in one go.

What do you think?

	Takuya
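
To make the ranged GET_DIRTY_LOG idea above concrete, here is a minimal
sketch written as a user-space toy model, not kernel code.  The struct
name, its layout, and the function get_dirty_log_range() are all
hypothetical and are not an existing KVM interface.  The sketch assumes
a one-bit-per-page dirty bitmap, treats max_write_protections == 0 as
"no limit", and stands in for write protection by simply clearing the
bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout; NOT an existing KVM ioctl structure. */
struct dirty_log_range {
	uint64_t gfn_start;             /* first gfn to scan (inclusive) */
	uint64_t gfn_end;               /* last gfn to scan (inclusive) */
	uint64_t max_write_protections; /* assumption: 0 means no limit */
};

/*
 * Toy model of a ranged dirty-log fetch over a one-bit-per-page bitmap.
 *
 * Every dirty page found in [gfn_start, gfn_end] is stored in out[] and
 * its bit is cleared, which stands in for write-protecting the page
 * again.  The scan stops early once max_write_protections pages have
 * been handled.  Returns the number of pages stored in out[]; *next_gfn
 * is the gfn at which a later call should resume.
 */
static size_t get_dirty_log_range(uint64_t *bitmap,
				  const struct dirty_log_range *req,
				  uint64_t *out, uint64_t *next_gfn)
{
	size_t found = 0;
	uint64_t gfn;

	/* An inclusive end of UINT64_MAX would wrap the loop counter. */
	assert(req->gfn_start <= req->gfn_end && req->gfn_end < UINT64_MAX);

	for (gfn = req->gfn_start; gfn <= req->gfn_end; gfn++) {
		uint64_t mask = UINT64_C(1) << (gfn % 64);

		if (!(bitmap[gfn / 64] & mask))
			continue;	/* clean page, nothing to do */
		if (req->max_write_protections &&
		    found == req->max_write_protections)
			break;		/* budget used up, resume here later */

		bitmap[gfn / 64] &= ~mask;
		out[found++] = gfn;
	}

	*next_gfn = gfn;
	return found;
}
```

A caller would loop, setting gfn_start to the returned next_gfn each
time, until next_gfn passes gfn_end.  With separate bitmaps or
non-overlapping ranges, several threads could each run such a loop over
their own part of the slot, which is the multi-threaded copy case
mentioned above.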