On 2/20/20 9:34 AM, Ben Gardon wrote:
> FWIW, we currently do this eager splitting at Google for live migration. When the log-dirty-memory flag is set on a memslot we eagerly split all pages in the slot down to 4k granularity. As Jay said, this does not cause crippling lock contention because the vCPU page faults generated by write protection / splitting can be resolved in the fast page fault path without acquiring the MMU lock. I believe +Junaid Shahid tried to upstream this approach at some point in the past, but the patch set didn't make it in. (This was before my time, so I'm hoping he has a link.) I haven't done the analysis to know if eager splitting is more or less efficient with parallel slow-path page faults, but it's definitely faster under the MMU lock.
I am not sure if we ever posted those patches upstream. Peter Feiner would know for sure.

One notable difference in what we do compared to the approach outlined by Jay is that we don't rely on tdp_page_fault() to do the splitting. So we don't have to create a dummy VCPU and the specialized split function is also much faster.

Thanks,
Junaid
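
For readers following the thread, here is a rough toy model of the idea discussed above: splitting huge mappings eagerly when dirty logging is enabled, so that later write faults only need to flip a bit on an existing 4K entry. This is not KVM code. The class and method names (ToyMemslot, enable_dirty_logging, write_fault) and the lock counter are hypothetical, and the "fast" and "slow" paths are simplified stand-ins for the behavior described in the messages.

```python
from dataclasses import dataclass, field

PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024
PTES_PER_HUGE = PAGE_2M // PAGE_4K  # 512 small pages per 2M mapping


@dataclass
class Mapping:
    gfn: int               # first guest frame number covered
    npages: int            # 512 for a 2M mapping, 1 for a 4K mapping
    writable: bool = True


@dataclass
class ToyMemslot:
    """Hypothetical stand-in for a memslot; not a KVM structure."""
    mappings: dict = field(default_factory=dict)  # base gfn -> Mapping
    dirty: set = field(default_factory=set)       # gfns written since logging began
    mmu_lock_acquisitions: int = 0                # models MMU lock usage

    def map_huge(self, base_gfn):
        self.mappings[base_gfn] = Mapping(base_gfn, PTES_PER_HUGE)

    def _split(self, m):
        # Replace one mapping with write-protected 4K mappings.
        return {m.gfn + i: Mapping(m.gfn + i, 1, writable=False)
                for i in range(m.npages)}

    def enable_dirty_logging(self):
        # Eager approach: split everything once, in a single locked pass.
        self.mmu_lock_acquisitions += 1
        split = {}
        for m in self.mappings.values():
            split.update(self._split(m))
        self.mappings = split

    def write_fault(self, gfn):
        m = self.mappings.get(gfn)
        if m is not None and m.npages == 1:
            # Fast path in this model: the 4K entry already exists, so only
            # the writable bit and the dirty record change; no lock is taken.
            m.writable = True
            self.dirty.add(gfn)
            return "fast"
        # Slow path in this model: the covering huge mapping must be split
        # at fault time, which takes the lock.
        base = gfn - gfn % PTES_PER_HUGE
        huge = self.mappings.pop(base)  # KeyError if gfn is not mapped at all
        self.mmu_lock_acquisitions += 1
        self.mappings.update(self._split(huge))
        self.mappings[gfn].writable = True
        self.dirty.add(gfn)
        return "slow"
```

In this model, a slot that calls enable_dirty_logging() first resolves every later write through the fast branch, while a slot that skips it pays one lock acquisition per huge mapping at fault time. The model says nothing about real-world cost, which depends on the analysis mentioned in the quoted message.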