On 2021/9/4 00:06, Sean Christopherson wrote:
> trace_get_page:
>
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 50ade6450ace..2ff123ec0d64 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -704,6 +704,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
>  		access = gw->pt_access[it.level - 2];
>  		sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr,
>  				      it.level-1, false, access);
> +		if (sp->unsync_children &&
> +		    mmu_sync_children(vcpu, sp, false))
> +			return RET_PF_RETRY;
>  	}
>
>  	/*
This is essentially my first (unsent) fix: just return RET_PF_RETRY when breaking out. But then I thought it would be better to retry the fetch directly, rather than retrying the guest, when the conditions are still valid/unchanged, so that the next guest page table walk and GUP() are avoided. The code does not yet check all the conditions, such as a pending interrupt event (we can add that too). I think it is a good design to allow breaking mmu_lock while the MMU is handling heavy work; a rough sketch of the retry-the-fetch idea follows.
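To make the idea concrete, here is a minimal sketch, not the actual patch. It assumes that passing true as the last argument allows mmu_sync_children() to drop mmu_lock and that a nonzero return value means it did so; the retry label is only illustrative, and is_obsolete_sp() stands in for a full "conditions unchanged" check (it does not cover e.g. pending interrupt events):

static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
			struct guest_walker *gw)
{
	...
retry:
	for (shadow_walk_init(&it, vcpu, fault->addr);
	     shadow_walk_okay(&it) && it.level > gw->level;
	     shadow_walk_next(&it)) {
		...
		sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr,
				      it.level-1, false, access);
		if (sp->unsync_children &&
		    mmu_sync_children(vcpu, sp, true)) {
			/*
			 * mmu_lock was dropped and reacquired while
			 * syncing the children.  If the shadow page is
			 * now obsolete, the fault must be replayed from
			 * the guest; otherwise the guest page table
			 * walk and GUP() results are still valid, so
			 * only the shadow-page fetch is redone.
			 */
			if (is_obsolete_sp(vcpu->kvm, sp))
				return RET_PF_RETRY;
			goto retry;
		}
	}
	...
}

This way the cost of a lock break is proportional to redoing the shadow walk rather than the whole page fault path, which is the point of allowing mmu_lock to be broken during heavy sync work.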
--