On Thu, Nov 29, 2012 at 07:16:50AM +0800, Xiao Guangrong wrote:
> On 11/29/2012 06:40 AM, Xiao Guangrong wrote:
> > On 11/29/2012 05:57 AM, Marcelo Tosatti wrote:
> >> On Wed, Nov 28, 2012 at 10:59:35PM +0800, Xiao Guangrong wrote:
> >>> On 11/28/2012 10:12 PM, Gleb Natapov wrote:
> >>>> On Tue, Nov 27, 2012 at 11:30:24AM +0800, Xiao Guangrong wrote:
> >>>>> On 11/27/2012 06:41 AM, Marcelo Tosatti wrote:
> >>>>>
> >>>>>>>
> >>>>>>> -	return false;
> >>>>>>> +again:
> >>>>>>> +	page_fault_count = ACCESS_ONCE(vcpu->kvm->arch.page_fault_count);
> >>>>>>> +
> >>>>>>> +	/*
> >>>>>>> +	 * If emulation was due to access to a shadowed page table
> >>>>>>> +	 * and it failed, try to unshadow the page and re-enter the
> >>>>>>> +	 * guest to let the CPU execute the instruction.
> >>>>>>> +	 */
> >>>>>>> +	kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa));
> >>>>>>> +	emulate = vcpu->arch.mmu.page_fault(vcpu, cr3, PFERR_WRITE_MASK, false);
> >>>>>>
> >>>>>> Can you explain what is the objective here?
> >>>>>>
> >>>>>
> >>>>> Sure. :)
> >>>>>
> >>>>> The instruction emulation is caused by a faulting access on cr3. After
> >>>>> unprotecting the target page, we call vcpu->arch.mmu.page_fault to fix
> >>>>> the mapping of cr3. If it returns 1, the mmu cannot fix the mapping and
> >>>>> we should report the error; otherwise it is safe to return to the guest
> >>>>> and let it re-execute the instruction.
> >>>>>
> >>>>> page_fault_count is used to avoid a race with other vcpus: after we
> >>>>> unprotect the target page, another cpu can enter the page fault path
> >>>>> and make the page write-protected again.
> >>>>>
> >>>>> This way we can detect every case where the mmu cannot fix the mapping.
> >>>>>
> >>>> Can you write this in a comment above vcpu->arch.mmu.page_fault()?
> >>>
> >>> Okay, if Marcelo does not object to this approach. :)
> >>
> >> I do object, since it is possible to detect the condition precisely by
> >> storing which gfns have been cached.
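As a standalone sketch of the retry-with-generation-counter idea described above (this is a simplified, hypothetical simulation, not the real KVM code: names such as struct kvm_sim, unprotect_page_sim and mmu_page_fault_sim are illustrative, and the `sticky` flag merely models a page the mmu can never unprotect):

```c
#include <stdbool.h>
#include <assert.h>

/* Hypothetical stand-in for the per-VM state touched by the patch above. */
struct kvm_sim {
	unsigned long page_fault_count;	/* bumped on every guest page fault */
	bool page_write_protected;	/* state of the faulting gfn */
	bool sticky;			/* models a page that cannot be unprotected */
};

static void unprotect_page_sim(struct kvm_sim *kvm)
{
	if (!kvm->sticky)
		kvm->page_write_protected = false;
}

/* Returns 1 when the mmu cannot fix the mapping (page still protected). */
static int mmu_page_fault_sim(struct kvm_sim *kvm)
{
	return kvm->page_write_protected ? 1 : 0;
}

/*
 * Returns true when it is safe to re-enter the guest, false when the
 * failure is definitive.  Mirrors the "again:" loop from the quoted patch:
 * snapshot the generation counter, unprotect, retry the page fault, and
 * only declare failure if no other vcpu faulted in the meantime.
 */
static bool reexecute_instruction_sim(struct kvm_sim *kvm)
{
	unsigned long page_fault_count;

again:
	page_fault_count = kvm->page_fault_count;	/* ACCESS_ONCE() upstream */

	unprotect_page_sim(kvm);
	if (!mmu_page_fault_sim(kvm))
		return true;	/* mapping fixed; let the guest retry */

	/*
	 * The mapping could not be fixed.  If the counter is unchanged, no
	 * racing vcpu re-protected the page, so the failure is real;
	 * otherwise retry once more from the top.
	 */
	if (page_fault_count == kvm->page_fault_count)
		return false;
	goto again;
}
```

The point of the snapshot is that a "cannot fix" result is only trusted when no concurrent page fault could have re-write-protected the page between the unprotect and the retry.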
> >>
> >> Then, Xiao, you need a way to handle large read-only sptes.
> >
> > Sorry, Marcelo, I am still confused about why read-only sptes cannot
> > work under this patch.
> >
> > The code after a read-only large spte is introduced is:
> >
> > +	if ((level > PT_PAGE_TABLE_LEVEL &&
> > +	    has_wrprotected_page(vcpu->kvm, gfn, level)) ||
> > +	    mmu_need_write_protect(vcpu, gfn, can_unsync)) {
> > 		pgprintk("%s: found shadow page for %llx, marking ro\n",
> > 			 __func__, gfn);
> > 		ret = 1;
> >
> > It returns 1, and then reexecute_instruction returns 0. This is the same
> > as without the readonly large spte.
>
> Ah, wait, there is a case: the large page is located at 0-2M, the 0-4K
> range is used as a page table (e.g. a PDE), and the guest wants to write
> the memory located at 5K, which should be freely writable. This patch can
> return 0 both for the current code and for a readonly large spte.

Yes, we should remove the read-only large spte if any write to 0-2M fails
(I said 'unshadow' in the previous email, but the correct action is 'remove
the large spte in range').

> I need to think about it more.
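The range check implied by that case can be sketched as follows (a hypothetical illustration, not the real KVM helpers: write_needs_large_spte_removal, large_base_gfn and large_npages are made-up names, and the x86 constants are the usual 4K pages with 512 entries per level):

```c
#include <stdbool.h>
#include <assert.h>

#define PAGE_SHIFT	12
#define PT_LEVEL_BITS	9	/* 512 entries per paging level on x86 */

typedef unsigned long long gpa_t;
typedef unsigned long long gfn_t;

static gfn_t gpa_to_gfn_sim(gpa_t gpa)
{
	return gpa >> PAGE_SHIFT;
}

/* Number of 4K frames covered by a large mapping at @level (level 2 = 2M). */
static gfn_t large_npages(int level)
{
	return (gfn_t)1 << ((level - 1) * PT_LEVEL_BITS);
}

/* First gfn covered by the large mapping that contains @gfn at @level. */
static gfn_t large_base_gfn(gfn_t gfn, int level)
{
	return gfn & ~(large_npages(level) - 1);
}

/*
 * The problematic case above: a write lands inside a read-only large
 * mapping whose range also contains a write-protected page table (pt_gfn),
 * but the write does NOT touch the page table itself.  Such a write should
 * be allowed to succeed, so the large spte must be removed (zapped) and the
 * range remapped with 4K sptes, rather than the write being treated as an
 * emulation failure.
 */
static bool write_needs_large_spte_removal(gpa_t write_gpa, gfn_t pt_gfn,
					   int level)
{
	gfn_t gfn = gpa_to_gfn_sim(write_gpa);
	gfn_t base = large_base_gfn(gfn, level);

	return gfn != pt_gfn &&
	       pt_gfn >= base && pt_gfn < base + large_npages(level);
}
```

With the example from the mail: a write at 5K falls in gfn 1, the page table occupies gfn 0, and both sit inside the same 2M (level 2) range, so the large spte should be dropped instead of failing the write.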