On 05/24/2012 04:25 PM, Avi Kivity wrote: > On 05/24/2012 09:26 AM, Xiao Guangrong wrote: >> On 05/23/2012 07:34 PM, Avi Kivity wrote: >> >> >>>> static bool spte_has_volatile_bits(u64 spte) >>>> { >>>> + /* >>>> + * Always atomicly update spte if it can be updated >>>> + * out of mmu-lock. >>>> + */ >>>> + if (spte_can_lockless_update(spte)) >>>> + return true; >>> >>> >>> This is a really subtle point, but is it really needed? >>> >>> Lockless spte updates should always set the dirty and accessed bits, so >>> we won't be overwriting any volatile bits there. >>> >> >> >> Avi, >> >> Currently, The spte update/clear paths in mmu-lock think the "Dirty bit" is >> not volatile if the spte is readonly. Then the "Dirty bit" caused by >> lockless update can be lost. >> > > Maybe it's better to change that. In fact, changing > > if ((spte & shadow_accessed_mask) && > (!is_writable_pte(spte) || (spte & shadow_dirty_mask))) > return false; > > to > > if (~spte & (shadow_accessed_mask | shadow_dirty_mask)) > return false; > Okay, i like this way. > is almost the same thing - we miss the case where the page is COW or > shadowed though. > > If we release the page as dirty, as below, perhaps the whole thing > doesn't matter; the mm must drop spte.w (or spte.d) before it needs to > access spte.d again. > > >> And, for tlb flush: >> >> | * If we overwrite a writable spte with a read-only one we >> | * should flush remote TLBs. Otherwise rmap_write_protect >> | * will find a read-only spte, even though the writable spte >> | * might be cached on a CPU's TLB. >> | */ >> | if (is_writable_pte(entry) && !is_writable_pte(*sptep)) >> | kvm_flush_remote_tlbs(vcpu->kvm); >> >> Atomically update spte can help us to get a stable is_writable_pte(). > > Why is it unstable? mmu_set_spte() before cleared SPTE_MMU_WRITEABLE, so > the lockless path will keep its hands off *spte. > Since dirty-log path does not clear SPTE_MMU_WRITEABLE bit, so in the system, we have this kind of spte which is readonly but SPTE_MMU_WRITEABLE is set. In mmu_set_spte(), we may read a read-only spte (which indicates the TLb need not be flushed), but it can be marked writeable by fast page fault, then the TLB is dirty. If you do not like the way in this patch, we can change it to: if (spte_can_be_wriable(entry) && !is_writable_pte(*sptep)) kvm_flush_remote_tlbs(vcpu->kvm); And actually, this kind of TLB flush can be delay until page table protection happen, we can simply mask tlb dirty after your patchset of "flush tlb out of mmu lock" -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html