On 05/03/2012 04:23 PM, Xiao Guangrong wrote:
> On 05/03/2012 07:22 PM, Avi Kivity wrote:
>
> > Currently we flush the TLB while holding mmu_lock.  This
> > increases the lock hold time by the IPI round-trip time, increasing
> > contention, and makes dropping the lock (for latency reasons) harder.
> >
> > This patch changes TLB management to be usable locklessly, introducing
> > the following APIs:
> >
> >   kvm_mark_tlb_dirty() - mark the TLB as containing stale entries
> >   kvm_cond_flush_remote_tlbs() - flush the TLB if it was marked as
> >                                  dirty
> >
> > These APIs can be used without holding mmu_lock (though if the TLB
> > became stale due to shadow page table modifications, typically it
> > will need to be called with the lock held to prevent other threads
> > from seeing the modified page tables with the TLB unmarked and
> > unflushed).
> >
> > Signed-off-by: Avi Kivity <avi@xxxxxxxxxx>
> > ---
> >  Documentation/virtual/kvm/locking.txt |   14 ++++++++++++++
> >  arch/x86/kvm/paging_tmpl.h            |    4 ++--
> >  include/linux/kvm_host.h              |   22 +++++++++++++++++++++-
> >  virt/kvm/kvm_main.c                   |   29 ++++++++++++++++++++---------
> >  4 files changed, 57 insertions(+), 12 deletions(-)
> >
> > diff --git a/Documentation/virtual/kvm/locking.txt b/Documentation/virtual/kvm/locking.txt
> > index 3b4cd3b..f6c90479 100644
> > --- a/Documentation/virtual/kvm/locking.txt
> > +++ b/Documentation/virtual/kvm/locking.txt
> > @@ -23,3 +23,17 @@ Arch:          x86
> >  Protects:      - kvm_arch::{last_tsc_write,last_tsc_nsec,last_tsc_offset}
> >                 - tsc offset in vmcb
> >  Comment:       'raw' because updating the tsc offsets must not be preempted.
> > +
> > +3. TLB control
> > +--------------
> > +
> > +The following APIs should be used for TLB control:
> > +
> > + - kvm_mark_tlb_dirty() - indicates that the TLB is out of sync wrt
> > +     either guest or host page tables.
> > + - kvm_flush_remote_tlbs() - unconditionally flush the TLBs
> > + - kvm_cond_flush_remote_tlbs() - flush the TLBs if previously marked
> > +
> > +These may be used without mmu_lock, though kvm_mark_tlb_dirty() needs to be
> > +used while holding mmu_lock if it is called due to host page table changes
> > +(in contrast to guest page table changes).
>
> In these patches, it seems that kvm_mark_tlb_dirty is always used
> under the protection of mmu-lock, yes?

Correct.  It's possible we'll find a use outside mmu_lock, but this
isn't likely.

> If both kvm_mark_tlb_dirty and kvm_cond_flush_remote_tlbs are used
> outside of mmu-lock, I think we can use kvm_flush_remote_tlbs instead.
>
> If so, dirtied_count/flushed_count need not be atomic.

But we always mark with mmu_lock held.

> And it seems there is a bug:
>
>     VCPU 0                              VCPU 1
>
>     hold mmu-lock
>     zap spte which points to 'gfn'
>     mark_tlb_dirty
>     release mmu-lock
>                                         hold mmu-lock
>                                         rmap_write-protect:
>                                             see no spte pointing to gfn
>                                             tlb is not flushed
>                                         release mmu-lock
>
>                                         guest writes gfn
>                                         OOPS!!!
>
>     kvm_cond_flush_remote_tlbs
>
> We need to call kvm_cond_flush_remote_tlbs in rmap_write-protect
> unconditionally?

Yes, that's the point.  We change

    spin_lock(mmu_lock)
    conditionally flush the tlb
    spin_unlock(mmu_lock)

to

    spin_lock(mmu_lock)
    conditionally mark the tlb as dirty
    spin_unlock(mmu_lock)
    kvm_cond_flush_remote_tlbs()

but that means the entire codebase has to be converted.

-- 
error compiling committee.c: too many arguments to function
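
For reference, the mark/cond-flush scheme under discussion could be built
along the following lines.  This is a minimal sketch only: the counter names
dirtied_count/flushed_count come from the thread above, but the tlb_state
member of struct kvm, the function bodies, and the cmpxchg are illustrative
assumptions, not the code from the actual patch.

/*
 * Illustrative sketch: counter names taken from the discussion above;
 * the struct kvm member and everything else here is assumed, and the
 * memory barriers real code would need are elided for brevity.
 */
struct kvm_tlb_state {
        atomic_long_t dirtied_count;    /* bumped whenever TLBs go stale */
        atomic_long_t flushed_count;    /* dirtied_count value last flushed */
};

/* Called under mmu_lock when sptes are zapped or write-protected. */
static void kvm_mark_tlb_dirty(struct kvm *kvm)
{
        atomic_long_inc(&kvm->tlb_state.dirtied_count);
}

/* Safe to call without mmu_lock; sends IPIs only if marks are pending. */
static void kvm_cond_flush_remote_tlbs(struct kvm *kvm)
{
        long dirtied = atomic_long_read(&kvm->tlb_state.dirtied_count);
        long flushed = atomic_long_read(&kvm->tlb_state.flushed_count);

        if (dirtied == flushed)
                return;

        kvm_flush_remote_tlbs(kvm);     /* the existing unconditional flush */
        /* Never move flushed_count backwards if another flusher raced us. */
        atomic_long_cmpxchg(&kvm->tlb_state.flushed_count, flushed, dirtied);
}

Because dirtied_count is sampled before the flush, a mark that arrives while
the IPIs are in flight still looks pending afterwards, so the scheme stays
conservative.  It also shows why rmap_write-protect must call
kvm_cond_flush_remote_tlbs() unconditionally, as concluded above: another
vcpu may have marked the TLB dirty and dropped mmu_lock without flushing.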