On Tue, May 08, 2012 at 03:39:43PM +0300, Avi Kivity wrote: > On 05/08/2012 05:25 AM, Marcelo Tosatti wrote: > > > > > > > > We need call kvm_cond_flush_remote_tlbs in rmap_write-protect > > > > unconditionally? > > > > > > Yes, that's the point. We change > > > > > > spin_lock(mmu_lock) > > > conditionally flush the tlb > > > spin_unlock(mmu_lock) > > > > > > to > > > > > > spin_lock(mmu_lock) > > > conditionally mark the tlb as dirty > > > spin_unlock(mmu_lock) > > > kvm_cond_flush_remote_tlbs() > > > > > > but that means the entire codebase has to be converted. > > > > Is there any other site which expects sptes and TLBs to be in sync, > > other than rmap_write_protect? > > I wouldn't say rmap_write_protect() expects sptes and TLBs to be in > sync. Rather its callers. > > > Please convert the flush_remote_tlbs at the end of set_spte (RW->RO) to > > mark_dirty. > > I'd like to take an incremental approach, since there are many paths. I > don't have a concrete plan though. > > > Looks good in general (patchset is incomplete though). One thing that > > is annoying is that there is no guarantee of progress for flushed_count > > increment (it can, in theory, always race with a mark_dirty). But since > > that would be only a performance, and not correctness, aspect, it is > > fine. > > We don't have a while () look in cond_flush(), so it won't be slowed > down by the race; whoever caused the race will have to flush on their > own, but they do that anyway now. > > We can regress if vcpu 0 marks the tlb as dirty, and then vcpu 0 and 1 > simultaneously flush, even though vcpu 1 did nothing to deserve it. I > don't see a way around it except to hope its a rare event. > > > It has the advantage that it requires code to explicitly document where > > the TLB must be flushed and the sites which expect sptes to be in sync > > with TLBs. > > I'm looking for an idea of how to make the flush in those cases not hold > mmu_lock. You can't easily for rmap_write_protect, must check that the state is unchanged (write protect operation is still relevant). Current patchset (with corrections to comments by Xiao) is good enough already IMO. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html