On Sun, Sep 13, 2009 at 06:42:49PM +0300, Avi Kivity wrote: > On 09/11/2009 01:30 AM, Marcelo Tosatti wrote: >> >>> We don't need to stop vcpus, just kick them out of guest mode to let >>> them notice the new data. SRCU does that well. >>> >> Two problems: >> >> 1. The removal of memslots/aliases and zapping of mmu (to remove any >> shadow pages with stale sp->gfn) must be atomic from the POV of the >> pagefault path. Otherwise something like this can happen: >> >> fault path set_memory_region >> > > srcu_read_lock() > >> walk_addr fetches and validates >> table_gfns >> delete memslot >> synchronize_srcu >> >> fetch creates shadow >> > > srcu_read_unlock() > >> page with nonexistant sp->gfn >> > > I think synchronize_srcu() will be deferred until the fault path is > complete (and srcu_read_unlock() runs). Copying someone who knows for > sure. Yes, synchronize_srcu() will block until srcu_read_unlock() in this scenario, assuming that the same srcu_struct is used by both. >> OR >> >> mmu_alloc_roots path set_memory_region >> > > srcu_read_lock() > >> >> delete memslot >> root_gfn = vcpu->arch.cr3<< PAGE_SHIFT >> mmu_check_root(root_gfn) synchronize_rcu >> kvm_mmu_get_page() >> > > srcu_read_unlock() > >> kvm_mmu_zap_all >> > > Ditto, srcu_read_lock() protects us. Yep! >> Accesses between kvm_mmu_get_page and kvm_mmu_zap_all window can see >> shadow pages with stale gfn. >> >> But, if you still think its worthwhile to use RCU, at least handling >> gfn_to_memslot / unalias_gfn errors _and_ adding mmu_notifier_retry >> invalidation to set_memory_region path will be necessary (so that >> gfn_to_pfn validation, in the fault path, is discarded in case >> of memslot/alias update). > > It really is worthwhile to reuse complex infrastructure instead of writing > new infrastructure. Marcelo, in your first example, is your concern that the fault path needs to detect the memslot deletion? Or that the use of sp->gfn "leaks" out of the SRCU read-side critical section? Thanx, Paul >> 2. Another complication is that memslot creation and kvm_iommu_map_pages >> are not atomic. >> >> create memslot >> synchronize_srcu >> <----- vcpu grabs gfn reference without >> iommu mapping. >> kvm_iommu_map_pages >> >> Which can be solved by changing kvm_iommu_map_pages (and new gfn_to_pfn >> helper) to use base_gfn,npages,hva information from somewhere else other >> than visible kvm->memslots (so that when the slot becomes visible its >> already iommu mapped). > > Yes. It can accept a memslots structure instead of deriving it from > kvm->memslots. Then we do a rcu_assign_pointer() to switch the tables. > >> So it appears to me using RCU introduces more complications / subtle >> details than mutual exclusion here. The new request bit which the >> original patch introduces is limited to enabling/disabling conditional >> acquision of slots_lock (calling it a "new locking protocol" is unfair) >> to improve write acquision latency. >> > > It's true that it is not a new locking protocol. But I feel it is > worthwhile to try to use rcu for this; at least it will make it easier for > newcomers (provided they understand rcu). > > > -- > error compiling committee.c: too many arguments to function > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html