Re: [RFC] KVM: x86: conditionally acquire/release slots_lock on entry/exit

"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> · Sun, 13 Sep 2009 09:26:52 -0700

On Sun, Sep 13, 2009 at 06:42:49PM +0300, Avi Kivity wrote:
> On 09/11/2009 01:30 AM, Marcelo Tosatti wrote:
>>
>>> We don't need to stop vcpus, just kick them out of guest mode to let
>>> them notice the new data.  SRCU does that well.
>>>      
>> Two problems:
>>
>> 1. The removal of memslots/aliases and zapping of mmu (to remove any
>> shadow pages with stale sp->gfn) must be atomic from the POV of the
>> pagefault path. Otherwise something like this can happen:
>>
>> fault path					set_memory_region
>>    
>
> srcu_read_lock()
>
>> walk_addr fetches and validates
>> table_gfns
>> 						delete memslot
>> 						synchronize_srcu
>>
>> fetch creates shadow
>>    
>
> srcu_read_unlock()
>
>> page with nonexistant sp->gfn
>>    
>
> I think synchronize_srcu() will be deferred until the fault path is 
> complete (and srcu_read_unlock() runs).  Copying someone who knows for 
> sure.

Yes, synchronize_srcu() will block until srcu_read_unlock() in this
scenario, assuming that the same srcu_struct is used by both.

>> OR
>>
>> mmu_alloc_roots path				set_memory_region
>>    
>
> srcu_read_lock()
>
>> 						
>> 						delete memslot
>> root_gfn = vcpu->arch.cr3<<  PAGE_SHIFT
>> mmu_check_root(root_gfn)			synchronize_rcu
>> kvm_mmu_get_page()
>>    
>
> srcu_read_unlock()
>
>> 						kvm_mmu_zap_all
>>    
>
> Ditto, srcu_read_lock() protects us.

Yep!

>> Accesses between kvm_mmu_get_page and kvm_mmu_zap_all window can see
>> shadow pages with stale gfn.
>>
>> But, if you still think its worthwhile to use RCU, at least handling
>> gfn_to_memslot / unalias_gfn errors _and_ adding mmu_notifier_retry
>> invalidation to set_memory_region path will be necessary (so that
>> gfn_to_pfn validation, in the fault path, is discarded in case
>> of memslot/alias update).
>
> It really is worthwhile to reuse complex infrastructure instead of writing 
> new infrastructure.

Marcelo, in your first example, is your concern that the fault path
needs to detect the memslot deletion?  Or that the use of sp->gfn "leaks"
out of the SRCU read-side critical section?

							Thanx, Paul

>> 2. Another complication is that memslot creation and kvm_iommu_map_pages
>> are not atomic.
>>
>> create memslot
>> synchronize_srcu
>> 				<----- vcpu grabs gfn reference without
>> 				       iommu mapping.
>> kvm_iommu_map_pages
>>
>> Which can be solved by changing kvm_iommu_map_pages (and new gfn_to_pfn
>> helper) to use base_gfn,npages,hva information from somewhere else other
>> than visible kvm->memslots (so that when the slot becomes visible its
>> already iommu mapped).
>
> Yes.  It can accept a memslots structure instead of deriving it from 
> kvm->memslots.  Then we do a rcu_assign_pointer() to switch the tables.
>
>> So it appears to me using RCU introduces more complications / subtle
>> details than mutual exclusion here. The new request bit which the
>> original patch introduces is limited to enabling/disabling conditional
>> acquision of slots_lock (calling it a "new locking protocol" is unfair)
>> to improve write acquision latency.
>>    
>
> It's true that it is not a new locking protocol.  But I feel it is 
> worthwhile to try to use rcu for this; at least it will make it easier for 
> newcomers (provided they understand rcu).
>
>
> -- 
> error compiling committee.c: too many arguments to function
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html