On 08/29/2013 07:33 PM, Xiao Guangrong wrote:
> On 08/29/2013 05:31 PM, Gleb Natapov wrote:
>> On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
>>> After more thinking, I still think rcu_assign_pointer() is unneeded
>>> when an entry is removed. The remove-API does not care about the
>>> order between unlinking the entry and the changes to its fields.
>>> It is the caller's responsibility:
>>>
>>> - In the case of rcuhlist, the caller uses call_rcu()/
>>>   synchronize_rcu(), etc. to ensure that all lookups have exited,
>>>   so that later changes to the entry are invisible to the lookups.
>>>
>>> - In the case of rculist_nulls, it seems a refcounter is used to
>>>   guarantee the order (see the example in
>>>   Documentation/RCU/rculist_nulls.txt).
>>>
>>> - In our case, we allow a lookup to see the deleted desc even if
>>>   it is in the slab cache, or being initialized, or re-added.
>>>
>> BTW is it a good idea? We can access a deleted desc while it is
>> allocated and initialized to zero by kmem_cache_zalloc(). Are we
>> sure we cannot see a partially initialized desc->sptes[] entry?
>> On a related note, what about 32-bit systems? They do not have
>> atomic access to desc->sptes[].

Ah... wait. desc is an array of pointers:

struct pte_list_desc {
	u64 *sptes[PTE_LIST_EXT];
	struct pte_list_desc *more;
};

Assigning a pointer is always atomic, but we should carefully
initialize it, as you said. I will introduce a constructor for the
desc slab cache which initializes the struct like this:

	for (i = 0; i < PTE_LIST_EXT; i++)
		desc->sptes[i] = NULL;

Then it is okay.

> Good eyes. That is a bug.
>
> It seems we do not have a good way to fix this. How about disabling
> this optimization on 32-bit hosts? Small changes:
>
> static inline void kvm_mmu_rcu_free_page_begin(struct kvm *kvm)
> {
> +#ifdef CONFIG_X86_64
> 	rcu_read_lock();
>
> 	kvm->arch.rcu_free_shadow_page = true;
> 	/* Set the indicator before accessing shadow pages. */
> 	smp_mb();
> +#else
> +	spin_lock(&kvm->mmu_lock);
> +#endif
> }
>
> static inline void kvm_mmu_rcu_free_page_end(struct kvm *kvm)
> {
> +#ifdef CONFIG_X86_64
> 	/* Make sure accessing shadow pages has finished. */
> 	smp_mb();
> 	kvm->arch.rcu_free_shadow_page = false;
>
> 	rcu_read_unlock();
> +#else
> +	spin_unlock(&kvm->mmu_lock);
> +#endif
> }
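
For reference, here is a minimal sketch of the constructor idea above.
The cache name pte_list_desc_cache and the registration site
(kvm_mmu_module_init() in arch/x86/kvm/mmu.c) match the existing code,
but the ctor wiring itself is an illustration of the approach, not a
final patch:

static void pte_list_desc_ctor(void *p)
{
	struct pte_list_desc *desc = p;
	int i;

	/*
	 * A slab constructor runs once, when the backing slab page is
	 * populated, not on every kmem_cache_alloc(). A lookup racing
	 * with a free/re-alloc therefore never observes a memset() in
	 * progress: each slot is either NULL or a valid pointer, and
	 * pointer-sized stores are atomic.
	 */
	for (i = 0; i < PTE_LIST_EXT; i++)
		desc->sptes[i] = NULL;
	desc->more = NULL;
}

	/* in kvm_mmu_module_init(): */
	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
				sizeof(struct pte_list_desc),
				0, 0, pte_list_desc_ctor);

Note that with a constructor the free path must return the object to
its constructed state (i.e. NULL the slots again) before calling
kmem_cache_free(), since the ctor will not run again on re-allocation.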
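
And to make the reader side concrete, a rough sketch of how a lockless
walker would bracket its access with the two helpers above; the walker
itself is made up for illustration:

static void example_walk_desc(struct kvm *kvm, struct pte_list_desc *desc)
{
	int i;

	kvm_mmu_rcu_free_page_begin(kvm);

	/*
	 * Between begin and end the desc may be freed and re-used under
	 * us. Thanks to the constructor, each sptes[] slot is always
	 * either NULL or a valid pointer, never a torn value.
	 */
	for (i = 0; i < PTE_LIST_EXT; i++) {
		u64 *sptep = ACCESS_ONCE(desc->sptes[i]);

		if (sptep) {
			/* consume *sptep ... */
		}
	}

	kvm_mmu_rcu_free_page_end(kvm);
}

On 32-bit hosts the #else branches above simply take mmu_lock, so the
walk is fully serialized against the free path and the atomicity
question goes away.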