On 08/19/2014 05:15 AM, David Matlack wrote: > On Mon, Aug 18, 2014 at 12:56 PM, Xiao Guangrong > <xiaoguangrong.eric@xxxxxxxxx> wrote: >> @@ -287,9 +293,15 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn, >> >> static bool check_mmio_spte(struct kvm *kvm, u64 spte) >> { >> + struct kvm_memslots *slots = kvm_memslots(kvm); >> unsigned int kvm_gen, spte_gen; >> >> - kvm_gen = kvm_current_mmio_generation(kvm); >> + if (slots->updated) >> + return false; >> + >> + smp_rmb(); >> + >> + kvm_gen = __kvm_current_mmio_generation(slots); >> spte_gen = get_mmio_spte_generation(spte); >> > > What does this fix? Case 2 can still happen. (Case 2 is unavoidable unless we > block during memslot updates, which I don't think we should :). This exactly fixes case 2, slots->updated just acts as the "low bit" but avoid generation number wrap-around and trick handling of the number. More details please see below. > >> trace_check_mmio_spte(spte, kvm_gen, spte_gen); >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >> index 4b6c01b..1d4e78f 100644 >> --- a/virt/kvm/kvm_main.c >> +++ b/virt/kvm/kvm_main.c >> @@ -96,7 +96,7 @@ static void hardware_disable_all(void); >> >> static void kvm_io_bus_destroy(struct kvm_io_bus *bus); >> static void update_memslots(struct kvm_memslots *slots, >> - struct kvm_memory_slot *new, u64 last_generation); >> + struct kvm_memory_slot *new); >> >> static void kvm_release_pfn_dirty(pfn_t pfn); >> static void mark_page_dirty_in_slot(struct kvm *kvm, >> @@ -685,8 +685,7 @@ static void sort_memslots(struct kvm_memslots *slots) >> } >> >> static void update_memslots(struct kvm_memslots *slots, >> - struct kvm_memory_slot *new, >> - u64 last_generation) >> + struct kvm_memory_slot *new) >> { >> if (new) { >> int id = new->id; >> @@ -697,8 +696,6 @@ static void update_memslots(struct kvm_memslots *slots, >> if (new->npages != npages) >> sort_memslots(slots); >> } >> - >> - slots->generation = last_generation + 1; >> } >> >> static int check_memory_region_flags(struct kvm_userspace_memory_region *mem) >> @@ -720,10 +717,17 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm, >> { >> struct kvm_memslots *old_memslots = kvm->memslots; >> >> - update_memslots(slots, new, kvm->memslots->generation); >> + /* ensure generation number is always increased. */ >> + slots->updated = true; >> + slots->generation = old_memslots->generation; >> + update_memslots(slots, new); >> rcu_assign_pointer(kvm->memslots, slots); >> synchronize_srcu_expedited(&kvm->srcu); >> >> + slots->generation++; >> + smp_wmb(); >> + slots->updated = false; >> + >> kvm_arch_memslots_updated(kvm); >> >> return old_memslots; >> > > This is effectively the same as the first approach. > > I just realized how simple Paolo's idea is. I think it can be a one line > patch (without comments): > > [...] > update_memslots(slots, new, kvm->memslots->generation); > rcu_assign_pointer(kvm->memslots, slots); > synchronize_srcu_expedited(&kvm->srcu); > + slots->generation++; > > kvm_arch_memslots_updated(kvm); > [...] Really? Unfortunately no. :) See this scenario: CPU 0 CPU 1 ioctl registering a new memslot which contains GPA: page-fault handler: see it'a mmio access on GPA; assign the new memslots with generation number increased cache the generation-number into spte; fix the access and comeback to guest; SRCU-sync page-fault again and check the spte is a valid mmio-spte(*) generation-number++; return to userspace; do mmio-emulation and inject mmio-exit; !!! userspace receives a unexpected mmio-exit, that is case 2 i exactly said in the last mail. Note in the step *, my approach detects the invalid generation-number which will invalidate the mmio spte properly . -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html