On 08/19/2014 12:31 PM, David Matlack wrote: > On Mon, Aug 18, 2014 at 8:50 PM, Xiao Guangrong > <xiaoguangrong@xxxxxxxxxxxxxxxxxx> wrote: >> On 08/19/2014 05:15 AM, David Matlack wrote: >>> On Mon, Aug 18, 2014 at 12:56 PM, Xiao Guangrong >>> <xiaoguangrong.eric@xxxxxxxxx> wrote: >>>> @@ -287,9 +293,15 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn, >>>> >>>> static bool check_mmio_spte(struct kvm *kvm, u64 spte) >>>> { >>>> + struct kvm_memslots *slots = kvm_memslots(kvm); >>>> unsigned int kvm_gen, spte_gen; >>>> >>>> - kvm_gen = kvm_current_mmio_generation(kvm); >>>> + if (slots->updated) >>>> + return false; >>>> + >>>> + smp_rmb(); >>>> + >>>> + kvm_gen = __kvm_current_mmio_generation(slots); >>>> spte_gen = get_mmio_spte_generation(spte); >>>> >>> >>> What does this fix? Case 2 can still happen. (Case 2 is unavoidable unless we >>> block during memslot updates, which I don't think we should :). >> >> This exactly fixes case 2, slots->updated just acts as the "low bit" >> but avoid generation number wrap-around and trick handling of the number. >> More details please see below. >> >>> >>>> trace_check_mmio_spte(spte, kvm_gen, spte_gen); >>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >>>> index 4b6c01b..1d4e78f 100644 >>>> --- a/virt/kvm/kvm_main.c >>>> +++ b/virt/kvm/kvm_main.c >>>> @@ -96,7 +96,7 @@ static void hardware_disable_all(void); >>>> >>>> static void kvm_io_bus_destroy(struct kvm_io_bus *bus); >>>> static void update_memslots(struct kvm_memslots *slots, >>>> - struct kvm_memory_slot *new, u64 last_generation); >>>> + struct kvm_memory_slot *new); >>>> >>>> static void kvm_release_pfn_dirty(pfn_t pfn); >>>> static void mark_page_dirty_in_slot(struct kvm *kvm, >>>> @@ -685,8 +685,7 @@ static void sort_memslots(struct kvm_memslots *slots) >>>> } >>>> >>>> static void update_memslots(struct kvm_memslots *slots, >>>> - struct kvm_memory_slot *new, >>>> - u64 last_generation) >>>> + struct kvm_memory_slot *new) >>>> { >>>> if (new) { >>>> int id = new->id; >>>> @@ -697,8 +696,6 @@ static void update_memslots(struct kvm_memslots *slots, >>>> if (new->npages != npages) >>>> sort_memslots(slots); >>>> } >>>> - >>>> - slots->generation = last_generation + 1; >>>> } >>>> >>>> static int check_memory_region_flags(struct kvm_userspace_memory_region *mem) >>>> @@ -720,10 +717,17 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm, >>>> { >>>> struct kvm_memslots *old_memslots = kvm->memslots; >>>> >>>> - update_memslots(slots, new, kvm->memslots->generation); >>>> + /* ensure generation number is always increased. */ >>>> + slots->updated = true; >>>> + slots->generation = old_memslots->generation; >>>> + update_memslots(slots, new); >>>> rcu_assign_pointer(kvm->memslots, slots); >>>> synchronize_srcu_expedited(&kvm->srcu); >>>> >>>> + slots->generation++; >>>> + smp_wmb(); >>>> + slots->updated = false; >>>> + >>>> kvm_arch_memslots_updated(kvm); >>>> >>>> return old_memslots; >>>> >>> >>> This is effectively the same as the first approach. >>> >>> I just realized how simple Paolo's idea is. I think it can be a one line >>> patch (without comments): >>> >>> [...] >>> update_memslots(slots, new, kvm->memslots->generation); >>> rcu_assign_pointer(kvm->memslots, slots); >>> synchronize_srcu_expedited(&kvm->srcu); >>> + slots->generation++; >>> >>> kvm_arch_memslots_updated(kvm); >>> [...] >> >> Really? Unfortunately no. :) >> >> See this scenario: >> >> CPU 0 CPU 1 >> ioctl registering a new memslot which >> contains GPA: >> page-fault handler: >> see it'a mmio access on GPA; >> >> assign the new memslots with generation number increased >> cache the generation-number into spte; >> fix the access and comeback to guest; >> SRCU-sync >> page-fault again and check the spte is a valid mmio-spte(*) >> generation-number++; >> return to userspace; >> do mmio-emulation and inject mmio-exit; >> >> !!! userspace receives a unexpected mmio-exit, that is case 2 i exactly >> said in the last mail. >> >> >> Note in the step *, my approach detects the invalid generation-number which >> will invalidate the mmio spte properly . > > Sorry I didn't explain myself very well: Since we can get a single wrong > mmio exit no matter what, it has to be handled in userspace. So my point > was, it doesn't really help to fix that one very specific way that it can > happen, because it can just happen in other ways. (E.g. update memslots > occurs after is_noslot_pfn() and before mmio exit). > > But it looks like you basically said the same thing earlier, so I think > we're on the same page. > Yes, that is what i try to explain in previous mails. :( > The single line patch I suggested was only intended to fix the "forever > incorrectly exit mmio". My patch also fixes this case and that does not doubly increase the number. I think this is the better one. -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html