Re: [PATCH 3/5] KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/14/2012 01:04 AM, Marcelo Tosatti wrote:
> On Mon, Aug 13, 2012 at 01:34:11PM -0300, Marcelo Tosatti wrote:
>> On Sat, Aug 11, 2012 at 10:37:54AM +1000, Paul Mackerras wrote:
>> > On Fri, Aug 10, 2012 at 03:35:53PM -0300, Marcelo Tosatti wrote:
>> > > There's no plan. I just wanted to confirm this before converting 
>> > > to per-memslot flush.
>> > > 
>> > > 1) __kvm_set_memory_region line 807:
>> > > 
>> > >                  *      - gfn_to_hva (kvm_read_guest, gfn_to_pfn)
>> > >                  *      - kvm_is_visible_gfn (mmu_check_roots)
>> > >                  */
>> > >                 kvm_arch_flush_shadow(kvm);
>> > >                 kfree(old_memslots);
>> > >         }
>> > > 
>> > > This can be converted to per-memslot flush.
>> > 
>> > Yes, that would be good.
>> > 
>> > > 2) __kvm_set_memory_region line 846:
>> > > 
>> > >         /*
>> > >          * If the new memory slot is created, we need to clear all
>> > >          * mmio sptes.
>> > >          */
>> > >         if (npages && old.base_gfn != mem->guest_phys_addr >> PAGE_SHIFT)
>> > >                 kvm_arch_flush_shadow(kvm);
>> > > 
>> > > This must flush all translations in the new and old GPA ranges:
>> > > 	a) to remove any mmio sptes that existed in the new GPA range
>> > > 	   (see ce88decffd17bf9f373cc233c for a description).
>> > 
>> > I assume that mmio sptes are ones that are created for accesses to
>> > guest physical addresses where there is no memslot.  On Book3S HV we
>> > trap those accesses (on POWER7 at least) but we don't create any
>> > hardware PTEs for them.  So I don't have anything to do here.
>> > 
>> > > 	b) to remove any translations in the old range (this is
>> > >            necessary because change of GPA base is allowed).
>> > 
>> > In fact I need to remove any translations in the old range *before*
>> > the new memslot gets committed, whereas this happens after that.
>> > This is why I was doing the flush in kvm_arch_prepare_memory_region.
>> 
>> I see. The problem with that is, given that the fault path (reader 
>> of memslots) is protected only by SRCU, the following window is open:
>> 
>> A) kvm_arch_prepare_memory_region
>> B) rcu_assign_pointer(kvm->memslots, slots)
>> C) kvm_arch_commit_memory_region
>> 
>> The window between A and B where a guest pagefault can instantiate a new
>> translation using the old memslot information (while you already removed
>> any translations in the old range).
>> 
>> 
>> Avi, Gleb, Alex, do you know why it is necessary to support change of
>> GPA base again?
>> 
>> Without taking into consideration backwards compatibility, userspace 
>> can first delete the slot and later create a new one.
> 
> Current userspace does not, and can't see why older userspace would. If
> we break it and someone complains, it can be fixed. 
> 
> So, please:
> 
> 1) Disallow change of base GPA (can do that in the "Check for overlaps" loop
> in __kvm_set_memory_region), returning EINVAL.
> 2) Introduce kvm_arch_invalidate_memslot (replacing first
> kvm_arch_flush_shadow).
> 3) Introduce kvm_arch_postcommit_memslot (replacing the 
> second kvm_arch_flush_shadow).


Is there a reason why the kernel can't do the same?  First remove the
old slot, then add the new one?

We break atomicity, but it's less likely that guests rely on it than an
old qemu not relying on gpa-changing memory update.


-- 
error compiling committee.c: too many arguments to function
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux