Re: [PATCH 3/5] KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 15, 2012 at 12:26:05PM +0300, Avi Kivity wrote:
> On 08/14/2012 01:04 AM, Marcelo Tosatti wrote:
> > On Mon, Aug 13, 2012 at 01:34:11PM -0300, Marcelo Tosatti wrote:
> >> On Sat, Aug 11, 2012 at 10:37:54AM +1000, Paul Mackerras wrote:
> >> > On Fri, Aug 10, 2012 at 03:35:53PM -0300, Marcelo Tosatti wrote:
> >> > > There's no plan. I just wanted to confirm this before converting 
> >> > > to per-memslot flush.
> >> > > 
> >> > > 1) __kvm_set_memory_region line 807:
> >> > > 
> >> > >                  *      - gfn_to_hva (kvm_read_guest, gfn_to_pfn)
> >> > >                  *      - kvm_is_visible_gfn (mmu_check_roots)
> >> > >                  */
> >> > >                 kvm_arch_flush_shadow(kvm);
> >> > >                 kfree(old_memslots);
> >> > >         }
> >> > > 
> >> > > This can be converted to per-memslot flush.
> >> > 
> >> > Yes, that would be good.
> >> > 
> >> > > 2) __kvm_set_memory_region line 846:
> >> > > 
> >> > >         /*
> >> > >          * If the new memory slot is created, we need to clear all
> >> > >          * mmio sptes.
> >> > >          */
> >> > >         if (npages && old.base_gfn != mem->guest_phys_addr >> PAGE_SHIFT)
> >> > >                 kvm_arch_flush_shadow(kvm);
> >> > > 
> >> > > This must flush all translations in the new and old GPA ranges:
> >> > > 	a) to remove any mmio sptes that existed in the new GPA range
> >> > > 	   (see ce88decffd17bf9f373cc233c for a description).
> >> > 
> >> > I assume that mmio sptes are ones that are created for accesses to
> >> > guest physical addresses where there is no memslot.  On Book3S HV we
> >> > trap those accesses (on POWER7 at least) but we don't create any
> >> > hardware PTEs for them.  So I don't have anything to do here.
> >> > 
> >> > > 	b) to remove any translations in the old range (this is
> >> > >            necessary because change of GPA base is allowed).
> >> > 
> >> > In fact I need to remove any translations in the old range *before*
> >> > the new memslot gets committed, whereas this happens after that.
> >> > This is why I was doing the flush in kvm_arch_prepare_memory_region.
> >> 
> >> I see. The problem with that is, given that the fault path (reader 
> >> of memslots) is protected only by SRCU, the following window is open:
> >> 
> >> A) kvm_arch_prepare_memory_region
> >> B) rcu_assign_pointer(kvm->memslots, slots)
> >> C) kvm_arch_commit_memory_region
> >> 
> >> The window between A and B where a guest pagefault can instantiate a new
> >> translation using the old memslot information (while you already removed
> >> any translations in the old range).
> >> 
> >> 
> >> Avi, Gleb, Alex, do you know why it is necessary to support change of
> >> GPA base again?
> >> 
> >> Without taking into consideration backwards compatibility, userspace 
> >> can first delete the slot and later create a new one.
> > 
> > Current userspace does not, and can't see why older userspace would. If
> > we break it and someone complains, it can be fixed. 
> > 
> > So, please:
> > 
> > 1) Disallow change of base GPA (can do that in the "Check for overlaps" loop
> > in __kvm_set_memory_region), returning EINVAL.
> > 2) Introduce kvm_arch_invalidate_memslot (replacing first
> > kvm_arch_flush_shadow).
> > 3) Introduce kvm_arch_postcommit_memslot (replacing the 
> > second kvm_arch_flush_shadow).
> 
> 
> Is there a reason why the kernel can't do the same?  First remove the
> old slot, then add the new one?
> 
> We break atomicity, but it's less likely that guests rely on it than an
> old qemu not relying on gpa-changing memory update.

The guest should not expect memory accesses to an address
to behave sanely while changing a BAR anyway.

Yes, compatibility for change of GPA base can be done in the
kernel. I can look into it next week if nobody has done so at
that point.

Thanks Avi.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux