Re: [PATCH 2/3] KVM: Emulate MSI-X table in kernel

"Michael S. Tsirkin" <mst@xxxxxxxxxx> · Fri, 25 Feb 2011 10:46:19 +0200

On Fri, Feb 25, 2011 at 02:12:42PM +0800, Sheng Yang wrote:
> On Thursday 24 February 2011 18:17:34 Michael S. Tsirkin wrote:
> > On Thu, Feb 24, 2011 at 05:44:20PM +0800, Sheng Yang wrote:
> > > On Wednesday 23 February 2011 16:45:37 Michael S. Tsirkin wrote:
> > > > On Wed, Feb 23, 2011 at 02:59:04PM +0800, Sheng Yang wrote:
> > > > > On Wednesday 23 February 2011 08:19:21 Alex Williamson wrote:
> > > > > > On Sun, 2011-01-30 at 13:11 +0800, Sheng Yang wrote:
> > > > > > > Then we can support mask bit operation of assigned devices now.
> > > > > > 
> > > > > > Looks pretty good overall.  A few comments below.  It seems like we
> > > > > > should be able to hook this into vfio with a small stub in kvm.  We
> > > > > > just need to be able to communicate disabling and enabling of
> > > > > > individual msix vectors.  For brute force, we could do this with a
> > > > > > lot of eventfds, triggered by kvm and consumed by vfio, two per
> > > > > > MSI-X vector.  Not sure if there's something smaller that could do
> > > > > > it. Thanks,
> > > > > 
> > > > > Alex, thanks for your comments. See my comments below:
> > > > > > Alex
> > > > > > 
> > > > > > > Signed-off-by: Sheng Yang <sheng@xxxxxxxxxxxxxxx>
> > > > > > > ---
> > > > > > > 
> > > > > > >  arch/x86/kvm/Makefile    |    2 +-
> > > > > > >  arch/x86/kvm/x86.c       |    8 +-
> > > > > > >  include/linux/kvm.h      |   21 ++++
> > > > > > >  include/linux/kvm_host.h |   25 ++++
> > > > > > >  virt/kvm/assigned-dev.c  |   44 +++++++
> > > > > > >  virt/kvm/kvm_main.c      |   38 ++++++-
> > > > > > >  virt/kvm/msix_mmio.c     |  286
> > > > > > >  ++++++++++++++++++++++++++++++++++++++++++++++
> > > > > > >  virt/kvm/msix_mmio.h
> > > > > > >  
> > > > > > >  |   25 ++++
> > > > > > >  
> > > > > > >  8 files changed, 442 insertions(+), 7 deletions(-)
> > > > > > >  create mode 100644 virt/kvm/msix_mmio.c
> > > > > > >  create mode 100644 virt/kvm/msix_mmio.h
> > > > > > > 
> > > > > > > diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> > > > > > > index f15501f..3a0d851 100644
> > > > > > > --- a/arch/x86/kvm/Makefile
> > > > > > > +++ b/arch/x86/kvm/Makefile
> > > > > > > @@ -7,7 +7,7 @@ CFLAGS_vmx.o := -I.
> > > > > > > 
> > > > > > >  kvm-y			+= $(addprefix ../../../virt/kvm/, kvm_main.o 
> ioapic.o
> > > 
> > > \
> > > 
> > > > > > >  				coalesced_mmio.o irq_comm.o eventfd.o \
> > > > > > > 
> > > > > > > -				assigned-dev.o)
> > > > > > > +				assigned-dev.o msix_mmio.o)
> > > > > > > 
> > > > > > >  kvm-$(CONFIG_IOMMU_API)	+= $(addprefix ../../../virt/kvm/,
> > > > > > >  iommu.o) kvm-$(CONFIG_KVM_ASYNC_PF)	+= $(addprefix
> > > > > > >  ../../../virt/kvm/, async_pf.o)
> > > > > > > 
> > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > > index fa708c9..89bf12c 100644
> > > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > > @@ -1966,6 +1966,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > > > 
> > > > > > >  	case KVM_CAP_X86_ROBUST_SINGLESTEP:
> > > > > > >  	case KVM_CAP_XSAVE:
> > > > > > > 
> > > > > > >  	case KVM_CAP_ASYNC_PF:
> > > > > > > +	case KVM_CAP_MSIX_MMIO:
> > > > > > >  		r = 1;
> > > > > > >  		break;
> > > > > > >  	
> > > > > > >  	case KVM_CAP_COALESCED_MMIO:
> > > > > > > @@ -3807,6 +3808,7 @@ static int
> > > > > > > emulator_write_emulated_onepage(unsigned long addr,
> > > > > > > 
> > > > > > >  					   struct kvm_vcpu *vcpu)
> > > > > > >  
> > > > > > >  {
> > > > > > >  
> > > > > > >  	gpa_t                 gpa;
> > > > > > > 
> > > > > > > +	int r;
> > > > > > > 
> > > > > > >  	gpa = kvm_mmu_gva_to_gpa_write(vcpu, addr, exception);
> > > > > > > 
> > > > > > > @@ -3822,14 +3824,16 @@ static int
> > > > > > > emulator_write_emulated_onepage(unsigned long addr,
> > > > > > > 
> > > > > > >  mmio:
> > > > > > >  	trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, bytes, gpa, *(u64 *)val);
> > > > > > > 
> > > > > > > +	r = vcpu_mmio_write(vcpu, gpa, bytes, val);
> > > > > > > 
> > > > > > >  	/*
> > > > > > >  	
> > > > > > >  	 * Is this MMIO handled locally?
> > > > > > >  	 */
> > > > > > > 
> > > > > > > -	if (!vcpu_mmio_write(vcpu, gpa, bytes, val))
> > > > > > > +	if (!r)
> > > > > > > 
> > > > > > >  		return X86EMUL_CONTINUE;
> > > > > > >  	
> > > > > > >  	vcpu->mmio_needed = 1;
> > > > > > > 
> > > > > > > -	vcpu->run->exit_reason = KVM_EXIT_MMIO;
> > > > > > > +	vcpu->run->exit_reason = (r == -ENOTSYNC) ?
> > > > > > > +		KVM_EXIT_MSIX_ROUTING_UPDATE : KVM_EXIT_MMIO;
> > > > 
> > > > This use of -ENOTSYNC is IMO confusing.
> > > > How about we make vcpu_mmio_write return the positive exit reason?
> > > > Negative value will mean an error.
> > > 
> > > In fact currently nagative value means something more need to be done,
> > > the same as MMIO exit.
> > 
> > So it would be
> > 	if (!r)
> >   		return X86EMUL_CONTINUE;
> > 
> > 	vcpu->run->exit_reason = r;
> > 
> > > Now I think we can keep it, or update them all later.
> > 
> > The way to do this would be
> > 1. patch to return KVM_EXIT_MMIO on mmio
> > 2. your patch that returns KVM_EXIT_MSIX_ROUTING_UPDATE on top
> 
> It's not that straightforward. In most condition, the reason vcpu_mmio_write() < 0 
> because KVM itself unable to complete the request. That's quite straightforward. 
> But each handler in the chain can't decided it would be "KVM_EXIT_MMIO", they can 
> only know when all of them fail to handle the accessing.
> 

Well, this just violates the standard API for return value.

The standard semantics are:
- negative - error
- positive - not an error

So you can just return a positive KVM_EXIT_MSIX_ROUTING_UPDATE
instead of ENOTSYNC. This way you do not need to rework all code
and you don't spread MSIX all over kvm.

You just do:
	(r > 0) ? r : KVM_EXIT_MMIO

> I am not sure if we like every single handler said "I want KVM_EXIT_MMIO" instead 
> of a error return.
> We can discuss more on this, but since it's not API/ABI change, 
> I think we can get the patch in first.
> --
> regards
> Yang, Sheng
> 
> > 
> > > --
> > > regards
> > > Yang, Sheng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html