Re: [RFC PATCH v4 1/3] Mediated device Core driver

On Tue, 7 Jun 2016 20:48:42 -0700
Neo Jia <cjia@xxxxxxxxxx> wrote:

> On Wed, Jun 08, 2016 at 11:18:42AM +0800, Dong Jia wrote:
> > On Tue, 7 Jun 2016 19:39:21 -0600
> > Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> > 
> > > On Wed, 8 Jun 2016 01:18:42 +0000
> > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > > 
> > > > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > > > Sent: Wednesday, June 08, 2016 6:42 AM
> > > > > 
> > > > > On Tue, 7 Jun 2016 03:03:32 +0000
> > > > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > > > >   
> > > > > > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > > > > > Sent: Tuesday, June 07, 2016 3:31 AM
> > > > > > >
> > > > > > > On Mon, 6 Jun 2016 10:44:25 -0700
> > > > > > > Neo Jia <cjia@xxxxxxxxxx> wrote:
> > > > > > >  
> > > > > > > > On Mon, Jun 06, 2016 at 04:29:11PM +0800, Dong Jia wrote:  
> > > > > > > > > On Sun, 5 Jun 2016 23:27:42 -0700
> > > > > > > > > Neo Jia <cjia@xxxxxxxxxx> wrote:
> > > > > > > > >
> > > > > > > > > 2. VFIO_DEVICE_CCW_CMD_REQUEST
> > > > > > > > > > This intends to handle an intercepted channel I/O instruction. It
> > > > > > > > > > basically needs to do the following:
> > > > > > > >
> > > > > > > > > May I ask how and when QEMU knows that it needs to issue such a VFIO ioctl in
> > > > > > > > > the first place?
> > > > > > >
> > > > > > > Yep, this is my question as well.  It sounds a bit like there's an
> > > > > > > emulated device in QEMU that's trying to tell the mediated device when
> > > > > > > to start an operation when we probably should be passing through
> > > > > > > whatever i/o operations indicate that status directly to the mediated
> > > > > > > device. Thanks,
> > > > > > >
> > > > > > > Alex  
> > > > > >
> > > > > > > Below is copied from Dong's earlier post, which made clear that
> > > > > > > a guest cmd submission will trigger the whole flow:
> > > > > >
> > > > > > ----
> > > > > > Explanation:
> > > > > > Q1-Q4: Qemu side process.
> > > > > > K1-K6: Kernel side process.
> > > > > >
> > > > > > Q1. Intercept a ssch instruction.
> > > > > > Q2. Translate the guest ccw program to a user space ccw program
> > > > > >     (u_ccwchain).
> > > > > > Q3. Call VFIO_DEVICE_CCW_CMD_REQUEST (u_ccwchain, orb, irb).
> > > > > >     K1. Copy from u_ccwchain to kernel (k_ccwchain).
> > > > > >     K2. Translate the user space ccw program to a kernel space ccw
> > > > > >         program, which becomes runnable for a real device.
> > > > > >     K3. With the necessary information contained in the orb passed in
> > > > > >         by Qemu, issue the k_ccwchain to the device, and wait event q
> > > > > >         for the I/O result.
> > > > > >     K4. Interrupt handler gets the I/O result, and wakes up the wait q.
> > > > > >     K5. CMD_REQUEST ioctl gets the I/O result, and uses the result to
> > > > > >         update the user space irb.
> > > > > >     K6. Copy irb and scsw back to user space.
> > > > > > Q4. Update the irb for the guest.
> > > > > > ----  
> > > > > 
> > > > > Right, but this was the pre-mediated device approach, now we no longer
> > > > > need step Q2 so we really only need Q1 and therefore Q3 to exist in
> > > > > QEMU if those are operations that are not visible to the mediated
> > > > > device; which they very well might be, since it's described as an
> > > > > instruction rather than an i/o operation.  It's not terrible if that's
> > > > > the case, vfio-pci has its own ioctl for doing a hot reset.  
> > Dear Alex, Kevin and Neo,
> > 
> > 'ssch' is a privileged I/O instruction, which should be finally issued
> > to the dedicated subchannel of the physical device.
> > 
> > BTW, I did remove step Q2 with all of the user-space translation code,
> > according to your comments in another thread.
> > 
> > > > 
> > > > 
> > > > >   
> > > > > > > My understanding is that such a thing belongs to how the device is mediated
> > > > > > > (so it is device driver specific), instead of something to be abstracted in
> > > > > > > VFIO, which manages resources but doesn't care how they are used.
> > > > > > >
> > > > > > > Actually we have the same requirement in the vGPU case: a guest driver
> > > > > > > needs to submit GPU commands through some MMIO register. vGPU device
> > > > > > model will intercept the submission request (in its own way), do its
> > > > > > necessary scan/audit to ensure correctness/security, and then submit
> > > > > > to physical GPU through vendor specific interface.
> > > > > >
> > > > > > No difference with channel I/O here.  
> > > > > 
> > > > > Well, if the GPU command is submitted through an MMIO register, is that
> > > > > MMIO register part of the mediated device?  If so, could the mediated
> > > > > device recognize the command and do the scan/audit itself?  QEMU must
> > > > > not be the point at which mediation occurs for security purposes, QEMU
> > > > > is userspace and userspace is not to be trusted.  I'm still open to
> > > > > > ioctls where it makes sense, as above, we have PCI specific ioctls
> > > > > > already, but we need to evaluate each one, why it needs to exist, and
> > > > > whether we can skip it if the mediated device can trigger the action on
> > > > > its own.  After all, that's why we're using the vfio api, so we can
> > > > > re-use much of the existing infrastructure, especially for a vGPU that
> > > > > exposes itself as a PCI device.  Thanks,
> > > > >   
> > > > 
> > > > My point is that a guest submission on vGPU is just a normal trapped 
> > > > register write, which is forwarded from Qemu to VFIO through pwrite 
> > > > interface and then hit mediated vGPU device. The mediated device
> > > > will recognize this register write as a submission request and then do
> > > > necessary scan (it looks like we are saying the same thing) and then submit to
> > > > the physical device driver. If loading ccw cmds on channel i/o is also
> > > > done through some I/O registers, it can be implemented the same way w/o
> > > > introducing a new ioctl.
> > We are different here. The target of an I/O instruction is the
> > subchannel. CCW devices don't have this kind of register. The mediated
> > ccw device cannot recognize such a submission by its own capabilities.
> > 
> > Neither the physical nor the mediated CCW device has such registers
> > to sense or recognize the submission request. It's the CPU that
> > recognizes the submission by intercepting the guest ssch
> > instruction.
> > 
> > The CPU cannot tell whether the ssch was issued by a passed thru device
> > driver or a virtio device driver in the guest. So it has to exit to
> > QEMU, and let QEMU take over.
> 
> Hi Dong,
> 
> What actually has triggered the VM_EXIT to QEMU of that vCPU? Is it an MMIO
> access of the "virtual device" inside guest?
Dear Neo,

It's not an MMIO access, but an I/O instruction.

Our cpu has a mode (like vt-x in the x86 world? I guess...) to oversee
the execution of programs in a virtual machine environment. Once the cpu
enters this mode, it commences execution of the guest program. The mode
handles many aspects of a virtual machine by itself, but for instructions
it provides no handling for, the cpu exits from this mode. The I/O
instruction 'ssch' is one of the instructions that this cpu mode cannot
handle. So a ssch issued from the guest triggers an exit from this cpu
mode with an exit reason, and then the vcpu gets the reason and exits to
QEMU.
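
To make the QEMU side of this more concrete, here is a rough sketch in C
of the decision QEMU has to make once it gets the intercepted ssch. It is
only a toy model: every name in it (subch_backend, ssch_path,
find_subchannel, route_ssch) is made up for illustration and is not a
QEMU, KVM or vfio identifier, and the real code would of course carry the
orb and the ccw program along.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: what backs a virtual subchannel inside QEMU. */
enum subch_backend { BACKEND_EMULATED, BACKEND_PASSTHROUGH };

/* Hypothetical: a minimal subchannel record, keyed by its id. */
struct subchannel {
    uint16_t schid;
    enum subch_backend backend;
};

/* Hypothetical: where an intercepted ssch ends up. */
enum ssch_path { PATH_NOT_FOUND, PATH_EMULATE, PATH_VFIO_IOCTL };

/* Find the subchannel the guest ssch was aimed at, or NULL. */
const struct subchannel *find_subchannel(const struct subchannel *tbl,
                                         size_t n, uint16_t schid)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].schid == schid)
            return &tbl[i];
    }
    return NULL;
}

/*
 * The cpu cannot tell which kind of device the ssch targets, so the
 * decision is made here, in user space, after the exit: only a passed
 * thru subchannel takes the ioctl path into the kernel.
 */
enum ssch_path route_ssch(const struct subchannel *tbl, size_t n,
                          uint16_t schid)
{
    const struct subchannel *sch = find_subchannel(tbl, n, schid);

    if (sch == NULL)
        return PATH_NOT_FOUND;
    if (sch->backend == BACKEND_PASSTHROUGH)
        return PATH_VFIO_IOCTL;
    return PATH_EMULATE;
}
```

With a table holding one emulated and one passed thru subchannel,
route_ssch() picks PATH_EMULATE for the first and PATH_VFIO_IOCTL for
the second. The point is that the routing keys on the subchannel, not on
any register access the mediated device could observe.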

> 
> Thanks,
> Neo
> 
> > 
> > Once QEMU identifies the target subchannel is serving a passed thru
> > device, it uses the ioctl to pass the instruction parameters into the
> > kernel all the way along the mediated driver to the physical driver to
> > the subchannel to perform the I/O operation.
> > 
> > > > The r/w handler of mediated device can figure
> > > > out whether it's a ccw submission or not. But my understanding might 
> > > > be wrong here.
> > We don't have registers to sense an instruction or operation.
> > 
> > > 
> > > I think we're in violent agreement ;)
> > > 
> > 
> > --------
> > Dong Jia
> > 
> 



--------
Dong Jia
