On Wed, 8 Jun 2016 11:18:42 +0800
Dong Jia <bjsdjshi@xxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 7 Jun 2016 19:39:21 -0600
> Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
>
> > On Wed, 8 Jun 2016 01:18:42 +0000
> > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >
> > > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > > Sent: Wednesday, June 08, 2016 6:42 AM
> > > >
> > > > On Tue, 7 Jun 2016 03:03:32 +0000
> > > > "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > > >
> > > > > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > > > > Sent: Tuesday, June 07, 2016 3:31 AM
> > > > > >
> > > > > > On Mon, 6 Jun 2016 10:44:25 -0700
> > > > > > Neo Jia <cjia@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > > On Mon, Jun 06, 2016 at 04:29:11PM +0800, Dong Jia wrote:
> > > > > > > > On Sun, 5 Jun 2016 23:27:42 -0700
> > > > > > > > Neo Jia <cjia@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > 2. VFIO_DEVICE_CCW_CMD_REQUEST
> > > > > > > > This intends to handle an intercepted channel I/O instruction. It
> > > > > > > > basically need to do the following thing:
> > > > > > >
> > > > > > > May I ask how and when QEMU knows that he needs to issue such VFIO ioctl at
> > > > > > > first place?
> > > > > >
> > > > > > Yep, this is my question as well. It sounds a bit like there's an
> > > > > > emulated device in QEMU that's trying to tell the mediated device when
> > > > > > to start an operation when we probably should be passing through
> > > > > > whatever i/o operations indicate that status directly to the mediated
> > > > > > device. Thanks,
> > > > > >
> > > > > > Alex
> > > > >
> > > > > Below is copied from Dong's earlier post which said clear that
> > > > > a guest cmd submission will trigger the whole flow:
> > > > >
> > > > > ----
> > > > > Explanation:
> > > > > Q1-Q4: Qemu side process.
> > > > > K1-K6: Kernel side process.
> > > > >
> > > > > Q1. Intercept a ssch instruction.
> > > > > Q2. Translate the guest ccw program to a user space ccw program
> > > > >     (u_ccwchain).
> > > > > Q3. Call VFIO_DEVICE_CCW_CMD_REQUEST (u_ccwchain, orb, irb).
> > > > > K1. Copy from u_ccwchain to kernel (k_ccwchain).
> > > > > K2. Translate the user space ccw program to a kernel space ccw
> > > > >     program, which becomes runnable for a real device.
> > > > > K3. With the necessary information contained in the orb passed in
> > > > >     by Qemu, issue the k_ccwchain to the device, and wait event q
> > > > >     for the I/O result.
> > > > > K4. Interrupt handler gets the I/O result, and wakes up the wait q.
> > > > > K5. CMD_REQUEST ioctl gets the I/O result, and uses the result to
> > > > >     update the user space irb.
> > > > > K6. Copy irb and scsw back to user space.
> > > > > Q4. Update the irb for the guest.
> > > > > ----
> > > >
> > > > Right, but this was the pre-mediated device approach, now we no longer
> > > > need step Q2 so we really only need Q1 and therefore Q3 to exist in
> > > > QEMU if those are operations that are not visible to the mediated
> > > > device; which they very well might be, since it's described as an
> > > > instruction rather than an i/o operation. It's not terrible if that's
> > > > the case, vfio-pci has its own ioctl for doing a hot reset.
> Dear Alex, Kevin and Neo,
>
> 'ssch' is a privileged I/O instruction, which should be finally issued
> to the dedicated subchannel of the physical device.
>
> BTW, I did remove step Q2 with all of the user-space translation code,
> according to your comments in another thread.
>
> > > > >
> > > > >
> > > > > My understanding is that such thing belongs to how device is mediated
> > > > > (so device driver specific), instead of something to be abstracted in
> > > > > VFIO which manages resource but doesn't care how resource is used.
> > > > >
> > > > > Actually we have same requirement in vGPU case, that a guest driver
> > > > > needs submit GPU commands through some MMIO register. vGPU device
> > > > > model will intercept the submission request (in its own way), do its
> > > > > necessary scan/audit to ensure correctness/security, and then submit
> > > > > to physical GPU through vendor specific interface.
> > > > >
> > > > > No difference with channel I/O here.
> > > >
> > > > Well, if the GPU command is submitted through an MMIO register, is that
> > > > MMIO register part of the mediated device? If so, could the mediated
> > > > device recognize the command and do the scan/audit itself? QEMU must
> > > > not be the point at which mediation occurs for security purposes, QEMU
> > > > is userspace and userspace is not to be trusted. I'm still open to
> > > > ioctls where it makes sense, as above, we have PCI specific ioctls and
> > > > already, but we need to evaluate each one, why it needs to exist, and
> > > > whether we can skip it if the mediated device can trigger the action on
> > > > its own. After all, that's why we're using the vfio api, so we can
> > > > re-use much of the existing infrastructure, especially for a vGPU that
> > > > exposes itself as a PCI device. Thanks,
> > > >
> > >
> > > My point is that a guest submission on vGPU is just a normal trapped
> > > register write, which is forwarded from Qemu to VFIO through pwrite
> > > interface and then hit mediated vGPU device. The mediated device
> > > will recognize this register write as a submission request and then do
> > > necessary scan (looks we are saying same thing) and then submit to
> > > physical device driver. If loading ccw cmds on channel i/o are also
> > > through some I/O registers, it can be implemented same way w/o
> > > introducing new ioctl.
> We are different here. The target of an I/O instruction is the
> subchannel. CCW devices don't have these kind of registers. The mediated
> ccw device can not recognize such an submission by its own capbilities.
>
> A CCW device does not have such registers in both the physical and the
> mediated devices to sense or recognize the submission request. It's the
> CPU that recognizes the submission by intercepting the guest ssch
> instruction.
>
> CPU can not tell if it is issued from a passed thru device driver or a
> virtio device driver from the guest. So it has to exit to QEMU, and let
> QEMU take over.
>
> Once QEMU identifies the target subchannel is serving a passed thru
> device, it uses the ioctl to pass the instruction parameters into the
> kernel all the way along the mediated driver to the physical driver to
> the subchannel to perform the I/O operation.
>
> > > The r/w handler of mediated device can figure
> > > out whether it's a ccw submission or not. But my understanding might
> > > be wrong here.
> We don't have registers to sense an instruction or operation.

Ok, so it seems we need to create some sort of interface to initiate
the ccw program, but I suppose I'm not yet convinced that it needs a
new ioctl.  For instance if you only need to "kick" the device to tell
it when to begin translation and execution, we could create a virtual
interrupt into the mediated device with an irqfd.  QEMU writes to the
irqfd (eventfd), the mediated driver receives this kick and begins
processing.  Another virtual interrupt out to the user might indicate
completion.

On the other hand if the ioctl was intended to write the ccw program
itself to the device, we have vfio device regions that can do this.
Simply define within the vfio-ccw API that one of the regions is a
virtual program buffer and define the API between the mediated driver
and user the sequence of writes that load the program state, initiate
the program, and return the result.

The vfio API already has a very extensible mechanism for communicating
with a device through regions and interrupts, not all of which
necessarily need to match physical attributes of the device.
ioctls can be added, but let's exhaust the mechanisms we already have
through the vfio API first.  Thanks,

Alex
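
To make the "kick" idea above concrete, here is a rough userspace-only sketch in C. It assumes two plain eventfds, one for the kick into the device and one for the completion back to the user. The function mdev_handle_kick() is a hypothetical stand-in for the mediated driver, and all helper names are made up for illustration. No vfio device or vfio ioctl is involved, and nothing here is an existing vfio-ccw interface.

```c
/* Sketch only: two eventfds stand in for the "kick" and "completion"
 * virtual interrupts.  All function names are hypothetical. */
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Signal an eventfd once by adding 1 to its counter. */
static int signal_event(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Read and reset the counter.  Returns the number of pending signals,
 * or -1 on error.  Blocks while the counter is zero. */
static int64_t consume_event(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}

/* Hypothetical mediated-driver side: take the kick, do the work,
 * then raise the completion event. */
static int mdev_handle_kick(int kick_fd, int done_fd)
{
    if (consume_event(kick_fd) < 1)
        return -1;
    /* ... translation and execution of the ccw program would go here ... */
    return signal_event(done_fd);
}

/* User side (QEMU in the discussion): kick, then wait for completion.
 * Returns 0 if the whole round trip worked. */
int kick_roundtrip_demo(void)
{
    int kick_fd = eventfd(0, 0);
    int done_fd = eventfd(0, 0);
    int rc = -1;

    if (kick_fd >= 0 && done_fd >= 0 &&
        signal_event(kick_fd) == 0 &&               /* user: kick       */
        mdev_handle_kick(kick_fd, done_fd) == 0 &&  /* driver: process  */
        consume_event(done_fd) == 1)                /* user: completion */
        rc = 0;

    if (kick_fd >= 0)
        close(kick_fd);
    if (done_fd >= 0)
        close(done_fd);
    return rc;
}
```

In a real setup the driver side would live in the kernel and the eventfds would presumably be wired up through the existing vfio interrupt setup (VFIO_DEVICE_SET_IRQS with eventfd data). Which interrupt indexes a vfio-ccw device would expose for this is not settled anywhere in this thread.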